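I want to replicate in data.table the dplyr behaviour of expanding rows with group_by and reframe. I can do this when I hard-code the new column's name, but I don't know how to do it when the column name is dynamic (i.e. taken from an external vector). Example below: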
library(data.table)
library(dplyr)
iris <- as.data.table(iris)
# Dplyr
varname <- c("new_var")
var_levels <- c("level1", "level2")
iris %>%
group_by(pick(everything())) %>%
reframe({{varname}} := var_levels)
#> # A tibble: 298 × 6
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species new_var
#> <dbl> <dbl> <dbl> <dbl> <fct> <chr>
#> 1 4.3 3 1.1 0.1 setosa level1
#> 2 4.3 3 1.1 0.1 setosa level2
#> 3 4.4 2.9 1.4 0.2 setosa level1
#> 4 4.4 2.9 1.4 0.2 setosa level2
#> 5 4.4 3 1.3 0.2 setosa level1
#> 6 4.4 3 1.3 0.2 setosa level2
#> 7 4.4 3.2 1.3 0.2 setosa level1
#> 8 4.4 3.2 1.3 0.2 setosa level2
#> 9 4.5 2.3 1.3 0.3 setosa level1
#> 10 4.5 2.3 1.3 0.3 setosa level2
#> # ℹ 288 more rows
# Data.table:
iris[, .(new_var = var_levels) , keyby = names(iris)]
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species new_var
#> 1: 4.3 3.0 1.1 0.1 setosa level1
#> 2: 4.3 3.0 1.1 0.1 setosa level2
#> 3: 4.4 2.9 1.4 0.2 setosa level1
#> 4: 4.4 2.9 1.4 0.2 setosa level2
#> 5: 4.4 3.0 1.3 0.2 setosa level1
#> ---
#> 294: 7.7 3.0 6.1 2.3 virginica level2
#> 295: 7.7 3.8 6.7 2.2 virginica level1
#> 296: 7.7 3.8 6.7 2.2 virginica level2
#> 297: 7.9 3.8 6.4 2.0 virginica level1
#> 298: 7.9 3.8 6.4 2.0 virginica level2
# Dynamic data.table?
iris[, (varname) := var_levels, keyby= names(iris)]
#> Error in `[.data.table`(iris, , `:=`((varname), var_levels), keyby = names(iris)): Supplied 2 items to be assigned to group 1 of size 1 in column 'new_var'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.
Created on 2023-11-22 with reprex v2.0.2
The only solution I can think of is:
# Assign the result so the literal "varname" column exists for the rename below
iris <- iris[, .(varname = var_levels), keyby = names(iris)]
iris <- iris %>% rename({{varname}} := varname)
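In data.table terms, since you are grouping by all pre-existing columns, we don't need := assignment. := adds columns by reference and cannot change the number of rows, which is why it raised the error above. Instead, we want j to return a single column and let data.table keep the grouping columns. You were on the right track with the .(varname = var_levels) step; we just need a way to name the column dynamically. We can do that by calling setNames on list(var_levels).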
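# irisDT is assumed to be the data.table version of iris (it is not defined in the snippet as posted)
irisDT <- as.data.table(iris)
out <- irisDT[, setNames(list(var_levels), varname), keyby = names(irisDT)]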
irisDT %>%
group_by(pick(everything())) %>%
reframe({{varname}} := var_levels) %>%
all.equal(out, check.attributes = FALSE)
# [1] TRUE
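For a quicker sanity check, here is a minimal sketch of the same setNames idea on a small toy table. The table dt and its columns id and x are made up for illustration and are not part of the question; only varname and var_levels are taken from it.

```r
library(data.table)

# Toy table (hypothetical data, used only for illustration)
dt <- data.table(id = c("a", "b", "c"), x = c(1, 2, 3))

varname    <- "new_var"
var_levels <- c("level1", "level2")

# j returns a one-element list whose name comes from `varname`.
# Because that element has length 2, every group yields 2 rows, and
# data.table repeats the keyby columns alongside it.
expanded <- dt[, setNames(list(var_levels), varname), keyby = names(dt)]

print(expanded)
```

Each of the three input rows should come back twice, once per element of var_levels, with the new column named new_var. Note that grouping by every column folds exact duplicate rows into a single group, so duplicates are not expanded separately; this is consistent with the 298 rows (rather than 300) shown for iris above.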