data.table: creating a dynamic column name with vector recycling


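I want to replicate in data.table the dplyr behaviour of expanding rows with group_by followed by reframe. I can do this when I set the column name myself, but I don't know how to do it with a dynamic column name (i.e. one taken from an external vector). Example below: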

library(data.table)
library(dplyr)

iris <- as.data.table(iris)

# Dplyr
varname <- c("new_var")
var_levels <- c("level1", "level2")
iris %>% 
  group_by(pick(everything())) %>% 
  reframe({{varname}} := var_levels) 
#> # A tibble: 298 × 6
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species new_var
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>   <chr>  
#>  1          4.3         3            1.1         0.1 setosa  level1 
#>  2          4.3         3            1.1         0.1 setosa  level2 
#>  3          4.4         2.9          1.4         0.2 setosa  level1 
#>  4          4.4         2.9          1.4         0.2 setosa  level2 
#>  5          4.4         3            1.3         0.2 setosa  level1 
#>  6          4.4         3            1.3         0.2 setosa  level2 
#>  7          4.4         3.2          1.3         0.2 setosa  level1 
#>  8          4.4         3.2          1.3         0.2 setosa  level2 
#>  9          4.5         2.3          1.3         0.3 setosa  level1 
#> 10          4.5         2.3          1.3         0.3 setosa  level2 
#> # ℹ 288 more rows


# Data.table:
iris[, .(new_var = var_levels) , keyby = names(iris)]
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species new_var
#>   1:          4.3         3.0          1.1         0.1    setosa  level1
#>   2:          4.3         3.0          1.1         0.1    setosa  level2
#>   3:          4.4         2.9          1.4         0.2    setosa  level1
#>   4:          4.4         2.9          1.4         0.2    setosa  level2
#>   5:          4.4         3.0          1.3         0.2    setosa  level1
#>  ---                                                                    
#> 294:          7.7         3.0          6.1         2.3 virginica  level2
#> 295:          7.7         3.8          6.7         2.2 virginica  level1
#> 296:          7.7         3.8          6.7         2.2 virginica  level2
#> 297:          7.9         3.8          6.4         2.0 virginica  level1
#> 298:          7.9         3.8          6.4         2.0 virginica  level2


# Dynamic data.table?
iris[, (varname) := var_levels, keyby= names(iris)] 
#> Error in `[.data.table`(iris, , `:=`((varname), var_levels), keyby = names(iris)): Supplied 2 items to be assigned to group 1 of size 1 in column 'new_var'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.

Created on 2023-11-22 with reprex v2.0.2

The only workaround I can think of is:

# Keep the expanded result, then rename the placeholder column "varname"
iris <- iris[, .(varname = var_levels), keyby = names(iris)]
iris <- iris %>% rename({{varname}} := varname)
r dplyr data.table
1 answer

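In data.table parlance, since you are grouping by all pre-existing columns, we don't need := assignment. := updates the existing rows in place and cannot add rows, which is why the attempt above fails with a length error. Instead, we want j to return a single column and let data.table keep the grouping columns. You were on the right track with the .(varname = var_levels) step; the only problem is that .() is an alias for list(), so the column is literally named "varname". We just need a way to name the column dynamically, and we can do that by calling setNames on list(var_levels).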

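# Assumption: irisDT is the data.table version of the built-in iris data
irisDT <- as.data.table(datasets::iris)
# Name the one-element list on the fly; keyby keeps the grouping columns
out <- irisDT[, setNames(list(var_levels), varname), keyby = names(irisDT)]

# Compare with the dplyr result
irisDT %>% 
  group_by(pick(everything())) %>% 
  reframe({{varname}} := var_levels) %>%
  all.equal(out, check.attributes = FALSE)
# [1] TRUE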
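
To make the mechanics easier to see, here is a minimal sketch on a tiny table. The table dt, its columns id and x, and the placeholder name tmp are made up for illustration and are not from the question. It shows the setNames approach next to a possible alternative: use a fixed name in j and rename afterwards with data.table's setnames, which works by reference and avoids the detour through dplyr::rename.

```r
library(data.table)

# Hypothetical toy data: two distinct rows, so every group has size 1
dt <- data.table(id = c("a", "b"), x = c(1, 2))
varname    <- "new_var"
var_levels <- c("level1", "level2")

# 1) Name the list in j dynamically: each group returns
#    length(var_levels) rows, so 2 groups give 4 rows
out1 <- dt[, setNames(list(var_levels), varname), keyby = names(dt)]

# 2) Use a fixed placeholder name in j, then rename by reference
out2 <- dt[, .(tmp = var_levels), keyby = names(dt)]
setnames(out2, "tmp", varname)

names(out1)  # "id" "x" "new_var"
nrow(out1)   # 4
```

Both variants should give the same table. The first keeps everything in a single expression; the second may read more clearly when j is long, at the cost of a temporary name that must not clash with an existing column.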