Specifying data.table columns with strings

Question

Consider the following minimal working example:

library(magrittr) # for the %>% pipe 
library(data.table) 

# test data.table contains common_column and two others
test_dt <- data.table(test_column_one = c(1, 2, 3), test_column_two = c("x","y","z"), common_column = c("ID1", "ID2", "ID3") ) 

# some other data.table that contains common_column
other_dt <- data.table( additional_info = c("US", "US", "GB"), common_column = c("ID1", "ID2", "ID3")) 

example_function <- function(dt_column){
  # does some things on the data tables based on the column parameter passed
  merged_dt <- merge(other_dt, test_dt[,.(common_column, dt_column)], by = "common_column") %>%
    .[order(dt_column),] # order by the dt_column
  return(merged_dt)
} 

# calling the example function
example_function(test_dt$test_column_one)

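How can I modify the code so that I can:

specify the column by passing its name as a string
pass a vector or list of strings containing column names

I would like to avoid for loops and use optimized data.table syntax wherever possible.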

I tried using unlist() and the data.table-specific .. syntax, but I keep getting strange error messages and am not sure how to proceed.

Answer 1

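Inside the function, you can build a character vector of the columns to merge and use it with the .. syntax. Also, since you want to take advantage of data.table's efficiency, use the set* functions (in this case setorderv()) to modify the result in place rather than creating a copy through the pipe.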

example_function <- function(dt_column, dt1 = other_dt, dt2 = test_dt) {
    cols_to_merge <- c("common_column", dt_column)
    merged_dt <- merge(
        dt1,
        dt2[, ..cols_to_merge],
        by = "common_column"
    )
    # order by dt_column in place, without copying
    setorderv(merged_dt, dt_column)
    return(merged_dt)
}

example_function("test_column_one")
#    common_column additional_info test_column_one
#           <char>          <char>           <num>
# 1:           ID1              US               1
# 2:           ID2              US               2
# 3:           ID3              GB               3
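
The second part of the question (passing a vector or list of column names) is not shown above, but the same approach appears to carry over, since both c() and setorderv() accept several names. The following is a minimal sketch under that assumption, not part of the original answer. The name example_function_multi and the input check are hypothetical additions for illustration. The unlist() call is there only to flatten a list of strings into a character vector.

```r
library(data.table)

# Same test data as in the question
test_dt <- data.table(
    test_column_one = c(1, 2, 3),
    test_column_two = c("x", "y", "z"),
    common_column = c("ID1", "ID2", "ID3")
)
other_dt <- data.table(
    additional_info = c("US", "US", "GB"),
    common_column = c("ID1", "ID2", "ID3")
)

# Hypothetical variant: dt_columns may be a character vector or a list of strings
example_function_multi <- function(dt_columns, dt1 = other_dt, dt2 = test_dt) {
    # flatten list("a", "b") to c("a", "b"); a character vector passes through unchanged
    dt_columns <- unlist(dt_columns, use.names = FALSE)
    # illustrative check: fail early if a requested column does not exist
    stopifnot(is.character(dt_columns), all(dt_columns %in% names(dt2)))

    cols_to_merge <- c("common_column", dt_columns)
    merged_dt <- merge(
        dt1,
        dt2[, ..cols_to_merge],
        by = "common_column"
    )
    # sort in place by the first column, breaking ties with the following ones
    setorderv(merged_dt, dt_columns)
    return(merged_dt)
}

res_vec <- example_function_multi(c("test_column_one", "test_column_two"))
res_list <- example_function_multi(list("test_column_one", "test_column_two"))
```

Both calls should return the merged table with common_column, additional_info, and the two requested columns, sorted by test_column_one and then test_column_two. If the .. prefix feels obscure, dt2[, cols_to_merge, with = FALSE] is an equivalent way to select columns by a character vector.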
