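I'm trying to build a function that left-joins two data.tables on an arbitrary number of columns. However, I run into a problem when passing the ellipsis into the
on = .()
condition.
Using the data: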
library(data.table)
library(lubridate)
df = data.table(a = c(1,1,2,2,3,3,4,4,5,5),
                b = c(4,5,6,7,8,9,10,11,12,13),
                c = c(1,1,1,2,3,3,3,4,4,4),
                mnd = as.Date(lubridate::ym(c(200701, 200702, 200704, 200705, 200701, 200702, 200701, 200703, 200704, 200705))))
dat = data.table(id = c(1,1,1,2,3,3,3,4,4,4),
                 value = c(rep(30, 2), rep(25, 5), rep(20, 3)),
                 date = as.Date(lubridate::ym(c(200701, 200702, 200704, 200705, 200701, 200702, 200701, 200703, 200704, 200705))))
If I do the left join directly:
df[dat, on = .(mnd = date, c = id), nomatch = NULL]
everything works. However, when I use the function
dtable_func = function(input_data, input_data_2, ...) {
  xx = input_data[input_data_2,
                  on = .(...), # how to join the two tables?
                  nomatch = NULL]
  return(xx)
}
I get the error
Error in colnamesInt(x, names(on), check_dups = FALSE) : argument specifying columns specify non existing column(s): cols[1]='...'
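which I can't make sense of. I also looked at this, but it didn't help.
So I'd like to know: how do I pass the ellipsis through a function into data.table?
Using NSE (non-standard evaluation):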
dtable_func <- function(input_data, input_data_2, ...) {
  by_vals <- as.call(c(quote(.), substitute(...())))
  input_data[input_data_2, on = eval(by_vals), nomatch = NULL]
}
dtable_func(df, dat, mnd = date, c = id)
    a  b c        mnd value
 1: 1  4 1 2007-01-01    30
 2: 1  5 1 2007-02-01    30
 3: 2  6 1 2007-04-01    25
 4: 2  7 2 2007-05-01    25
 5: 3  8 3 2007-01-01    25
 6: 4 10 3 2007-01-01    25
 7: 3  9 3 2007-02-01    25
 8: 3  8 3 2007-01-01    25
 9: 4 10 3 2007-01-01    25
10: 4 11 4 2007-03-01    20
11: 5 12 4 2007-04-01    20
12: 5 13 4 2007-05-01    20
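To see what the NSE step is doing, here is a minimal base-R sketch of the capture on its own. The helper name capture_on is hypothetical. It uses the as.list(substitute(list(...))) idiom instead of substitute(...()), on the assumption that both yield the same unevaluated arguments for this purpose. The point is that the arguments given through ... are never evaluated. They are spliced into a call to . so that the result looks like the expression you would have typed by hand.

```r
# Hypothetical helper (not part of data.table): capture the arguments
# passed via `...` without evaluating them and wrap them in a call to `.`
capture_on <- function(...) {
  # substitute(list(...)) returns the unevaluated call list(<args>);
  # dropping its first element leaves only the captured arguments
  dots <- as.list(substitute(list(...)))[-1L]
  # Prepend the symbol `.` and turn the list back into a call
  as.call(c(quote(.), dots))
}

on_call <- capture_on(mnd = date, c = id)
print(on_call)
# .(mnd = date, c = id)
```

Since on_call is just a language object, neither date nor id needs to exist as a variable where capture_on is called. The names are only resolved against the two tables once the call is handed to on.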