I'm new to dplyr's join functions. I have two tables that I want to merge:
df1 <- data.frame(name = c('Fred', 'Fred', 'Fred', 'Sasha', 'Sasha', 'Sasha'),
year = c('2018-19', '2019-20', '2020-21', '2018-19', '2019-20', '2020-21'),
outcome1 = 1:6)
df2 <- data.frame(name = c('Sasha', 'Sasha', 'Sasha', 'Rebecca', 'Rebecca', 'Rebecca'),
year = c('2019-20', '2020-21', '2021-22', '2019-20', '2020-21', '2021-22'),
outcome2 = 2:7)
so that the resulting table looks like this:
df3 <- data.frame(name = c('Fred', 'Fred', 'Fred', 'Sasha', 'Sasha', 'Sasha', 'Sasha'),
year = c('2018-19', '2019-20', '2020-21', '2018-19', '2019-20', '2020-21', '2021-22'),
outcome1 = c(1:6, NA),
outcome2 = c(NA, NA, NA, NA, 2, 3, 4))
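One solution I found is
full_join(df1, df2, by = c('name', 'year')) %>% filter(name %in% df1$name)
which does a full join and then filters Rebecca out of the result. But is there a single function that I can throw df1 and df2 into to get df3?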
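You can use subset on the result of a full join performed with merge. By default merge joins on the columns the two data frames share (here name and year), and all = TRUE makes it a full join: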
> subset(merge(df1, df2, all = TRUE), name %in% df1$name)
name year outcome1 outcome2
1 Fred 2018-19 1 NA
2 Fred 2019-20 2 NA
3 Fred 2020-21 3 NA
7 Sasha 2018-19 4 NA
8 Sasha 2019-20 5 2
9 Sasha 2020-21 6 3
10 Sasha 2021-22 NA 4
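
If you would rather stay within dplyr, one possible sketch (not a single built-in function, and assuming dplyr is installed) is to drop the names missing from df1 before joining, so nothing has to be filtered out afterwards. Here semi_join keeps only the rows of df2 whose name also appears in df1, and full_join then combines them with df1. The variable names df2_known and result are only illustrative.

```r
library(dplyr)

df1 <- data.frame(name = c('Fred', 'Fred', 'Fred', 'Sasha', 'Sasha', 'Sasha'),
                  year = c('2018-19', '2019-20', '2020-21', '2018-19', '2019-20', '2020-21'),
                  outcome1 = 1:6)
df2 <- data.frame(name = c('Sasha', 'Sasha', 'Sasha', 'Rebecca', 'Rebecca', 'Rebecca'),
                  year = c('2019-20', '2020-21', '2021-22', '2019-20', '2020-21', '2021-22'),
                  outcome2 = 2:7)

# Keep only the rows of df2 whose name occurs in df1 (drops Rebecca).
df2_known <- semi_join(df2, df1, by = 'name')

# Full join on both keys: every df1 row is kept, plus the remaining
# df2 rows that have no matching name/year pair in df1.
result <- full_join(df1, df2_known, by = c('name', 'year'))
result
```

Filtering df2 first also means the join itself never has to carry the Rebecca rows, which may matter if df2 is much larger than df1.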