I have two data frames that I want to merge, but any duplicated values must be replaced with NA. For example:
df1 <- data.frame(
  x1 = c(1, 1, 1, 2, 2, 2, 2),
  x2 = c("a", "a", "b", "b", "c", "c", "c"),
  x3 = c("t", "u", "v", "w", "x", "y", "z"),
  x4 = c("apple", "apple", "mango", "mango", "mango", "mango", "mango")
)
df2 <- data.frame(
  x1 = c(1, 1, 1, 1, 2, 2, 2, 2),
  x2 = c("a", "a", "a", "b", "b", "b", "c", "c"),
  x3 = c("t", "u", "u", "v", "w", "x", "y", "z"),
  x5 = c("apple A", "apple A", "apple B", "mango A", "mango B", "mango A", "mango A", "mango A"),
  x6 = c(10, 10, 20, 10, 10, 30, 30, 30)
)
My expected data frame:
df3 <- data.frame(
  x1 = c(1, 1, 1, 1, 2, 2, 2, 2),
  x2 = c("a", "a", "a", "b", "b", "b", "c", "c"),
  x3 = c("t", "u", "u", "v", "w", "x", "y", "z"),
  x4 = c("apple", "apple", "apple", "mango", "mango", "mango", "mango", "mango"),
  x5 = c("apple A", "apple A", "apple B", "mango A", "mango B", "mango A", "mango A", "mango A"),
  x6 = c(10, NA, 20, 10, 10, 30, 30, NA)
)
That is, merge df1 and df2, but where df2 has duplicated values in x1, x2, and x5, x6 should be NA.
Consider:
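# Set x6 to NA on rows of df2 that repeat an earlier (x1, x2, x5, x6) combination
is.na(df2$x6) <- duplicated(df2[c('x1', 'x2', 'x5', 'x6')])
# Left join on the shared columns (x1, x2, x3), keeping every row of df1
merge(df1, df2, all.x = TRUE)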
  x1 x2 x3    x4      x5 x6
1  1  a  t apple apple A 10
2  1  a  u apple apple A NA
3  1  a  u apple apple B 20
4  1  b  v mango mango A 10
5  2  b  w mango mango B 10
6  2  c  x mango    <NA> NA
7  2  c  y mango mango A 30
8  2  c  z mango mango A NA
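
For reference, the same steps can be spelled out a bit more explicitly. This is only a sketch using the data from the question: the names `dup`, `keys`, and `res` are made up for illustration, and it flags duplicates on x1, x2, and x5 alone (as the question words the rule) rather than also including x6. For this particular data both choices mark the same rows.

```r
# Data from the question.
df1 <- data.frame(
  x1 = c(1, 1, 1, 2, 2, 2, 2),
  x2 = c("a", "a", "b", "b", "c", "c", "c"),
  x3 = c("t", "u", "v", "w", "x", "y", "z"),
  x4 = c("apple", "apple", "mango", "mango", "mango", "mango", "mango")
)
df2 <- data.frame(
  x1 = c(1, 1, 1, 1, 2, 2, 2, 2),
  x2 = c("a", "a", "a", "b", "b", "b", "c", "c"),
  x3 = c("t", "u", "u", "v", "w", "x", "y", "z"),
  x5 = c("apple A", "apple A", "apple B", "mango A", "mango B", "mango A", "mango A", "mango A"),
  x6 = c(10, 10, 20, 10, 10, 30, 30, 30)
)

# Rows of df2 that repeat an earlier (x1, x2, x5) combination.
dup <- duplicated(df2[c("x1", "x2", "x5")])
which(dup)  # 2 8

# Blank out x6 on those rows.
df2$x6[dup] <- NA

# merge() joins on the columns the two data frames share by default;
# naming them makes the join keys visible.
keys <- intersect(names(df1), names(df2))  # "x1" "x2" "x3"
res <- merge(df1, df2, by = keys, all.x = TRUE)
res
```

Note that with `all.x = TRUE` the row of df1 with x1 = 2, x2 = "c", x3 = "x" has no partner in df2 (where "x" is paired with x2 = "b"), so it comes back with NA in x5 and x6. That is why row 6 of the output differs from row 6 of the expected df3.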