Fill NA values in one data.table with observations from a second data.table

Question (2 votes)

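I can't believe I'm having such a hard time finding a solution to this: I have two data.tables with the same rows and columns, which look like this: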

library(data.table)

Country <- c("FRA", "FRA", "DEU", "DEU", "CHE", "CHE")
Year <- c(2010, 2020, 2010, 2020, 2010, 2020)
acctm <- c(20, 30, 10, NA, 20, NA)
acctf <- c(20, NA, 15, NA, 40, NA)

dt1 <- data.table(Country, Year, acctm, acctf)

   Country Year acctm acctf
1      FRA 2010    20    20
2      FRA 2020    30    NA
3      DEU 2010    10    15
4      DEU 2020    NA    NA
5      CHE 2010    20    40
6      CHE 2020    NA    NA

Country <- c("FRA", "FRA", "DEU", "DEU", "CHE", "CHE")
Year <- c(2010, 2020, 2010, 2020, 2010, 2020)
acctm <- c(1, 1, 1, 60, 1, 70)
acctf <- c(1, 60, 1, 80, 1, 100)

dt2 <- data.table(Country, Year, acctm, acctf)

   Country Year acctm acctf
1      FRA 2010     1     1
2      FRA 2020     1    60
3      DEU 2010     1     1
4      DEU 2020    60    80
5      CHE 2010     1     1
6      CHE 2020    70   100

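I need to create a new data.table in which each NA in dt1 is replaced by the value from dt2 for the matching country, year, and variable, producing a table like this: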

   Country Year acctm acctf
1      FRA 2010    20    20
2      FRA 2020    30    60
3      DEU 2010    10    15
4      DEU 2020    60    80
5      CHE 2010    20    40
6      CHE 2020    70   100

Tags: r, merge, datatable, na

Answer 1 (3 votes)

We can do this with a join on the Country and Year columns:

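library(data.table)
nm1 <- names(dt1)[3:4]      # value columns to fill: acctm, acctf
nm2 <- paste0("i.", nm1)    # the same columns as they come from dt2 in the join
dt3 <- copy(dt1)            # work on a copy so := does not modify dt1
# update join: keep the dt1 value where present, otherwise take the dt2 value
dt3[dt2, (nm1) := Map(function(x, y) 
   fifelse(is.na(x), y, x), mget(nm1), mget(nm2)), on = .(Country, Year)]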
dt3
#   Country Year acctm acctf
#1:     FRA 2010    20    20
#2:     FRA 2020    30    60
#3:     DEU 2010    10    15
#4:     DEU 2020    60    80
#5:     CHE 2010    20    40
#6:     CHE 2020    70   100

Or, to make it more compact, use fcoalesce from data.table (from @IceCreamToucan's comment):

dt3[dt2,  (nm1) := Map(fcoalesce, mget(nm1), mget(nm2)), on = .(Country, Year)]
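
For reference, fcoalesce works element by element: at each position it returns the first non-NA value among its arguments. Here is a minimal sketch on plain vectors, using the acctf values from the first three rows of dt1 and dt2. The names from_dt1, from_dt2 and filled are only for illustration and are not part of the answer above.

```r
library(data.table)

from_dt1 <- c(20, NA, 15)   # acctf in dt1, rows 1-3
from_dt2 <- c(1, 60, 1)     # acctf in dt2, rows 1-3

# keep the dt1 value where it exists, otherwise fall back to the dt2 value
filled <- fcoalesce(from_dt1, from_dt2)
filled
# [1] 20 60 15
```

As far as I know, fcoalesce expects its arguments to have the same type (both are double here), so an integer column may need converting before it is combined with a double one.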

If the datasets have the same dimensions, and Country and Year hold the same values in the same row order, another option is

library(purrr)
library(dplyr)
list(dt1[, .(acctm, acctf)], dt2[, .(acctm, acctf)]) %>% 
      reduce(coalesce) %>%
      bind_cols(dt1[, .(Country, Year)], .)

Answer 2 (1 vote)

If both tables are ordered in exactly the same way, you can do this:

as.data.table(Map(function(x, y) ifelse(is.na(x), y, x), dt1, dt2))

#    Country Year acctm acctf
# 1:     FRA 2010    20    20
# 2:     FRA 2020    30    60
# 3:     DEU 2010    10    15
# 4:     DEU 2020    60    80
# 5:     CHE 2010    20    40
# 6:     CHE 2020    70   100
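
The positional approach depends on both tables having the same row order. A minimal sketch of what happens otherwise, assuming a reordered copy of dt2 (dt2_shuffled, by_key and by_position are hypothetical names, not from the question): the join on Country and Year still fills the NAs correctly, while the positional version picks values from the wrong rows.

```r
library(data.table)

dt1 <- data.table(
  Country = c("FRA", "FRA", "DEU", "DEU", "CHE", "CHE"),
  Year    = c(2010, 2020, 2010, 2020, 2010, 2020),
  acctm   = c(20, 30, 10, NA, 20, NA),
  acctf   = c(20, NA, 15, NA, 40, NA)
)
dt2 <- data.table(
  Country = c("FRA", "FRA", "DEU", "DEU", "CHE", "CHE"),
  Year    = c(2010, 2020, 2010, 2020, 2010, 2020),
  acctm   = c(1, 1, 1, 60, 1, 70),
  acctf   = c(1, 60, 1, 80, 1, 100)
)

# same data as dt2, but in a different row order (assumption for this sketch)
dt2_shuffled <- dt2[order(-Year, Country)]

value_cols <- c("acctm", "acctf")
i_cols     <- paste0("i.", value_cols)

# keyed: rows are matched on Country and Year, so row order does not matter
by_key <- copy(dt1)
by_key[dt2_shuffled,
       (value_cols) := Map(fcoalesce, mget(value_cols), mget(i_cols)),
       on = .(Country, Year)]

# positional: row k of dt1 is filled from row k of dt2_shuffled
by_position <- as.data.table(
  Map(function(x, y) ifelse(is.na(x), y, x), dt1, dt2_shuffled)
)

# TRUE if the two approaches agree on acctm; they do not for this reordering
isTRUE(all.equal(by_key$acctm, by_position$acctm))
```

If the row order cannot be guaranteed, the join-based version is the safer choice.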