Merge by columns with different values for those columns

Question (votes: 1, answers: 1)

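I have two data.tables. I want to merge the rows of the second data.table, dfB, with the first, dfA, wherever the year in dfB is one year before the year in dfA.

As an example, the first row of dfB would be merged with the first row of dfA, because the year in dfB is 2009, which is one year before the year in dfA, 2010.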

  library(data.table)
  dfA <- fread("
  A   B   C   D   E   F   G   Z   iso   year   matchcode
  1   0   1   1   1   0   1   0   NLD   2010   NLD2010
  2   1   0   0   0   1   0   1   NLD   2014   NLD2014
  3   0   0   0   1   1   0   0   AUS   2010   AUS2010
  4   1   0   1   0   0   1   0   AUS   2006   AUS2006
  5   0   1   0   1   0   1   1   USA   2008   USA2008
  6   0   0   1   0   0   0   1   USA   2010   USA2010
  7   0   1   0   1   0   0   0   USA   2012   USA2012
  8   1   0   1   0   0   1   0   BLG   2008   BLG2008
  9   0   1   0   1   1   0   1   BEL   2008   BEL2008
  10  1   0   1   0   0   1   0   BEL   2010   BEL2010
  11  0   1   1   1   0   1   0   NLD   2010   NLD2010
  12  1   0   0   0   1   0   1   NLD   2014   NLD2014
  13  0   0   0   1   1   0   0   AUS   2010   AUS2010
  14  1   0   1   0   0   1   0   AUS   2006   AUS2006
  15  0   1   0   1   0   1   1   USA   2008   USA2008
  16  0   0   1   0   0   0   1   USA   2010   USA2010
  17  0   1   0   1   0   0   0   USA   2012   USA2012
  18  1   0   1   0   0   1   0   BLG   2008   BLG2008
  19  0   1   0   1   1   0   1   BEL   2008   BEL2008
  20  1   0   1   0   0   1   0   BEL   2010   BEL2010",
  header = TRUE)

  dfB <- fread("
  A   B   C   D   H   I   J   K   iso   year   matchcode
  1   0   1   1   1   0   1   0   NLD   2009   NLD2009
  2   1   0   0   0   1   0   1   NLD   2014   NLD2014
  3   0   0   0   1   1   0   0   AUS   2011   AUS2011
  4   1   0   1   0   0   1   0   AUS   2007   AUS2007
  5   0   1   0   1   0   1   1   USA   2007   USA2007
  6   0   0   1   0   0   0   1   USA   2010   USA2010
  7   0   1   0   1   0   0   0   USA   2013   USA2013
  8   1   0   1   0   0   1   0   BLG   2007   BLG2007
  9   0   1   0   1   1   0   1   BEL   2009   BEL2009
  10  1   0   1   0   0   1   0   BEL   2012   BEL2012",
  header = TRUE)

I tried:

dfA <- merge(dfA , dfB, on =.(iso, year == year-1), all.x = TRUE, allow.cartesian=FALSE)

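But that produces a match on the same year, which is not what I want.

I believe roll would also just try to find the nearest match.

How should I write this merge?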
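To illustrate the concern about roll, here is a minimal sketch on toy data. The tables `x` and `i` and their values are made up for illustration and are not taken from dfA or dfB. It shows that a rolling join prefers an exact match on year and only falls back to an earlier row when there is none, so by itself it does not express "exactly one year earlier".

```r
library(data.table)

# Toy tables (hypothetical values, not taken from dfA/dfB)
x <- data.table(iso = "NLD", year = c(2009L, 2010L), H = c("h2009", "h2010"))
i <- data.table(iso = "NLD", year = c(2010L, 2011L))

# Rolling join on the last join column (year): an exact match wins,
# otherwise the most recent earlier row of x is carried forward
res <- x[i, on = .(iso, year), roll = TRUE]
print(res)
```

Here the 2010 row of `i` picks up the 2010 row of `x` rather than the 2009 one, which is the behaviour the question wants to avoid.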


r merge data.table
1 answer
0
votes

It's a bit messy, but try:

dfB[dfA[,c(.SD,.(year1=year-1))],
    on=.(A,B,C,D,iso,year == year1)]
     A B C D  H  I  J  K iso year matchcode E F G Z i.year i.matchcode
 1:  1 0 1 1  1  0  1  0 NLD 2009   NLD2009 1 0 1 0   2010     NLD2010
 2:  2 1 0 0 NA NA NA NA NLD 2013      <NA> 0 1 0 1   2014     NLD2014
 3:  3 0 0 0 NA NA NA NA AUS 2009      <NA> 1 1 0 0   2010     AUS2010
 4:  4 1 0 1 NA NA NA NA AUS 2005      <NA> 0 0 1 0   2006     AUS2006
 5:  5 0 1 0  1  0  1  1 USA 2007   USA2007 1 0 1 1   2008     USA2008
 6:  6 0 0 1 NA NA NA NA USA 2009      <NA> 0 0 0 1   2010     USA2010
 7:  7 0 1 0 NA NA NA NA USA 2011      <NA> 1 0 0 0   2012     USA2012
 8:  8 1 0 1  0  0  1  0 BLG 2007   BLG2007 0 0 1 0   2008     BLG2008
 9:  9 0 1 0 NA NA NA NA BEL 2007      <NA> 1 1 0 1   2008     BEL2008
10: 10 1 0 1 NA NA NA NA BEL 2009      <NA> 0 0 1 0   2010     BEL2010
11: 11 0 1 1 NA NA NA NA NLD 2009      <NA> 1 0 1 0   2010     NLD2010
12: 12 1 0 0 NA NA NA NA NLD 2013      <NA> 0 1 0 1   2014     NLD2014
13: 13 0 0 0 NA NA NA NA AUS 2009      <NA> 1 1 0 0   2010     AUS2010
14: 14 1 0 1 NA NA NA NA AUS 2005      <NA> 0 0 1 0   2006     AUS2006
15: 15 0 1 0 NA NA NA NA USA 2007      <NA> 1 0 1 1   2008     USA2008
16: 16 0 0 1 NA NA NA NA USA 2009      <NA> 0 0 0 1   2010     USA2010
17: 17 0 1 0 NA NA NA NA USA 2011      <NA> 1 0 0 0   2012     USA2012
18: 18 1 0 1 NA NA NA NA BLG 2007      <NA> 0 0 1 0   2008     BLG2008
19: 19 0 1 0 NA NA NA NA BEL 2007      <NA> 1 1 0 1   2008     BEL2008
20: 20 1 0 1 NA NA NA NA BEL 2009      <NA> 0 0 1 0   2010     BEL2010
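
As a possible alternative, the same idea can be sketched with a helper column and a plain `merge()`. This is only a sketch on toy data: the tables `a`, `b`, `b2` and the column `year_b` are hypothetical names introduced here, and, unlike the join above, it matches on iso and the shifted year only (not on A, B, C, D), as in the attempt in the question.

```r
library(data.table)

# Toy data (hypothetical, much smaller than dfA/dfB above)
a <- data.table(iso  = c("NLD", "NLD", "USA"),
                year = c(2010L, 2014L, 2008L),
                E    = c(1L, 0L, 1L))
b <- data.table(iso  = c("NLD", "NLD", "USA"),
                year = c(2009L, 2014L, 2007L),
                H    = c(1L, 0L, 1L))

# Work on a copy so `b` itself is left untouched
b2 <- copy(b)
setnames(b2, "year", "year_b")   # keep the original year of b
b2[, year := year_b + 1L]        # the year in `a` this row should attach to

# Left join: keep every row of `a`, attach rows of `b` dated one year earlier
res <- merge(a, b2, by = c("iso", "year"), all.x = TRUE)
print(res)
```

Rows of `a` without a `b` row from the previous year (NLD 2014 in this toy data) are kept with NA in `year_b` and `H`, because of `all.x = TRUE`.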