How to replace a data frame with another data frame in R


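I want to replace the values in df1 with the values from df2, where df2 holds the clean versions of the names that appear in df1. Example: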

df1 <- data.frame(
  name = c(
    "A. MAHJUM-61365",
    "A. MAHJUM-61365. MAHJUM-61365",
    "A. RIZAL. AD-11002795",
    "A. RIZAL. AD-11002795. RIZAL. AD-11002795",
    "ABD. KADIR-60447",
    "ABD. KADIR-60447ABD. KADIR-60447",
    "ABD. KAHAR-62551",
    "ABD. RASYID DS-11002082",
    "ABDREAS APUNG @SANY",
    "ABDUL AZIS @HYUNDAY",
    "ABDUL AZIZ @HYUNDAI",
    "ABDUL AZIZ@HYUNDAI"
  )
)

df2 is:

df2 <- data.frame(
  name = c(
    "A. MAHJUM-61365",
    "A. RIZAL. AD-11002795",
    "ABD. KADIR-60447",
    "ABD. KAHAR-62551",
    "ABD. RASYID DS-11002082",
    "ABDREAS APUNG @SANY",
    "ABDUL AZIS @HYUNDAY"
  )
)

If a value in df1 looks like a value in df2, it should be replaced with the df2 value.

r dplyr rstudio stringr stringdist
1 answer

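Since this is a substring match, we can use fuzzyjoin. regex_left_join treats each name in df2 as a regular expression and matches it against the names in df1; rows without a match keep their original name: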

library(dplyr)
library(fuzzyjoin)
regex_left_join(df1, df2, by = 'name') %>% 
  transmute(name = coalesce(name.y, name.x))
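
For comparison, the same substring idea can be sketched in base R without fuzzyjoin. This is only a sketch under assumptions: the data below is a shortened, hypothetical subset of the question's df1 and df2, the helper name first_match is made up for illustration, and the match is literal (fixed = TRUE) rather than a regular expression, so characters such as "." are not treated as wildcards.

```r
# Shortened, hypothetical subset of the data from the question
df1 <- data.frame(name = c(
  "A. MAHJUM-61365",
  "A. MAHJUM-61365. MAHJUM-61365",
  "ABD. KADIR-60447ABD. KADIR-60447",
  "ABDUL AZIZ @HYUNDAI"
), stringsAsFactors = FALSE)
df2 <- data.frame(name = c(
  "A. MAHJUM-61365",
  "ABD. KADIR-60447"
), stringsAsFactors = FALSE)

# For each df1 name, take the first df2 name that occurs inside it literally
first_match <- vapply(df1$name, function(x) {
  is_hit <- vapply(df2$name, function(p) grepl(p, x, fixed = TRUE), logical(1))
  hits <- df2$name[is_hit]
  if (length(hits) > 0) hits[1] else NA_character_
}, character(1), USE.NAMES = FALSE)

# Keep the original name wherever no df2 name was found
result <- data.frame(
  name = ifelse(is.na(first_match), df1$name, first_match),
  stringsAsFactors = FALSE
)
print(result)
```

Names that only resemble a df2 entry without containing it (for example a different spelling) are left unchanged by this approach.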

Or use a distance-based approach:

stringdist_left_join(df1, df2, by = 'name') %>%
  transmute(name = coalesce(name.y, name.x))
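
To make the distance-based idea concrete, here is a rough base R sketch using utils::adist, which computes the generalized Levenshtein edit distance. It is not the fuzzyjoin implementation; the threshold max_dist = 3 and the shortened data are assumptions chosen for illustration, and a suitable threshold would need to be tuned for real data.

```r
# Shortened, hypothetical subset of the data from the question
df1 <- data.frame(name = c(
  "ABDUL AZIZ @HYUNDAI",
  "ABDUL AZIZ@HYUNDAI",
  "ABD. KAHAR-62551"
), stringsAsFactors = FALSE)
df2 <- data.frame(name = c(
  "ABDUL AZIS @HYUNDAY",
  "ABD. KAHAR-62551"
), stringsAsFactors = FALSE)

max_dist <- 3  # assumed threshold, not taken from the original answer

# Edit-distance matrix: one row per df1 name, one column per df2 name
d <- utils::adist(df1$name, df2$name)

# Index and distance of the closest df2 name for every df1 name
nearest <- apply(d, 1, which.min)
nearest_dist <- d[cbind(seq_len(nrow(d)), nearest)]

# Replace only when the closest candidate is within the threshold
result <- data.frame(
  name = ifelse(nearest_dist <= max_dist, df2$name[nearest], df1$name),
  stringsAsFactors = FALSE
)
print(result)
```

A distance threshold handles misspellings well, but long duplicated strings such as "A. MAHJUM-61365. MAHJUM-61365" are far from their clean form in edit distance, so the substring approach is likely a better fit for those rows.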