R: match columns and create a new column with the corresponding values from another column [duplicate]


I have two data frames.

df1<- data.frame(place=c("KARACA ADANA","ASIL BOLU","GAZIANTEP","YUKARI/MERSIN"))
df2<- data.frame(city=c("ADANA","BOLU","ANTEP","MERSIN"), neighbor=c("KARACA","ASIL","GAZI","YUKARI"))

I need to match the two columns df1$place and df2$neighbor. If df1$place contains a word from df2$neighbor, a new column df1$newcol should be created by copying the corresponding value of df2$city.

The desired result is:

df1 <- data.frame(place=c("KARACA ADANA","ASIL BOLU","GAZIANTEP","YUKARI/MERSIN"), newcol=c("ADANA","BOLU","ANTEP","MERSIN"))
r match string-matching
1 Answer

1 vote

Here is an approach using sapply from base R.

If you only want to match whole words, you can use a regular expression: \\b matches a word boundary.

# For each neighbor, find the rows of df1$place that contain it as a whole word
matches <- lapply(df2$neighbor, function(x) grep(paste0("\\b", x, "\\b"), df1$place))
ind <- unlist(matches)
# Repeat each df2 row index once per matching row of df1
ind2 <- rep(seq_along(matches), times = lengths(matches))
df1$newcol <- NA
df1$newcol[ind] <- as.character(df2$city[ind2])
df1
#          place newcol
#1  KARACA ADANA  ADANA
#2     ASIL BOLU   BOLU
#3     GAZIANTEP   <NA>
#4 YUKARI/MERSIN MERSIN
#5 YUKARI/MERSIN MERSIN
#6     GAZIANTEP   <NA>
#7     ASIL BOLU   BOLU
#8  KARACA ADANA  ADANA

Sample data (the 8-row df1 that produced the output above)

df1<- data.frame(place=c(c("KARACA ADANA","ASIL BOLU","GAZIANTEP","YUKARI/MERSIN"),
                         rev(c("KARACA ADANA","ASIL BOLU","GAZIANTEP","YUKARI/MERSIN"))))
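
The <NA> for GAZIANTEP in the output above follows from the word-boundary requirement: "GAZI" is directly followed by more letters, so \\b does not match there. A small sketch of the difference, using a made-up vector `places` purely for illustration:

```r
# Illustrative data only; `places` is not part of the original question
places <- c("KARACA ADANA", "GAZIANTEP", "YUKARI/MERSIN")

# Whole-word match: "GAZI" must be delimited by word boundaries
whole_word <- grepl("\\bGAZI\\b", places)

# Plain substring match: "GAZI" may appear anywhere in the string
substring_match <- grepl("GAZI", places, fixed = TRUE)

# "/" is not a word character, so "YUKARI" in "YUKARI/MERSIN" counts as a whole word
slash_case <- grepl("\\bYUKARI\\b", places)
```

If GAZIANTEP should map to ANTEP, as in the question's desired result, dropping the \\b anchors (or matching with fixed = TRUE) is probably the simplest change, at the cost of also matching neighbor names that are embedded in longer words.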

0 votes

Try:

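library(tidyverse)
# For each place, find which df2$neighbor entries it contains (as the question asks)
# and take the matching city; this assumes exactly one neighbor matches per row
df1 %>% 
  rowwise() %>% 
  mutate(out = df2$city[str_which(place, df2$neighbor)]) %>% 
  ungroup()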
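
The row-wise lookup above relies on every place matching exactly one pattern. As far as I can tell, a row with no match produces a zero-length result, which mutate() rejects. Below is a minimal base R sketch that returns NA instead. The helper `first_match` and the extra "IZMIR" row are hypothetical and only there for illustration; matching is done as a plain substring against df2$neighbor.

```r
# Question data plus one made-up row ("IZMIR") that matches nothing
df1 <- data.frame(place = c("KARACA ADANA", "ASIL BOLU", "GAZIANTEP",
                            "YUKARI/MERSIN", "IZMIR"))
df2 <- data.frame(city = c("ADANA", "BOLU", "ANTEP", "MERSIN"),
                  neighbor = c("KARACA", "ASIL", "GAZI", "YUKARI"))

# Hypothetical helper: index of the first pattern found in `place`, or NA if none
first_match <- function(place, patterns) {
  hits <- which(vapply(patterns,
                       function(p) grepl(p, place, fixed = TRUE),
                       logical(1)))
  if (length(hits) == 0) NA_integer_ else hits[[1]]
}

# One index into df2 per row of df1 (NA where nothing matched)
idx <- vapply(as.character(df1$place), first_match, integer(1),
              patterns = as.character(df2$neighbor), USE.NAMES = FALSE)

# Indexing with NA yields NA, so unmatched rows stay missing
df1$newcol <- as.character(df2$city)[idx]
```

Taking only the first hit is a design choice of this sketch. If a place could contain several neighbor names, you would need to decide whether to keep the first, the longest, or all of them.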