I have a data frame named a, with a structure like this:
a <- data.frame(X1=c("A", "B", "C", "A", "C", "D"),
                X2=c("B", "C", "D", "A", "B", "A"),
                X3=c("C", "D", "A", "B", "A", "B")
)
And a second data frame, b:
b <- data.frame(Xn=c("A", "B", "C", "D"),
                Feature=c("some", "more", "what", "why"))
I want to add the Feature values from b to a, so that X1, X2 and X3 in a each get a corresponding feature column. In other words, the columns of a become:
colnames(a) <- c("X1", "X2", "X3", "Features1", "Features2", "Features3")
How can I use left_join in a for loop?
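In base R, we can unlist the data frame a and match the values against b$Xn to get the corresponding Feature values. We can then cbind the resulting data frame to the original data frame to get the final answer.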
temp <- a
temp[] <- b$Feature[match(unlist(temp), b$Xn)]
names(temp) <- paste0('Feature', seq_along(temp))
cbind(a, temp)
# X1 X2 X3 Feature1 Feature2 Feature3
#1 A B C some more what
#2 B C D more what why
#3 C D A what why some
#4 A A B some some more
#5 C B A what more some
#6 D A B why some more
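To see why the match step works, it can help to look at a single column. The following is a minimal sketch using the a and b from the question (stringsAsFactors = FALSE is added here as an assumption, so that Feature is a character vector on any R version); the name idx is only illustrative.

```r
a <- data.frame(X1 = c("A", "B", "C", "A", "C", "D"),
                X2 = c("B", "C", "D", "A", "B", "A"),
                X3 = c("C", "D", "A", "B", "A", "B"),
                stringsAsFactors = FALSE)
b <- data.frame(Xn = c("A", "B", "C", "D"),
                Feature = c("some", "more", "what", "why"),
                stringsAsFactors = FALSE)

# match() returns, for each value of a$X1, its position in b$Xn
idx <- match(a$X1, b$Xn)
idx
# 1 2 3 1 3 4

# Indexing b$Feature by those positions looks up the feature for each row
b$Feature[idx]
# "some" "more" "what" "some" "what" "why"
```

unlist(temp) simply applies the same lookup to all three columns at once, and temp[] <- ... writes the result back while keeping the data frame shape.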
With the tidyverse, we can reshape the data to long format, join it with b, and reshape it back to wide format.
library(dplyr)
library(tidyr)
a %>%
  mutate(row = row_number()) %>%
  pivot_longer(cols = -row) %>%
  left_join(b, by = c('value' = 'Xn')) %>%
  select(-value) %>%
  pivot_wider(names_from = name, values_from = Feature) %>%
  select(-row) %>%
  rename_all(~paste0('Feature', seq_along(.))) %>%
  bind_cols(a, .)
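We can use map to generate a list of data frames with appropriate column names for joining, and then use reduce to perform the joins one after another. For example: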
library(tidyverse)
c(list(a),
  map(1:3, ~b %>% set_names(paste0(c("X","Feature"), .x)))) %>%
  reduce(left_join)
  X1 X2 X3 Feature1 Feature2 Feature3
1  A  B  C     some     more     what
2  B  C  D     more     what      why
3  C  D  A     what      why     some
4  A  A  B     some     some     more
5  C  B  A     what     more     some
6  D  A  B      why     some     more
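To unpack this: the map statement uses b to create a list of three data frames, each renamed so that it can be joined to the original data frame a on one of its columns.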
map(1:3, ~b %>% set_names(paste0(c("X","Feature"), .x)))
[[1]]
  X1 Feature1
1  A     some
2  B     more
3  C     what
4  D      why

[[2]]
  X2 Feature2
1  A     some
2  B     more
3  C     what
4  D      why

[[3]]
  X3 Feature3
1  A     some
2  B     more
3  C     what
4  D      why
We then use the c function to put the original data frame a at the front of the list we just created, which gives us a list of four data frames.
c(list(a),
  map(1:3, ~b %>% set_names(paste0(c("X","Feature"), .x))))
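Now we want to join the three data frames created by map onto a, one after another. That is what reduce(left_join) does: it joins a with the first data frame, joins that result with the second, and so on.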
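Since the question asked specifically about a for loop, the same successive join can also be written out explicitly. This is only a sketch of one way to do it, assuming the a and b from the question; the names result and lookup are illustrative, and stringsAsFactors = FALSE is added so the code behaves the same on older R versions.

```r
library(dplyr)

a <- data.frame(X1 = c("A", "B", "C", "A", "C", "D"),
                X2 = c("B", "C", "D", "A", "B", "A"),
                X3 = c("C", "D", "A", "B", "A", "B"),
                stringsAsFactors = FALSE)
b <- data.frame(Xn = c("A", "B", "C", "D"),
                Feature = c("some", "more", "what", "why"),
                stringsAsFactors = FALSE)

result <- a
for (i in seq_along(a)) {
  key <- names(a)[i]
  # Rename b's columns to match the i-th key column, e.g. X1 / Feature1
  lookup <- setNames(b, c(key, paste0("Feature", i)))
  # Join on that key; each pass adds one FeatureN column to result
  result <- left_join(result, lookup, by = key)
}
result
```

Each pass through the loop does what one step of reduce(left_join) does, except that the join key is named explicitly instead of being inferred from the shared column name.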