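I have a large R data frame in which the second column contains repeated chemical names, each with different "Result" and "Use" values. I want to merge these rows so that each chemical occupies a single row, with all of its "Result" and "Use" values comma-separated in one cell of the respective column.
Example df: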
df1 <- data.frame(Name = c("reservatol", "reservatol", "reservatol", "DPG"),
                  Result = c("naturally occurring", "antagonist", "synthetic", "rubber"),
                  Use = c("Pharma", "Pharma", "Drugs and Medication", "Tires"))
        Name              Result                  Use
1 reservatol naturally occurring               Pharma
2 reservatol          antagonist               Pharma
3 reservatol           synthetic Drugs and Medication
4        DPG              rubber                Tires
Is there a way to make it look like this?
        Name                                     Result                          Use
1 reservatol naturally occurring, antagonist, synthetic Pharma, Drugs and Medication
2        DPG                                     rubber                        Tires
I tried group_by("Name") %>% mutate(Use = paste0(Result, collapse=",")) hoping to concatenate the Use values, but it didn't seem to do anything.
Using aggregate from base R:
> aggregate(.~Name, FUN = function(x) paste0(x, collapse = ","), data = df1)
        Name                                   Result                                Use
1        DPG                                   rubber                              Tires
2 reservatol naturally occurring,antagonist,synthetic Pharma,Pharma,Drugs and Medication
Or shorter:
> aggregate(.~Name, FUN = \(x) c(x), data = df1)
        Name                                     Result                                  Use
1        DPG                                     rubber                                Tires
2 reservatol naturally occurring, antagonist, synthetic Pharma, Pharma, Drugs and Medication
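One caveat about the shorter form, shown as a rough sketch: the \(x) lambda syntax needs R 4.1 or later, and because c(x) returns vectors of different lengths per group, aggregate cannot simplify them and should store list columns that merely print as comma-separated text. If actual strings are needed (for example before writing to CSV), toString is one way to get them. The names res and res_str below are only for illustration.

```r
df1 <- data.frame(Name = c("reservatol", "reservatol", "reservatol", "DPG"),
                  Result = c("naturally occurring", "antagonist", "synthetic", "rubber"),
                  Use = c("Pharma", "Pharma", "Drugs and Medication", "Tires"))

# Shorter form: each cell holds a vector, so Result and Use are list columns
res <- aggregate(. ~ Name, data = df1, FUN = function(x) c(x))
print(sapply(res, class))

# toString(x) is paste(x, collapse = ", "), so each cell is one string
res_str <- aggregate(. ~ Name, data = df1, FUN = toString)
print(sapply(res_str, class))
```

Both results print almost identically, so checking the column classes is the easiest way to tell them apart.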
You were almost there. Use summarise instead of mutate:
library(dplyr)
df1 %>%
group_by(Name) %>%
summarise(Result = paste(Result, collapse = ", "), Use = paste(Use, collapse = ", "))
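Note that the desired output in the question lists "Pharma" only once, whereas the code above repeats it. A possible variation, assuming dplyr 1.0.0 or later for across(), wraps each column in unique() before pasting. The name out is only for illustration.

```r
library(dplyr)

df1 <- data.frame(Name = c("reservatol", "reservatol", "reservatol", "DPG"),
                  Result = c("naturally occurring", "antagonist", "synthetic", "rubber"),
                  Use = c("Pharma", "Pharma", "Drugs and Medication", "Tires"))

out <- df1 %>%
  group_by(Name) %>%
  # Drop repeated values within each group, then collapse to one string
  summarise(across(c(Result, Use), ~ paste(unique(.x), collapse = ", ")))

print(out)
```

With across() the same function is applied to every listed column, so adding further columns to collapse only means extending the c(Result, Use) selection.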