Make R ignore the order in which values appear across columns (when building a combination by pasting multiple columns)

Question (votes: 1, answers: 1)

Given a variable x that can take the values A, B, C, D,

and three columns holding values of x:

df1 <- rbind(
  c("A", "B", "C"), c("A", "D", "C"), c("B", "A", "C"), c("A", "C", "B"),
  c("B", "C", "A"), c("D", "A", "B"), c("A", "B", "D"), c("A", "D", "C"),
  c("A", NA, NA), c("D", "A", NA), c("A", "D", NA)
)

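How can I create a column that represents the combination found in the first three columns, so that permutations such as (ABC, ACB, BAC) are treated as the same combination ABC, and (AD, DA) are treated as the same combination AD?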

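What I tried:

apply(df1, 1, function(x) paste(x[!is.na(x)], collapse = ", ")) -> df1$x4
df1 %>% group_by(x4) %>% summarize(c = n())

Pasting the three columns this way treats AD and DA as different combinations rather than the same one.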

Edit: updated the title.

The result I want is:

cbind(c("ABC", 4), c("ACD", 2), c("ABD", 2), c("A", 1), c("AD", 2))

Someone has already solved my problem. Thank you.

r dataframe data-cleaning
1 answer
2 votes

You can use apply to sort each row vector and then paste the sorted values together:

df1 <- 
  cbind(df1, apply(df1, 1, function(x) paste(sort(x), collapse = "")))

df1
#      [,1] [,2] [,3] [,4] 
# [1,] "A"  "B"  "C"  "ABC"
# [2,] "A"  "D"  "C"  "ACD"
# [3,] "B"  "A"  "C"  "ABC"
# [4,] "A"  "C"  "B"  "ABC"
# [5,] "B"  "C"  "A"  "ABC"
# [6,] "D"  "A"  "B"  "ABD"
# [7,] "A"  "B"  "D"  "ABD"
# [8,] "A"  "D"  "C"  "ACD"
# [9,] "A"  NA   NA   "A"  
#[10,] "D"  "A"  NA   "AD" 
#[11,] "A"  "D"  NA   "AD"
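
The rows containing NA come out clean because sort() drops missing values by default (its na.last argument defaults to NA), so no separate is.na() filter is needed. A minimal sketch of that behaviour on a single row; the names `one_row`, `key_sorted`, and `key_unsorted` are made up for illustration:

```r
# One row from the example data, with a missing third value
one_row <- c("D", "A", NA)

# sort() removes NA by default and orders the remaining values,
# so "D","A" and "A","D" both collapse to the same key
key_sorted <- paste(sort(one_row), collapse = "")

# Without sort(), the NA is converted to the string "NA" and order is kept
key_unsorted <- paste(one_row, collapse = "")

key_sorted    # "AD"
key_unsorted  # "DANA"
```

Note that sort() on character vectors follows the current locale's collation; for plain upper-case letters like these that should not matter.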

You can now simply call table on that column; there is no need to load external packages or build a more complex pipeline.

table(df1[, 4])
#A ABC ABD ACD  AD 
#1   4   2   2   2 
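
If a combination/count layout closer to the one requested in the question is preferred, the table can be converted to a data frame. This is only a sketch in base R: it rebuilds the data from the question so that it runs on its own, and the names `m`, `combo`, `counts`, `combination`, and `n` are chosen here for illustration.

```r
# Data from the question, as a character matrix
m <- rbind(
  c("A", "B", "C"), c("A", "D", "C"), c("B", "A", "C"), c("A", "C", "B"),
  c("B", "C", "A"), c("D", "A", "B"), c("A", "B", "D"), c("A", "D", "C"),
  c("A", NA, NA), c("D", "A", NA), c("A", "D", NA)
)

# Order-independent key per row: sort (which drops NA), then paste
combo <- apply(m, 1, function(x) paste(sort(x), collapse = ""))

# Count each key and turn the table into a two-column data frame
counts <- as.data.frame(table(combo), stringsAsFactors = FALSE)
names(counts) <- c("combination", "n")

counts
```

Keeping the counts in a numeric column avoids the coercion to character that cbind(c("ABC", 4), ...) would cause.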