Merging boolean columns into one with R

Problem description (votes: 0, answers: 3)

I have a dataset which looks like this:

  X0.501_0.499.1 X0.400_0.600 X0.400_0.600.1 X1.Octanol X1.Propanol X2.Butanol X2.propanol X1.isobutanol
1         -39.91       -63.62         -53.14          1           0          0           0             0
2         -48.68       -73.45         -63.17          1           0          0           0             0
3         -57.89       -84.45         -73.99          1           0          0           0             0
4         -65.99       -92.61         -83.37          1           0          0           0             0
5         -72.62      -101.14         -91.33          1           0          0           0             0
6        -167.42      -263.80        -218.03          0           1          0           0             0

I want to merge the last 5 columns into a single column, with a result like this:

  X0.501_0.499.1 X0.400_0.600 X0.400_0.600.1  Type 
1         -39.91       -63.62         -53.14  X1.Octanol          
2         -48.68       -73.45         -63.17  X1.Octanol           
3         -57.89       -84.45         -73.99  X1.Octanol           
4         -65.99       -92.61         -83.37  X1.Octanol           
5         -72.62      -101.14         -91.33  X1.Octanol           
6        -167.42      -263.80        -218.03  X1.Propanol          

Can anyone provide a solution?

r artificial-intelligence data-manipulation decision-tree
3 Answers
0
votes

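Here is another base R solution, which uses matrix multiplication to get the column names: multiplying the 0/1 indicator matrix by the column vector 1, 2, ..., k yields, for each row, the position of the column that contains the 1.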

u <- df[-(1:3)]  # the 0/1 indicator columns
dfout <- cbind(df[1:3], Type = names(u)[as.matrix(u) %*% seq_len(ncol(u))])

which gives

> dfout
  X0.501_0.499.1 X0.400_0.600 X0.400_0.600.1        Type
1         -39.91       -63.62         -53.14  X1.Octanol
2         -48.68       -73.45         -63.17  X1.Octanol
3         -57.89       -84.45         -73.99  X1.Octanol
4         -65.99       -92.61         -83.37  X1.Octanol
5         -72.62      -101.14         -91.33  X1.Octanol
6        -167.42      -263.80        -218.03 X1.Propanol
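
To see why the matrix product works, it may help to look at the intermediate index vector. The sketch below uses a small made-up data frame (`toy`, with hypothetical columns `x`, `A`, `B`, `C`) and assumes every row has exactly one 1 among the indicator columns; a row with no 1 would give index 0, and a row with several would give the sum of their positions.

```r
# Hypothetical toy data: one numeric column followed by three 0/1 indicator columns.
toy <- data.frame(x = c(1.5, 2.5, 3.5),
                  A = c(1L, 0L, 0L),
                  B = c(0L, 0L, 1L),
                  C = c(0L, 1L, 0L))

ind <- as.matrix(toy[-1])                  # 3 x 3 indicator matrix
idx <- drop(ind %*% seq_len(ncol(ind)))    # per row: 1*A + 2*B + 3*C
idx                                        # position of the 1 in each row

# Use the positions to look up the matching column names.
toy_out <- cbind(toy[1], Type = colnames(ind)[idx])
toy_out
```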

0
votes

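We can use max.col to get, for each row, the index of the column holding the maximum value (here, the 1), and then use that index to look up the corresponding column name in the dataset.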

Type <- names(df1)[4:ncol(df1)][max.col(df1[4:ncol(df1)], 'first')]
df2 <- cbind(df1[1:3], Type = Type)
df2
#  X0.501_0.499.1 X0.400_0.600 X0.400_0.600.1        Type
#1         -39.91       -63.62         -53.14  X1.Octanol
#2         -48.68       -73.45         -63.17  X1.Octanol
#3         -57.89       -84.45         -73.99  X1.Octanol
#4         -65.99       -92.61         -83.37  X1.Octanol
#5         -72.62      -101.14         -91.33  X1.Octanol
#6        -167.42      -263.80        -218.03 X1.Propanol
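
One caveat worth noting: `max.col` always returns some column, so with `ties.method = "first"` a row whose indicator columns are all 0 would be labelled with the first column name. A minimal sketch of one possible guard, using made-up data (`toy` and its columns are hypothetical), that leaves `Type` as `NA` for such rows:

```r
# Hypothetical toy data; the second row has no indicator switched on.
toy <- data.frame(x = c(1.5, 2.5, 3.5),
                  A = c(1L, 0L, 0L),
                  B = c(0L, 0L, 1L),
                  C = c(0L, 0L, 0L))

ind <- toy[-1]                                # indicator columns only
idx <- max.col(ind, ties.method = "first")    # column index of each row's maximum
idx[rowSums(ind) == 0] <- NA                  # no 1 in the row: leave Type missing

toy_out <- cbind(toy[1], Type = names(ind)[idx])
toy_out
```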

Data

df1 <- structure(list(X0.501_0.499.1 = c(-39.91, -48.68, -57.89, -65.99, 
-72.62, -167.42), X0.400_0.600 = c(-63.62, -73.45, -84.45, -92.61, 
-101.14, -263.8), X0.400_0.600.1 = c(-53.14, -63.17, -73.99, 
-83.37, -91.33, -218.03), X1.Octanol = c(1L, 1L, 1L, 1L, 1L, 
0L), X1.Propanol = c(0L, 0L, 0L, 0L, 0L, 1L), X2.Butanol = c(0L, 
0L, 0L, 0L, 0L, 0L), X2.propanol = c(0L, 0L, 0L, 0L, 0L, 0L), 
    X1.isobutanol = c(0L, 0L, 0L, 0L, 0L, 0L)),
    class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))

0
votes

The tidyverse way:

library(tidyverse)  # provides %>%, pivot_longer(), filter() and select()
data %>% pivot_longer(-c(1:3), names_to = "Type") %>% filter(value == 1) %>% select(-value)
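
For a self-contained illustration of this reshaping approach, here is a sketch on made-up data (`toy` and its columns are hypothetical), assuming the dplyr and tidyr packages are installed. Unlike the base R answers, the filter step drops any row that has no 1 and repeats any row that has more than one.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one numeric column followed by three 0/1 indicator columns.
toy <- data.frame(x = c(1.5, 2.5, 3.5),
                  A = c(1L, 0L, 0L),
                  B = c(0L, 0L, 1L),
                  C = c(0L, 1L, 0L))

toy_out <- toy %>%
  pivot_longer(-x, names_to = "Type") %>%  # one row per (original row, indicator column)
  filter(value == 1) %>%                   # keep only the indicator that is switched on
  select(-value)                           # the 0/1 value itself is no longer needed
toy_out
```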