I have a dataset which looks like this:
X0.501_0.499.1 X0.400_0.600 X0.400_0.600.1 X1.Octanol X1.Propanol X2.Butanol X2.propanol X1.isobutanol
1 -39.91 -63.62 -53.14 1 0 0 0 0
2 -48.68 -73.45 -63.17 1 0 0 0 0
3 -57.89 -84.45 -73.99 1 0 0 0 0
4 -65.99 -92.61 -83.37 1 0 0 0 0
5 -72.62 -101.14 -91.33 1 0 0 0 0
6 -167.42 -263.80 -218.03 0 1 0 0 0
I want to collapse the last 5 columns into a single column, like this:
X0.501_0.499.1 X0.400_0.600 X0.400_0.600.1 Type
1 -39.91 -63.62 -53.14 X1.Octanol
2 -48.68 -73.45 -63.17 X1.Octanol
3 -57.89 -84.45 -73.99 X1.Octanol
4 -65.99 -92.61 -83.37 X1.Octanol
5 -72.62 -101.14 -91.33 X1.Octanol
6 -167.42 -263.80 -218.03 X1.Propanol
Can anyone provide a solution?
Here is another base R solution, which uses matrix multiplication to get the column names:
dfout <- cbind(df[1:3],Type=names(u<-df[-(1:3)])[as.matrix(u) %*% t(t(1:ncol(u)))])
which gives
> dfout
X0.501_0.499.1 X0.400_0.600 X0.400_0.600.1 Type
1 -39.91 -63.62 -53.14 X1.Octanol
2 -48.68 -73.45 -63.17 X1.Octanol
3 -57.89 -84.45 -73.99 X1.Octanol
4 -65.99 -92.61 -83.37 X1.Octanol
5 -72.62 -101.14 -91.33 X1.Octanol
6 -167.42 -263.80 -218.03 X1.Propanol
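To see why the multiplication works: each row of the indicator block holds a single 1, so multiplying the block by the column vector 1, 2, ..., k returns the position of that 1 in each row. A minimal sketch on a toy matrix (the names `ind` and `idx` and the data are invented for illustration, not taken from the question):

```r
# Toy one-hot block with exactly one 1 per row (hypothetical data)
ind <- matrix(c(1, 0, 0,
                0, 1, 0,
                0, 0, 1,
                1, 0, 0),
              ncol = 3, byrow = TRUE,
              dimnames = list(NULL, c("a", "b", "c")))

# Row i times the column vector (1, 2, 3) gives the position of the 1 in row i
idx <- drop(ind %*% seq_len(ncol(ind)))

# Use those positions to index into the column names
colnames(ind)[idx]
```

This relies on every row containing exactly one 1. A row of all zeros would produce index 0, which R silently drops when subsetting, and a row with several 1s would produce the sum of their positions.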
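We can use max.col to get, for each row, the index of the column holding the maximum, and then use that index to look up the dataset's column names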
Type <- names(df1)[4:ncol(df1)][max.col(df1[4:ncol(df1)], 'first')]
df2 <- cbind(df1[1:3], Type = Type)
df2
# X0.501_0.499.1 X0.400_0.600 X0.400_0.600.1 Type
#1 -39.91 -63.62 -53.14 X1.Octanol
#2 -48.68 -73.45 -63.17 X1.Octanol
#3 -57.89 -84.45 -73.99 X1.Octanol
#4 -65.99 -92.61 -83.37 X1.Octanol
#5 -72.62 -101.14 -91.33 X1.Octanol
#6 -167.42 -263.80 -218.03 X1.Propanol
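One thing worth checking with max.col is how ties are handled: its default is `ties.method = "random"`, and with `"first"` a row that contains no 1 at all is still assigned the leftmost column. A small sketch of that behaviour and one possible guard, assuming a made-up indicator block `ind` (not the question's data):

```r
# Hypothetical indicator block; row 3 has no 1 at all
ind <- data.frame(a = c(1, 0, 0), b = c(0, 1, 0), c = c(0, 0, 0))

# With ties.method = "first", ties resolve to the leftmost column
idx <- max.col(ind, ties.method = "first")
type <- names(ind)[idx]   # row 3 is labelled "a" although it has no 1

# One possible guard: mark rows without any 1 as missing
type[rowSums(ind) == 0] <- NA
type
```

For the data in the question every row has exactly one 1, so the guard changes nothing there.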
Data:
df1 <- structure(list(X0.501_0.499.1 = c(-39.91, -48.68, -57.89, -65.99,
-72.62, -167.42), X0.400_0.600 = c(-63.62, -73.45, -84.45, -92.61,
-101.14, -263.8), X0.400_0.600.1 = c(-53.14, -63.17, -73.99,
-83.37, -91.33, -218.03), X1.Octanol = c(1L, 1L, 1L, 1L, 1L,
0L), X1.Propanol = c(0L, 0L, 0L, 0L, 0L, 1L), X2.Butanol = c(0L,
0L, 0L, 0L, 0L, 0L), X2.propanol = c(0L, 0L, 0L, 0L, 0L, 0L),
X1.isobutanol = c(0L, 0L, 0L, 0L, 0L, 0L)),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
The tidyverse way:
library(tidyverse)  # loads dplyr (%>%, filter, select) and tidyr (pivot_longer)
data %>% pivot_longer(-c(1:3), names_to = "Type") %>% filter(value == 1) %>% select(-value)
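A self-contained sketch of the same pipeline, spelled out step by step. It assumes dplyr and tidyr (1.0.0 or later, for `pivot_longer`) are installed; the data frame `toy` and its column `x` are invented for illustration:

```r
library(dplyr)
library(tidyr)

# Hypothetical two-row version of the question's data
toy <- data.frame(x = c(-39.91, -167.42),
                  X1.Octanol  = c(1L, 0L),
                  X1.Propanol = c(0L, 1L))

res <- toy %>%
  pivot_longer(-x, names_to = "Type") %>%  # one row per (original row, indicator column)
  filter(value == 1) %>%                   # keep only the indicator that is switched on
  select(-value)                           # the 0/1 flag is no longer needed
res
```

Note that the result is a tibble rather than a plain data frame, and that any row with no 1 in the indicator columns would be dropped by the `filter` step.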