Attempting a combinations analysis, but 1s turn into 0s during mutate

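I am trying to run a combinations analysis and show the result as a plot. I have a data frame with 9 columns; each column holds a percentage, or NA if that value is absent from the sample.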

The example code I based this on can be found here: https://epirhandbook.com/en/combinations-analysis.html

The problem is that one line turns the 1s into 0s and vice versa. The line is:

data <- data %>%
  mutate(across(all_of(columns), ~ as.integer(. %in% c("yes", NA))))

The full code I am using is:

library(tidyverse)
library(UpSetR)
library(ggupset)

data <- META_new[c("lengthpergram","countpergram","acrylrel",
                   "cottonrel","polyestrel","polyamiderel",
                   "elastaanrel","lyocellrel","viscoserel",
                   "nylonrel","wolrel")]

columns <- c("acrylrel", "cottonrel", "polyestrel", "polyamiderel",
             "elastaanrel", "lyocellrel", "viscoserel", "nylonrel", "wolrel")

for (col in columns) {
  data[[col]][data[[col]] > 0] <- "yes"
  data[[col]][data[[col]] == 0] <- NA
}

data <- data %>%
  mutate(acrylrel = ifelse(acrylrel == "yes", 1, 0),
         cottonrel = ifelse(cottonrel == "yes", 1, 0),
         polyestrel = ifelse(polyestrel == "yes", 1, 0),
         polyamiderel = ifelse(polyamiderel == "yes", 1, 0),
         elastaanrel = ifelse(elastaanrel == "yes", 1, 0),
         lyocellrel = ifelse(lyocellrel == "yes", 1, 0),
         viscoserel = ifelse(viscoserel == "yes", 1, 0),
         nylonrel = ifelse(nylonrel == "yes", 1, 0),
         wolrel = ifelse(wolrel== "yes", 1, 0),)

data <- data %>%
  mutate(across(all_of(columns), ~ as.integer(. %in% c("yes", NA))))

data %>%
  UpSetR::upset(
    sets = columns,
    order.by = "freq",
    sets.bar.color = c("red", "orange", "yellow", "green", "cyan", "blue", "purple", "pink", "salmon"),
    empty.intersections = "on",
    number.angles = 0,
    point.size = 2,
    line.size = 1, 
    mainbar.y.label = "Fabric combinations by frequency",
    sets.x.label = "Types of fabric present in samples")

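The code produces a nice plot, but it assigns the wrong column names to the values. For example, polyester should be the most common, yet that position is labelled lyocellrel, even though lyocellrel is the least common.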

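Unfortunately I cannot add the data frame because it is too large, but I hope someone has a suggestion for how to fix this (if this line is even the problem).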

I changed some of the original code from the website. The original was:

 mutate(across(c(fever, chills, cough, aches, vomit), .fns = ~+(.x == "yes")))

because when I tried it, I got this error:

Error in start_col:end_col : argument of length 0

The first 5 rows:

data <- data.frame(
  acrylrel = c(0.00000, 0.00000, 0.00000, 36.61972, 0.00000),
  cottonrel = c(9.089974, 65.000000, 0.000000, 19.014085, 8.500000),
  polyestrel = c(83.72237, 35.00000, 42.81081, 44.36620, 15.00000),
  polyamiderel = c(5.583548, 0.000000, 53.594595, 0.000000, 40.000000),
  elastaanrel = c(1.604113, 0.000000, 3.594595, 0.000000, 1.500000),
  lyocellrel = c(0, 0, 0, 0, 0),
  viscoserel = c(0, 0, 0, 0, 0),
  nylonrel = c(0, 0, 0, 0, 0),
  wolrel = c(0, 0, 0, 0, 0)
)

1 Answer

This seems to be what you want:

data %>%
  mutate(across(everything(), ~ as.integer(. > 0))) %>%
  UpSetR::upset(
    sets = columns,
    order.by = "freq",
    sets.bar.color = c("red", "orange", "yellow", "green", "cyan", "blue", "purple", "pink", "salmon"),
    empty.intersections = "on",
    number.angles = 0,
    point.size = 2,
    line.size = 1, 
    mainbar.y.label = "Fabric combinations by frequency",
    sets.x.label = "Types of fabric present in samples")
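
A base-R sketch of the same conversion, in case it helps to check the result before plotting. It uses the five sample rows from the question. `to_presence` is a hypothetical helper name, and treating NA as "absent" is an assumption based on the question's description of the data, not something the code above does.

```r
# Sample rows copied from the question (first five rows of the data)
data <- data.frame(
  acrylrel     = c(0.00000, 0.00000, 0.00000, 36.61972, 0.00000),
  cottonrel    = c(9.089974, 65.000000, 0.000000, 19.014085, 8.500000),
  polyestrel   = c(83.72237, 35.00000, 42.81081, 44.36620, 15.00000),
  polyamiderel = c(5.583548, 0.000000, 53.594595, 0.000000, 40.000000),
  elastaanrel  = c(1.604113, 0.000000, 3.594595, 0.000000, 1.500000),
  lyocellrel   = c(0, 0, 0, 0, 0),
  viscoserel   = c(0, 0, 0, 0, 0),
  nylonrel     = c(0, 0, 0, 0, 0),
  wolrel       = c(0, 0, 0, 0, 0)
)
columns <- names(data)

# Hypothetical helper: 1 if the fibre is present (> 0), 0 if it is 0 or NA
# (assumption: NA means the fibre is absent from the sample)
to_presence <- function(x) as.integer(!is.na(x) & x > 0)

presence <- data
presence[columns] <- lapply(presence[columns], to_presence)

# Number of samples containing each fibre; compare these with the set-size bars
counts <- colSums(presence[columns])
print(counts)
```

On these five rows, polyestrel is present in every sample and lyocellrel in none, so the column sums are a quick way to confirm that the plot labels line up with the data.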

Going through your code piece by piece:

# this turns every value into "yes" if positive, or NA if 0
for (col in columns) {
  data[[col]][data[[col]] > 0] <- "yes"
  data[[col]][data[[col]] == 0] <- NA
}

# this is the same as above, but every "yes" has been turned into 1. The NA values stay NA rather than becoming 0, because (frustratingly!) NA == "yes" is NA, not FALSE as you might expect, and ifelse() returns NA wherever its test is NA. The way to check for NA values is with the function is.na()
data <- data %>%
  mutate(acrylrel = ifelse(acrylrel == "yes", 1, 0),
         cottonrel = ifelse(cottonrel == "yes", 1, 0),
         polyestrel = ifelse(polyestrel == "yes", 1, 0),
         polyamiderel = ifelse(polyamiderel == "yes", 1, 0),
         elastaanrel = ifelse(elastaanrel == "yes", 1, 0),
         lyocellrel = ifelse(lyocellrel == "yes", 1, 0),
         viscoserel = ifelse(viscoserel == "yes", 1, 0),
         nylonrel = ifelse(nylonrel == "yes", 1, 0),
         wolrel = ifelse(wolrel== "yes", 1, 0),)
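
To see why the NA values survive this step, here is a minimal sketch on a made-up vector (`x` is illustrative and not taken from the data):

```r
# Illustrative column after the for loop: "yes" = present, NA = absent
x <- c("yes", NA, "yes")

# Comparing NA with == gives NA, so ifelse() has nothing to decide on
# and passes the NA through instead of returning 0
via_equals <- ifelse(x == "yes", 1, 0)

# Guarding the comparison with is.na() makes the absent case an explicit 0
via_is_na <- ifelse(!is.na(x) & x == "yes", 1, 0)

print(via_equals)
print(via_is_na)
```

Only the second version yields a clean 0/1 indicator; the first still carries NA for every absent fibre.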

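# with this line, because you've already turned the "yes" values into 1s, `. %in% c("yes", NA)` evaluates to FALSE for the 1s and TRUE for the NA values (unlike ==, %in% does match NA against NA), so as.integer() turns every 1 into 0 and every NA into 1, which is the flip you are seeing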
data <- data %>%
  mutate(across(all_of(columns), ~ as.integer(. %in% c("yes", NA))))
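
The flip can be reproduced on a small made-up vector (`x` is illustrative; it stands in for one column after the `ifelse()` step):

```r
# Illustrative column after the ifelse() step: 1 = present, NA = absent
x <- c(1, NA, 1)

# %in% is built on match(), which treats NA as equal to NA,
# while 1 matches neither "yes" nor NA
matched <- x %in% c("yes", NA)

# Converting the logical result to integer inverts the indicator
flipped <- as.integer(matched)

print(matched)
print(flipped)
```

Present fibres end up as 0 and absent ones as 1, which would explain why the rarest fibre appears to be the most common in the plot.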