filter() does not work inside a for loop

Question (votes: 0, answers: 2)

I wrote the following code:

miRNA.list <- c("let-7a-5p", "let-7a-1-3p", "let-7b-5p")
summary.df <- data.frame()
for (miRNA in miRNA.list) {
  
  temp.name <- miRNA
  
  temp.df <- df.mirna.pv %>%
              filter(`temp.name` == "yes") %>%
              summarise(downregulated = sum(str_count(status, "downregulated")),
                        upregulated = sum(str_count(status, "upregulated")),
                        all = n())
  
  summary.df <- rbind(summary.df, temp.df)
  
}

It is meant to filter the data frame below on each "let-xxx" column and then count the number of upregulated and downregulated genes:

print(df.mirna.pv)

          let-7a-5p    let-7a-1-3p   let-7b-5p            status
Xkr4          no            yes          no               upregulated
Mrpl15        yes           yes          no               downregulated
Lypla1        yes           yes          yes              downregulated 
Tcea1         no            yes          no               not significant  

However, for some reason it fails to match the names in miRNA.list to the column names, or at least I think that is the problem, because this is my output:

downregulated upregulated all
1             0           0   0
2             0           0   0
3             0           0   0
4             0           0   0
5             0           0   0
6             0           0   0

Any ideas about what might be going on and how to fix it?

r dataframe for-loop dplyr
2 Answers

0 votes

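You are mixing the normal (interactive) and programmatic uses of dplyr. That is, filter(`temp.name` == "yes") looks for a column literally named "temp.name", not for the column referred to indirectly through the local variable temp.name. Since no such column exists, the expression falls back to the variable's value, compares a string such as "let-7a-5p" with "yes", and drops every row.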
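To make the difference concrete, here is a minimal sketch on a made-up three-row data frame (toy is hypothetical, not the asker's data). It assumes dplyr's .data pronoun, which is one common way to refer to a column whose name is held in a string.

```r
library(dplyr)

# Hypothetical toy data; check.names = FALSE keeps the hyphenated column name.
toy <- data.frame(
  "let-7a-5p" = c("no", "yes", "yes"),
  status = c("upregulated", "downregulated", "downregulated"),
  check.names = FALSE
)
temp.name <- "let-7a-5p"

# No column is called temp.name, so dplyr uses the variable's string value:
# "let-7a-5p" == "yes" is FALSE, and every row is dropped.
wrong <- filter(toy, temp.name == "yes")

# .data[[temp.name]] looks up the column whose name is stored in temp.name.
right <- filter(toy, .data[[temp.name]] == "yes")

nrow(wrong)
nrow(right)
```

The first call returns no rows without raising an error, which is why the loop in the question silently produces zeros.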

I think this might be what you want?

library(dplyr)
library(tidyr)
tmp <- pivot_longer(quux, cols = -status)
count(tmp, name, status) |>
  pivot_wider(id_cols = name, names_from = status, values_from = n) |>
  left_join(count(tmp, name, name = "all"), by = "name")
# # A tibble: 3 × 5
#   name        downregulated `not significant` upregulated   all
#   <chr>               <int>             <int>       <int> <int>
# 1 let-7a-1-3p             2                 1           1     4
# 2 let-7a-5p               2                 1           1     4
# 3 let-7b-5p               2                 1           1     4

You can drop the `not significant` column if you don't need it.


Data

quux <- structure(list("let-7a-5p" = c("no", "yes", "yes", "no"), "let-7a-1-3p" = c("yes", "yes", "yes", "yes"), "let-7b-5p" = c("no", "no", "yes", "no"), status = c("upregulated", "downregulated", "downregulated", "not significant")), row.names = c("Xkr4", "Mrpl15", "Lypla1", "Tcea1"), class = "data.frame")
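Note that the pipeline above counts every row for each miRNA regardless of its yes/no value, which is why all three output rows are identical. The sketch below is one possible variation, not the answerer's code: it assumes you only want rows flagged "yes", fills absent statuses with 0 via values_fill, and drops the `not significant` column. The name yes_counts is introduced here for illustration.

```r
library(dplyr)
library(tidyr)

quux <- data.frame(
  "let-7a-5p"   = c("no", "yes", "yes", "no"),
  "let-7a-1-3p" = c("yes", "yes", "yes", "yes"),
  "let-7b-5p"   = c("no", "no", "yes", "no"),
  status = c("upregulated", "downregulated", "downregulated", "not significant"),
  row.names = c("Xkr4", "Mrpl15", "Lypla1", "Tcea1"),
  check.names = FALSE
)

# Keep only the gene/miRNA pairs flagged "yes" before counting.
tmp <- quux |>
  pivot_longer(cols = -status) |>
  filter(value == "yes")

yes_counts <- count(tmp, name, status) |>
  # values_fill = 0 gives 0 instead of NA for statuses a miRNA never has
  pivot_wider(id_cols = name, names_from = status,
              values_from = n, values_fill = 0) |>
  # optional: drop the column that is not needed
  select(-`not significant`) |>
  left_join(count(tmp, name, name = "all"), by = "name")

yes_counts
```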

0 votes

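If you want to fix the for loop, you need to pass !!sym(temp.name) to dplyr::filter(). sym() turns the string into a symbol, and !! unquotes it so that filter() treats it as a column name: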

library(dplyr)
library(stringr)  # provides str_count(), used in summarise() below

df.mirna.pv <- structure(list("let-7a-5p" = c("no", "yes", "yes", "no"), 
                       "let-7a-1-3p" = c("yes", "yes", "yes", "yes"), 
                       "let-7b-5p" = c("no", "no", "yes", "no"), 
                       status = c("upregulated", "downregulated",
                                  "downregulated", "not significant")), 
                  row.names = c("Xkr4", "Mrpl15", "Lypla1", "Tcea1"), 
                  class = "data.frame")

miRNA.list <- c("let-7a-5p", "let-7a-1-3p", "let-7b-5p")
summary.df <- data.frame()
for (miRNA in miRNA.list) {
  
  temp.name <- miRNA
  
  temp.df <- df.mirna.pv %>%
    filter(!!sym(temp.name) == "yes") %>%
    summarise(downregulated = sum(str_count(status, "downregulated")),
              upregulated = sum(str_count(status, "upregulated")),
              all = n())
  
  summary.df <- rbind(summary.df, temp.df)
  
}

summary.df %>% mutate(name = miRNA.list, .before = 1)

#>          name downregulated upregulated all
#> 1   let-7a-5p             2           0   2
#> 2 let-7a-1-3p             2           1   4
#> 3   let-7b-5p             1           0   1
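As a possible alternative to growing summary.df with rbind() inside the loop, the same summary can be sketched with lapply() and bind_rows(). This is only an illustration: summarise_one is a hypothetical helper name, and it uses the .data pronoun in place of !!sym() and a plain comparison in place of str_count().

```r
library(dplyr)

df.mirna.pv <- data.frame(
  "let-7a-5p"   = c("no", "yes", "yes", "no"),
  "let-7a-1-3p" = c("yes", "yes", "yes", "yes"),
  "let-7b-5p"   = c("no", "no", "yes", "no"),
  status = c("upregulated", "downregulated", "downregulated", "not significant"),
  row.names = c("Xkr4", "Mrpl15", "Lypla1", "Tcea1"),
  check.names = FALSE
)
miRNA.list <- c("let-7a-5p", "let-7a-1-3p", "let-7b-5p")

# Hypothetical helper: one summary row for the genes where `mirna` is "yes".
summarise_one <- function(mirna) {
  df.mirna.pv %>%
    filter(.data[[mirna]] == "yes") %>%
    summarise(name = mirna,
              downregulated = sum(status == "downregulated"),
              upregulated = sum(status == "upregulated"),
              all = n())
}

# Build all rows first, then stack them once.
summary.df <- bind_rows(lapply(miRNA.list, summarise_one))
summary.df
```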

But we can do this more easily with tidyr::pivot_longer():

df.mirna.pv %>% 
  tidyr::pivot_longer(-status) %>% 
  filter(value == "yes") %>% 
  # .by = name already keeps `name` as the first column (needs dplyr >= 1.1.0)
  summarise(downregulated = sum(status == "downregulated"),
            upregulated = sum(status == "upregulated"),
            all = n(),
            .by = name)
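The pipeline above filters on value and groups by name, which are the default column names that pivot_longer() creates. A minimal sketch of that intermediate step, using the same example data (the object name long is introduced here for illustration):

```r
library(tidyr)

df.mirna.pv <- data.frame(
  "let-7a-5p"   = c("no", "yes", "yes", "no"),
  "let-7a-1-3p" = c("yes", "yes", "yes", "yes"),
  "let-7b-5p"   = c("no", "no", "yes", "no"),
  status = c("upregulated", "downregulated", "downregulated", "not significant"),
  row.names = c("Xkr4", "Mrpl15", "Lypla1", "Tcea1"),
  check.names = FALSE
)

# Every column except status is stacked: the old column name goes into
# `name`, the yes/no cell goes into `value`.
long <- pivot_longer(df.mirna.pv, -status)

names(long)  # "status" "name" "value"
nrow(long)   # 12 (4 genes x 3 miRNA columns)
```

After this reshaping, a single filter(value == "yes") replaces the per-column filtering that the for loop was doing.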