Automatically mutate over a list of lists


How can I add one logical column per module to a data frame, containing TRUE if the gene is in that module and FALSE if it is not?

gene_express = data.frame(
  gene = c('gene1', 'gene2', 'gene3', 'gene4', 'gene5',
           'gene6', 'gene7', 'gene8', 'gene9', 'gene10'),
  sample1 = sample(0:10, 10),
  sample2 = sample(0:10, 10),
  sample3 = sample(0:10, 10),
  sample4 = sample(0:10, 10)
)
module1 = c('gene1', 'gene2', 'gene10', 'gene8')
module2 = c('gene2', 'gene9', 'gene6', 'gene5', 'gene10')
module3 = c('gene4', 'gene10', 'gene1', 'gene8')
module4 = c('gene5', 'gene8', 'gene2', 'gene7', 'gene6', 'gene5', 'gene10')
module5 = c('gene2', 'gene9', 'gene6', 'gene5', 'gene10')
module6 = c('gene4', 'gene10', 'gene1', 'gene8')
Module_list = list(module1, module2, module3, module4, module5, module6)
names(Module_list) <- c('module1', 'module2', 'module3',
                        'module4', 'module5', 'module6')

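In reality I have hundreds of these modules, and they are already stored in a named list, just like "Module_list" in my example. How can I mutate the "gene_express" data frame so that each module name becomes a new column containing TRUE if the gene is in the module and FALSE if it is not?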

The manual way is to spell out each module's members inside mutate(), as I do here.

Current code

library(dplyr)  # case_match() requires dplyr >= 1.1.0

# Gene sets for module4 to module6 corrected to match Module_list above
gene_express %>% mutate(
  module1 = case_match(gene, c("gene1", "gene2", "gene8", "gene10") ~ TRUE, .default = FALSE),
  module2 = case_match(gene, c("gene2", "gene9", "gene6", "gene5", "gene10") ~ TRUE, .default = FALSE),
  module3 = case_match(gene, c("gene4", "gene10", "gene1", "gene8") ~ TRUE, .default = FALSE),
  module4 = case_match(gene, c("gene5", "gene2", "gene7", "gene8", "gene6", "gene10") ~ TRUE, .default = FALSE),
  module5 = case_match(gene, c("gene2", "gene9", "gene6", "gene5", "gene10") ~ TRUE, .default = FALSE),
  module6 = case_match(gene, c("gene4", "gene10", "gene1", "gene8") ~ TRUE, .default = FALSE))

What I want is to avoid specifying the modules by hand inside mutate().

r dataframe dplyr
1 Answer

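Maybe something like this? Here I put the list of genes per module into a long data frame (one row per module/gene pair), join it to the original data, and then pivot it wide, filling the combinations that did not join with FALSE.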

library(dplyr); library(purrr); library(tidyr)

# Long format: one row per (module, gene) pair
Module_df <- Module_list |>
  map_dfr(as.data.frame, .id = "module") |>
  rename(gene = 2)

gene_express |>
  left_join(Module_df |> mutate(val = TRUE), by = "gene") |>
  pivot_wider(names_from = module, values_from = val,
              values_fn = first, values_fill = FALSE) |>
  # A gene in no module (gene3 here) joins with module = NA, which
  # pivot_wider() turns into a spurious `NA` column; drop it if present.
  select(!any_of("NA"))

Result (the sample columns come from sample(), so their values differ from run to run)

# A tibble: 10 × 11
   gene   sample1 sample2 sample3 sample4 module1 module3 module6 module2 module4 module5
   <chr>    <int>   <int>   <int>   <int> <lgl>   <lgl>   <lgl>   <lgl>   <lgl>   <lgl>
 1 gene1       10       0       3       4 TRUE    TRUE    TRUE    FALSE   FALSE   FALSE
 2 gene2        5       8       5       5 TRUE    FALSE   FALSE   TRUE    TRUE    TRUE
 3 gene3        8       9       7       2 FALSE   FALSE   FALSE   FALSE   FALSE   FALSE
 4 gene4        1       5       9       0 FALSE   TRUE    TRUE    FALSE   FALSE   FALSE
 5 gene5        4       4       8       3 FALSE   FALSE   FALSE   TRUE    TRUE    TRUE
 6 gene6        6      10       0       9 FALSE   FALSE   FALSE   TRUE    TRUE    TRUE
 7 gene7        3       1       1       7 FALSE   FALSE   FALSE   FALSE   TRUE    FALSE
 8 gene8        2       3       6       6 TRUE    TRUE    TRUE    FALSE   TRUE    FALSE
 9 gene9        0       2       4       1 FALSE   FALSE   FALSE   TRUE    FALSE   TRUE
10 gene10       7       6       2      10 TRUE    TRUE    TRUE    TRUE    TRUE    TRUE
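
If the join and pivot feel heavier than needed, a direct membership test may be enough. The following is only a sketch in base R, not part of the answer above: it assumes the gene names are in a column called gene and that the modules live in a named list of character vectors, as in the question. The toy data is trimmed to one sample column and two modules to keep it short, and the names membership and result are placeholders chosen for this example.

```r
# Toy data based on the question, trimmed for brevity (assumption: same shape as the real data)
gene_express <- data.frame(
  gene = paste0("gene", 1:10),
  sample1 = 0:9
)
Module_list <- list(
  module1 = c("gene1", "gene2", "gene10", "gene8"),
  module4 = c("gene5", "gene8", "gene2", "gene7", "gene6", "gene5", "gene10")
)

# For each module, test every gene for membership with %in%.
# vapply() returns a logical matrix with one column per module,
# and the column names are taken from names(Module_list).
membership <- vapply(
  Module_list,
  function(genes) gene_express$gene %in% genes,
  logical(nrow(gene_express))
)

# Bind the logical columns onto the original data frame
result <- cbind(gene_express, membership)
print(result)
```

Because %in% returns FALSE for every gene that is not listed, genes that belong to no module simply get FALSE everywhere and no extra `NA` column appears. Duplicated genes inside a module (gene5 in module4) need no special handling either, and the module columns keep the order of the list.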