Subsetting columns of complex data frames inside multiple nested lists in R

I am trying to extract specific columns from a large list that contains multiple nested data frames. Here is my code and its output:

str(ls1)
List of 2
 $ CAT1:'data.frame':   603 obs. of  2 variables:
  ..$ M12:'data.frame': 603 obs. of  5 variables:
  .. ..$ chr        : Factor w/ 598 levels "chr1-105554500-105557462",..: 44 45 46 47 48 49 50 51 52 53 ...
  .. ..$ gene.name  : Factor w/ 551 levels "ENSMUST00000000028-Cdc45",..: 214 184 309 271 267 102 50 315 348 220 ...
  .. ..$ gene.length: int [1:603] 4380 4842 4278 406 357 610 1439 2081 1123 2200 ...
  .. ..$ dir        : Factor w/ 2 levels "-","+": 1 2 1 1 1 2 2 1 2 1 ...
  .. ..$ read.ct    : int [1:603] 307 91 89 84 204 36 10 37 102 77 ...
  ..$ M14:'data.frame': 603 obs. of  5 variables:
  .. ..$ chr        : Factor w/ 596 levels "chr1-105554500-105557462",..: 45 46 47 48 49 50 51 52 53 54 ...
  .. ..$ gene.name  : Factor w/ 549 levels "ENSMUST00000000028-Cdc45",..: 215 184 312 274 270 103 52 318 351 221 ...
  .. ..$ gene.length: int [1:603] 4380 4842 4278 406 357 610 1439 2081 1123 2200 ...
  .. ..$ dir        : Factor w/ 2 levels "-","+": 1 2 1 1 1 2 2 1 2 1 ...
  .. ..$ read.ct    : int [1:603] 370 104 112 89 139 45 12 60 93 70 ...
 $ CAT2:'data.frame':   109 obs. of  2 variables:
  ..$ M12:'data.frame': 109 obs. of  5 variables:
  .. ..$ chr        : Factor w/ 80 levels "chr1-121307307-121312200",..: 6 7 8 1 9 10 2 3 11 12 ...
  .. ..$ gene.name  : Factor w/ 80 levels "ENSMUST00000000365-Mcts1",..: 9 69 71 7 44 58 63 17 32 12 ...
  .. ..$ gene.length: int [1:109] 4205 3229 32462 4894 2048 9952 1334 3698 1787 11235 ...
  .. ..$ dir        : Factor w/ 2 levels "-","+": 2 1 1 1 1 1 1 2 2 2 ...
  .. ..$ read.ct    : int [1:109] 4 2 1 12 18 1 3 1 3 3 ...
  ..$ M14:'data.frame': 109 obs. of  5 variables:
  .. ..$ chr        : Factor w/ 85 levels "chr1-121307307-121312200",..: 7 8 1 9 10 2 11 12 13 3 ...
  .. ..$ gene.name  : Factor w/ 85 levels "ENSMUST00000002291-Paxip1",..: 6 71 4 45 61 65 59 8 9 15 ...
  .. ..$ gene.length: int [1:109] 4205 3229 4894 2048 9952 1334 780 569 11235 1348 ...
  .. ..$ dir        : Factor w/ 2 levels "-","+": 2 1 1 1 1 1 2 2 2 1 ...
  .. ..$ read.ct    : int [1:109] 21 3 6 22 5 2 3 1 1 1 ...

What I want is to extract the gene.name and read.ct columns from each sub-list (i.e. M12, M14). I would like the result to look like this:

List of 2
$ CAT1:'data.frame':  603 obs. of  2 variables:
..$ M12:'data.frame':    603 obs. of  2 variables:
.. ..$ gene.name  : Factor w/ 551 levels "ENSMUST00000000028-Cdc45",..: 214 184 309 271 267 102 50 315 348 220 ...
.. ..$ read.ct    : int [1:603] 307 91 89 84 204 36 10 37 102 77 ...
..$ M14:'data.frame':    603 obs. of  2 variables:
.. ..$ gene.name  : Factor w/ 549 levels "ENSMUST00000000028-Cdc45",..: 215 184 312 274 270 103 52 318 351 221 ...
.. ..$ read.ct    : int [1:603] 370 104 112 89 139 45 12 60 93 70 ...
$ CAT2:'data.frame':  109 obs. of  2 variables:
..$ M12:'data.frame':    109 obs. of  2 variables:
.. ..$ gene.name  : Factor w/ 80 levels "ENSMUST00000000365-Mcts1",..: 9 69 71 7 44 58 63 17 32 12 ...
.. ..$ read.ct    : int [1:109] 4 2 1 12 18 1 3 1 3 3 ...
..$ M14:'data.frame':    109 obs. of  2 variables:
.. ..$ gene.name  : Factor w/ 85 levels "ENSMUST00000002291-Paxip1",..: 6 71 4 45 61 65 59 8 9 15 ...
.. ..$ read.ct    : int [1:109] 21 3 6 22 5 2 3 1 1 1 ...

How should I write the code to get the desired output above? I tried the following:

ls2 <- lapply(ls1, function(x) {
  y <- x[c(1:2)][c("gene.name", "read.ct")]
  return(y)
})

But I get an error:

Error in `[.data.frame`(x[c(1:2)], c("gene.name", "read.ct")) : 
  undefined columns selected 

Any help would be greatly appreciated! Thank you.

r dataframe subset nested-lists
1 answer

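It seems the data.frames are nested inside the columns of the top-level data frames. That is why x[c(1:2)] fails: it still returns the outer data frame, whose only columns are M12 and M14, so gene.name and read.ct are undefined at that level. Apply the column selection to each inner data frame instead: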

# Outer lapply walks CAT1, CAT2; inner lapply walks the data-frame columns M12, M14
ls2 <- lapply(ls1, function(x) lapply(x, `[`, c("gene.name", "read.ct")))
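
To check the idea without the original data, here is a minimal sketch on a toy list that mimics the structure shown by str(ls1). The helpers make_inner and make_outer and all values in them are made up for illustration; only the column names come from the question.

```r
# Hypothetical helper: an inner data frame with the five columns from the question
make_inner <- function(n) {
  data.frame(
    chr         = paste0("chr", seq_len(n)),
    gene.name   = paste0("gene", seq_len(n)),
    gene.length = seq_len(n) * 100L,
    dir         = rep(c("-", "+"), length.out = n),
    read.ct     = seq_len(n)
  )
}

# Hypothetical helper: an outer data frame whose columns M12 and M14
# are themselves data frames (the nesting reported by str(ls1))
make_outer <- function(n) {
  structure(
    list(M12 = make_inner(n), M14 = make_inner(n)),
    class = "data.frame",
    row.names = seq_len(n)
  )
}

ls1 <- list(CAT1 = make_outer(3), CAT2 = make_outer(2))

# Keep only the two columns of interest in every inner data frame
keep <- c("gene.name", "read.ct")
ls2 <- lapply(ls1, function(x) lapply(x, `[`, keep))

str(ls2, max.level = 2)
```

Note that with this approach CAT1 and CAT2 come back as plain lists holding the M12 and M14 data frames, rather than as data frames with data-frame columns. The inner data frames themselves keep all their rows and only the two selected columns.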