How do I sort a DNAStringSet object?


I have an XStringSet object:

 A DNAStringSet instance of length 151674
          width seq                                                                names               
    [1]     253 GAACAGCATGAATGTTAAAACTGAAATGGATG...TGATGGTTAGGTTTTCAGAAAAAGCAGAAGA LGKD01000001.1 Oc...
    [2]  150158 TATATATATATAGTCAATTCGAGGATGTTAGA...TCCGGATACTATTCCAGAGTTTCCTTGCAAA KQ415657.1 Octopu...
    [3]     619 ATAGACATACACACAAATATTTTTATATCACA...TATATACATATTTATACATATATATATATAT LGKD01000030.1 Oc...
    [4]     359 TCACCAGTGGCAGCCGCGGCTACAGCAAAAGG...CACGGGCTGTACAACGACCCTGATGACTCCG LGKD01000031.1 Oc...
    [5]     239 GAAGTGGTAAAGAGTGCGATGCGCTGAAAAAA...CTCTTTTTTCAGCGCATCGCACTCTTTACCA LGKD01000032.1 Oc...
    ...     ... ...
[151670]    2021 AAAACCTAAACATGTTAAATCAGAGATTGCAA...ATATATAAGTATATATATATATATATATATA KQ434080.1 Octopu...
[151671]     420 CCCCACCTCCACTATCAACACCACTACCACCA...GAAGAAGAAGAAGAAGAAGAAGAAGAAGAAG LGKD01700121.1 Oc...
[151672]     424 ACACACACACACACACACACACATATACATAT...GTAAATGTGTCCGTGTGTAGTAAGCATGTGT LGKD01700122.1 Oc...
[151673]     242 ATATATATATATATATATACATCAACATATAT...ATATGTAGACGTGTGTGTATATATATATATA LGKD01700123.1 Oc...
[151674]     214 CACACACACACACACACACACACACACACACA...ACTCATATGTACAACACACATTTATACGCTT LGKD01700124.1 Oc...
>  

I sorted the widths in decreasing order and got this:

> sort_oc=sort(width(oc), decreasing = TRUE)

> sort_oc[1:10]
[1] 4064693 3315273 3181678 3174068 2987449 2908116 2784626 2705535 2686354 2631168

How can I get the sequence that corresponds to each of the sorted widths?

For example, I would like a result like this:

          width   seq                                                                names               
     [567] 4064693 GAACAGCATGAATGTTAAAACTGAAATGGATG...TGATGGTTAGGTTTTCAGAAAAAGCAGAAGA  LGKD01000001.1 Oc...           
     [350] 3315273 AAAACCTAAACATGTTAAATCAGAGATTGCAA...ATATATAAGTATATATATATATATATATATA KQ434080.1 Octopu... 

and so on.

r bioconductor
1 Answer

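Andrew's answer is very close, but because a DNAStringSet is not a data.frame, you need to get the widths with the Biostrings::width function rather than by ordinary column subsetting, and then index the set with the ordering of those widths: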

oc[order(width(oc), decreasing = TRUE)]

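This returns a DNAStringSet with the same sequences and names, ordered by decreasing width. Note that sort(width(oc)) sorts only the integer vector of widths, which is why the sequences were lost in the question.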
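To make this concrete, here is a minimal sketch on a toy set. The object `toy` and its sequence names are made up for illustration and stand in for `oc`; only `DNAStringSet()`, `width()`, `order()` and `[` are used.

```r
library(Biostrings)

# Toy stand-in for `oc`; sequences and names are invented for this example.
toy <- DNAStringSet(c(seqA = "ACGT",
                      seqB = "ACGTACGTAC",
                      seqC = "AC",
                      seqD = "ACGTAC"))

# order() returns the index permutation that sorts the widths,
# so it can be used to subset the DNAStringSet itself.
idx <- order(width(toy), decreasing = TRUE)
toy_sorted <- toy[idx]

width(toy_sorted)   # widths, now in decreasing order
names(toy_sorted)   # names travel with their sequences
toy_sorted[1:2]     # the two longest sequences
```

The vector `idx` also holds the original position of each sequence, which is probably what the bracketed indices in the desired output were meant to show.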
