I have a vector of strings that contain both character and numeric values. For example:
a=c("ILLUMINA:420:C2D7UACXX:1:1102:14591:91480","ILLUMINA:420:C2D7UACXX:1:1102:14592:3881","ILLUMINA:420:C2D7UACXX:1:1102:14592:37103","ILLUMINA:420:C2D7UACXX:1:1102:14592:37356")
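I want to order the vector so that the character parts are sorted alphabetically and the numeric parts are sorted numerically. The strings always have the format "ILLUMINA:420:C2D7UACXX:1:<number>:<number>:<number>", so in practice the ordering depends only on the last three colon-separated numbers.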
I also tried mixedsort {gtools}, but the result is the same as with sort and sort.int, namely:
> mixedsort(a)
[1] "ILLUMINA:420:C2D7UACXX:1:1102:14591:91480" "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103"
[3] "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356" "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881"
Clearly, the correct order should be:
[1] "ILLUMINA:420:C2D7UACXX:1:1102:14591:91480" "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881"
[3] "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103" "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356"
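Is there a straightforward solution?
Edit: completely changed the solution after the OP's clarification.
You can extract the last 3 elements into a data.frame and order by them: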
dat = read.table(text=sub('.*:1:([0-9]+):([0-9]+):([0-9]+)','\\1|\\2|\\3',a),sep='|')
dat
V1 V2 V3
1 1102 14591 91480
2 1102 14592 3881
3 1102 14592 37103
4 1102 14592 37356
Then you order using the 3 columns:
a[with(dat,order(V1,V2,V3))]
[1] "ILLUMINA:420:C2D7UACXX:1:1102:14591:91480" "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881"
[3] "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103" "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356"
gtools::mixedsort does actually work in your case:
> a=c("ILLUMINA:420:C2D7UACXX:1:1102:14591:91480",
"ILLUMINA:420:C2D7UACXX:1:1102:14592:3881",
"ILLUMINA:420:C2D7UACXX:1:1102:14592:37103",
"ILLUMINA:420:C2D7UACXX:1:1102:14592:37356")
>
> mixedsort(a)
[1] "ILLUMINA:420:C2D7UACXX:1:1102:14591:91480"
[2] "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881"
[3] "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103"
[4] "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356"
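I am using gtools_3.4.2 and R-3.2.0.
Here is a faster solution:
library(data.table)
# Split each string on ":" and keep fields 5-7 as numeric sort keys
fields.list = strsplit(a,split=":")
sort.dt = data.table(t(sapply(fields.list,function(x) as.numeric(c(x[5],x[6],x[7])))))
# Reorder the original vector a by the three numeric columns
sorted.a = a[with(sort.dt,order(V1,V2,V3))]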
> sorted.a
[1] "ILLUMINA:420:C2D7UACXX:1:1102:14591:91480" "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881" "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103"
[4] "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356"
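For anyone who would rather avoid the extra package, the same split-and-order idea can probably be done in base R alone. This is only a minimal sketch: the helper name sort_by_numeric_fields and its fields argument are made up for illustration, and it assumes every string has at least seven colon-separated fields with numeric values in positions 5 to 7.

```r
a <- c("ILLUMINA:420:C2D7UACXX:1:1102:14591:91480",
       "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103",
       "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356",
       "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881")

# Hypothetical helper (not from any package): order strings by the
# colon-separated fields given in `fields`, compared as numbers.
sort_by_numeric_fields <- function(x, fields = 5:7) {
  parts <- strsplit(x, ":", fixed = TRUE)
  # One row per requested field, one column per input string
  keys <- vapply(parts, function(p) as.numeric(p[fields]),
                 numeric(length(fields)))
  # Force matrix shape, since vapply returns a plain vector for a single field
  keys <- matrix(keys, nrow = length(fields))
  # Pass each row to order() as a separate sort key, in priority order
  key_list <- lapply(seq_len(nrow(keys)), function(i) keys[i, ])
  x[do.call(order, key_list)]
}

sorted_a <- sort_by_numeric_fields(a)
print(sorted_a)
```

Because the keys are converted with as.numeric before ordering, 3881 is compared as a number and lands before 37103, which is the behaviour the question asks for. Whether this is faster or slower than the other answers is not something I have measured.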