Expand lines with multiple data entries into separate lines, one per data entry

Question Votes: 1 Answers: 3

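I have a file in which the first column is an identifier and the rest of each line contains zero or more multi-digit numbers separated by single spaces.

For example: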

SOAP.k35.scaffold280 0003723 
SOAP.k35.scaffold421 
SOAP.k35.scaffold429 0004930 0016021
TRINITY_DN23171_c1_g1_i2 0006457 0005509 0030246 0051082 0005788
SOAP.k35.scaffold599 0007411 0033627 0035001 0016321 0007507 0035011 0007498 0045886 0030155 0030334 0045995 0034446 0005102 0030424 0005604 0030054 0036062 0008021

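I would like each number to appear on its own line together with its first-column identifier (i.e. SOAP... or TRINITY...), with " = " inserted between the identifier and the number on each line. I would also like to remove lines that have no numbers after the first-column identifier.

The result of processing the example text above would be: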

SOAP.k35.scaffold280 = 0003723
SOAP.k35.scaffold429 = 0004930
SOAP.k35.scaffold429 = 0016021
TRINITY_DN23171_c1_g1_i2 = 0006457
TRINITY_DN23171_c1_g1_i2 = 0005509
TRINITY_DN23171_c1_g1_i2 = 0030246

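...and so on.

My main problem is knowing how to store the first-column identifier so that I can insert it at the start of each new line while parsing the numbers on an input line.

Any help is greatly appreciated.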

bash parsing text awk grep
3 Answers
1
vote

Simply:

$ awk '{for(i=2;i<=NF;i++) print $1,"=",$i}' file

SOAP.k35.scaffold280 = 0003723
SOAP.k35.scaffold429 = 0004930
SOAP.k35.scaffold429 = 0016021
TRINITY_DN23171_c1_g1_i2 = 0006457
TRINITY_DN23171_c1_g1_i2 = 0005509
TRINITY_DN23171_c1_g1_i2 = 0030246
TRINITY_DN23171_c1_g1_i2 = 0051082
TRINITY_DN23171_c1_g1_i2 = 0005788
...
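
To spell out why the identifier does not need to be stored explicitly: awk keeps the first field of the current line in `$1` for as long as that line is being processed, so the loop can reuse it for every remaining field. Below is a minimal commented sketch of the same one-liner. The function name `expand_ids` is only an illustrative name and is not part of the original answer.

```shell
# expand_ids is a hypothetical helper name used only for this illustration.
# It reads whitespace-separated lines on stdin and prints "ID = value"
# once for every field after the first.
expand_ids() {
  awk '{
    # $1 holds the identifier for the whole current line.
    # NF is the number of fields; a line with only an identifier has NF == 1,
    # so the loop body never runs and nothing is printed for it.
    for (i = 2; i <= NF; i++) {
      print $1, "=", $i   # commas join the pieces with a single space (OFS)
    }
  }'
}

# Example run: the middle line has no numbers and is dropped.
printf '%s\n' \
  'SOAP.k35.scaffold280 0003723 ' \
  'SOAP.k35.scaffold421 ' \
  'SOAP.k35.scaffold429 0004930 0016021' | expand_ids
```

Because awk splits on runs of whitespace by default, the trailing spaces in the sample input do not produce empty fields.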

1
vote

Could you please try the following.

awk '(/^SOAP/ || /^TRINITY/){for(i=2;i<=NF;i++){print $1" = "$i}}' Input_file

If you do not want the awk program to be restricted to lines that start with the string SOAP or TRINITY, then try the following instead.

awk '{for(i=2;i<=NF;i++){print $1" = "$i}}' Input_file

The output will be as follows.

SOAP.k35.scaffold280 = 0003723
SOAP.k35.scaffold429 = 0004930
SOAP.k35.scaffold429 = 0016021
TRINITY_DN23171_c1_g1_i2 = 0006457
TRINITY_DN23171_c1_g1_i2 = 0005509
TRINITY_DN23171_c1_g1_i2 = 0030246
TRINITY_DN23171_c1_g1_i2 = 0051082
TRINITY_DN23171_c1_g1_i2 = 0005788
SOAP.k35.scaffold599 = 0007411
SOAP.k35.scaffold599 = 0033627
SOAP.k35.scaffold599 = 0035001
SOAP.k35.scaffold599 = 0016321
SOAP.k35.scaffold599 = 0007507
SOAP.k35.scaffold599 = 0035011
SOAP.k35.scaffold599 = 0007498
SOAP.k35.scaffold599 = 0045886
SOAP.k35.scaffold599 = 0030155
SOAP.k35.scaffold599 = 0030334
SOAP.k35.scaffold599 = 0045995
SOAP.k35.scaffold599 = 0034446
SOAP.k35.scaffold599 = 0005102
SOAP.k35.scaffold599 = 0030424
SOAP.k35.scaffold599 = 0005604
SOAP.k35.scaffold599 = 0030054
SOAP.k35.scaffold599 = 0036062
SOAP.k35.scaffold599 = 0008021
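
If the prefix filter is wanted but the prefixes may change, one option is to pass the pattern in as an awk variable rather than hard-coding it. This is only a sketch and not part of the original answer: the function name `expand_matching` and the variable name `pat` are made up for illustration, and it assumes the pattern is an extended regular expression matched against the first field.

```shell
# expand_matching is a hypothetical helper; its first argument is a regex
# that the identifier (first field) must match. Data is read from stdin.
expand_matching() {
  awk -v pat="$1" '$1 ~ pat {
    # Only lines whose identifier matches pat reach this block.
    for (i = 2; i <= NF; i++) {
      print $1 " = " $i
    }
  }'
}

# Example run: the OTHER.id line is filtered out by the pattern.
printf '%s\n' \
  'SOAP.k35.scaffold429 0004930 0016021' \
  'OTHER.id 0000001' \
  'TRINITY_DN23171_c1_g1_i2 0006457' | expand_matching '^(SOAP|TRINITY)'
```

Note that awk interprets backslash escapes in values given with `-v`, so a pattern containing backslashes may need them doubled.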

0
votes

You can also try Perl:

$ perl -ne ' ($x)=$_=~m/(^\S+)/; while( /\s(\d+)/g ) { print "$x = $1\n" } ' scottc.txt
SOAP.k35.scaffold280 = 0003723
SOAP.k35.scaffold429 = 0004930
SOAP.k35.scaffold429 = 0016021
TRINITY_DN23171_c1_g1_i2 = 0006457
TRINITY_DN23171_c1_g1_i2 = 0005509
TRINITY_DN23171_c1_g1_i2 = 0030246
TRINITY_DN23171_c1_g1_i2 = 0051082
TRINITY_DN23171_c1_g1_i2 = 0005788
SOAP.k35.scaffold599 = 0007411
SOAP.k35.scaffold599 = 0033627
SOAP.k35.scaffold599 = 0035001
SOAP.k35.scaffold599 = 0016321
SOAP.k35.scaffold599 = 0007507
SOAP.k35.scaffold599 = 0035011
. . . . . 
. . . . . 