I would like to rename my scaffolds in Unix using a scaffold_mapping.txt file, where the .txt file looks like this:
$ head scaffold_mapping.txt
>#ID_covAvg_fold_lengthLength
>scaffold_1_c1_cov61.3780_length417825
>scaffold_3_c1_cov45.0025_length77714
>scaffold_4_c1_cov84.2432_length70007
>scaffold_5_c2_cov57.6219_length67890
>scaffold_6_c1_cov331.1665_length65908
>scaffold_7_c1_cov138.5574_length64984
>scaffold_9_c1_cov77.1170_length59223
>scaffold_2_c2_cov51.1554_length55365
>scaffold_11_c1_cov44.1476_length53538
Each scaffold in the fasta file is currently named like this:
> scaffold_1_c1
I would like the names to match those in the scaffold_mapping.txt file, so that the previous example becomes:
> scaffold_1_c1_cov61.3780_length417825
I was hoping this would be easy with sed, but the ">" complicates things:
$ sed -f scaffold_mapping1.txt assembly.contigs.fasta > output1.fasta
sed: file scaffold_mapping1.txt line 1: unknown command: `>'
Thank you for your patience, as I am very new to Unix.
As best I can tell, this is what you want:
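$ awk '
    # First file (the mapping): build short name -> full name
    NR==FNR {
        if (NR>1) {                             # skip the header line
            sub(/^>/,"")                        # drop the leading ">"
            match($0,/([^_]+_){3}/)             # first three "_"-terminated fields
            map[substr($0,1,RLENGTH-1)] = $0    # key is e.g. scaffold_1_c1
        }
        next
    }
    # Second file (the FASTA): swap in the full name on header lines
    /^>/ && ($2 in map) { $2=map[$2] }
    { print }
' scaffold_mapping.txt file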
> scaffold_1_c1_cov61.3780_length417825
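As for the sed error in the question: sed -f expects a file of sed commands, so it tried to run the first line of the mapping file as a command and stopped at ">". If you would rather stay with sed, one possible sketch is to generate the sed script from the mapping file first. This assumes the FASTA headers have a space after ">" as shown in the question. The scratch directory, the two-record sample data, the sequence lines, and the rename.sed name are made up for illustration.

```shell
# Toy inputs in a scratch directory (hypothetical sample data, same layout as in the question).
tmp=$(mktemp -d)
cat > "$tmp/scaffold_mapping.txt" <<'EOF'
>#ID_covAvg_fold_lengthLength
>scaffold_1_c1_cov61.3780_length417825
>scaffold_11_c1_cov44.1476_length53538
EOF
cat > "$tmp/assembly.contigs.fasta" <<'EOF'
> scaffold_1_c1
ACGT
> scaffold_11_c1
TTGA
EOF

# Turn each mapping line into one sed substitution command, skipping the header line.
awk 'NR > 1 {
    sub(/^>/, "")                       # drop the leading ">"
    split($0, p, "_")                   # p[1], p[2], p[3] form the short name
    short = p[1] "_" p[2] "_" p[3]
    print "s/^> " short "$/> " $0 "/"   # anchored so scaffold_1_c1 cannot match scaffold_1_c10
}' "$tmp/scaffold_mapping.txt" > "$tmp/rename.sed"

# Now the file passed to -f really is a sed script.
sed -f "$tmp/rename.sed" "$tmp/assembly.contigs.fasta" > "$tmp/output1.fasta"
cat "$tmp/output1.fasta"
```

With thousands of scaffolds this runs every substitution against every line, so the awk version above is likely to be faster on a large assembly.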