Renaming FASTA scaffolds using a mapping txt file

I would like to rename my scaffolds in unix using a scaffold_mapping.txt file, where the .txt file looks like this:

$ head scaffold_mapping.txt 
>#ID_covAvg_fold_lengthLength
>scaffold_1_c1_cov61.3780_length417825
>scaffold_3_c1_cov45.0025_length77714
>scaffold_4_c1_cov84.2432_length70007
>scaffold_5_c2_cov57.6219_length67890
>scaffold_6_c1_cov331.1665_length65908
>scaffold_7_c1_cov138.5574_length64984
>scaffold_9_c1_cov77.1170_length59223
>scaffold_2_c2_cov51.1554_length55365
>scaffold_11_c1_cov44.1476_length53538

Each scaffold in the fasta file is currently named like this:

> scaffold_1_c1

I would like the names to match the scaffold_mapping.txt file, so that the previous example becomes:

> scaffold_1_c1_cov61.3780_length417825

I hoped this would be easy with sed, but the ">" complicates things:

$ sed -f scaffold_mapping1.txt assembly.contigs.fasta > output1.fasta
sed: file scaffold_mapping1.txt line 1: unknown command: `>'
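
The error arises because `sed -f` reads its argument as a script of sed commands, and a line starting with `>` is not a sed command. One possible workaround, sketched here rather than offered as a definitive fix, is to generate a sed script from the mapping file first. The sketch assumes that headers look exactly like `> scaffold_1_c1` (with one optional space and nothing after the name), that every mapped name contains `_cov`, and that names contain no characters special to sed. The file `rename.sed` and the sample sequences are made up for illustration.

```shell
# Work in a scratch directory so real files are not overwritten.
cd "$(mktemp -d)"

# Small sample inputs modeled on the question (sequence lines are invented).
printf '%s\n' '>#ID_covAvg_fold_lengthLength' \
    '>scaffold_1_c1_cov61.3780_length417825' \
    '>scaffold_3_c1_cov45.0025_length77714' > scaffold_mapping.txt
printf '%s\n' '> scaffold_1_c1' 'ACGT' '> scaffold_3_c1' 'TTGA' > assembly.contigs.fasta

# Turn every mapping line (except the header) into one sed substitution,
# e.g.  s/^> *scaffold_1_c1$/> scaffold_1_c1_cov61.3780_length417825/
awk 'NR > 1 {
    new = substr($0, 2)        # full name without the leading ">"
    old = new
    sub(/_cov.*/, "", old)     # short name currently used in the FASTA
    printf "s/^> *%s$/> %s/\n", old, new
}' scaffold_mapping.txt > rename.sed

# Now sed -f receives real sed commands.
sed -f rename.sed assembly.contigs.fasta > output1.fasta
cat output1.fasta
```

Because sed tries every substitution on every line, this approach is likely to be slow for assemblies with many thousands of scaffolds; a single-pass awk lookup avoids that.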

Thank you for your patience, as I am very new to unix.

unix sed rename fasta scaffold
1 Answer

As best I can tell, this is what you want:

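$ awk '
    NR==FNR {                                   # first file: the mapping
        if (NR>1) {                             # skip the header line
            sub(/^>/,"")                        # drop the leading ">"
            match($0,/([^_]+_){3}/)             # first three "_"-terminated fields
            map[substr($0,1,RLENGTH-1)] = $0    # short name -> full name
        }
        next
    }
    /^>/ && ($2 in map) { $2=map[$2] }          # header line: swap in the full name
    { print }
' scaffold_mapping.txt assembly.contigs.fasta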
> scaffold_1_c1_cov61.3780_length417825
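
The script above relies on a space between `>` and the name, because it looks the name up in `$2`. Many FASTA files write headers without that space (`>scaffold_1_c1`). The following is a hedged variant, not part of the original answer, that accepts both forms. It assumes every mapped name contains `_cov` and that the header holds nothing but the name. The sample sequences, the unmapped `scaffold_99_c1` entry and the output name `renamed.fasta` are invented for the demonstration.

```shell
# Work in a scratch directory so real files are not overwritten.
cd "$(mktemp -d)"

# Sample inputs: one header without a space, one with, one that is not in the mapping.
printf '%s\n' '>#ID_covAvg_fold_lengthLength' \
    '>scaffold_1_c1_cov61.3780_length417825' \
    '>scaffold_3_c1_cov45.0025_length77714' > scaffold_mapping.txt
printf '%s\n' '>scaffold_1_c1' 'ACGT' '> scaffold_3_c1' 'TTGA' \
    '>scaffold_99_c1' 'GGCC' > assembly.contigs.fasta

awk '
    NR==FNR {                        # first file: the mapping
        if (FNR > 1) {               # skip the header line
            full = substr($0, 2)     # drop the leading ">"
            key = full
            sub(/_cov.*$/, "", key)  # short name, e.g. scaffold_1_c1
            map[key] = full
        }
        next
    }
    /^>/ {                           # FASTA header line
        name = $0
        sub(/^>[ \t]*/, "", name)    # strip ">" and any blanks after it
        if (name in map) $0 = ">" map[name]
    }
    { print }                        # sequence lines and unmapped headers pass through
' scaffold_mapping.txt assembly.contigs.fasta > renamed.fasta

cat renamed.fasta
```

Renamed headers are written without a space after `>`, and headers that have no entry in the mapping are left unchanged, so it is easy to spot scaffolds the mapping did not cover.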