How do I replace variable-length runs of the same character in bash?


I'm fairly sure this has already been answered, but I couldn't find it, so if it has, please link it in the comments.

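Otherwise, the problem I need to solve is this: how do I replace a sequence of one or more occurrences of the same character with a single character, when the maximum length of the sequence is not known, so that the fields end up separated by a known delimiter?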

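Also, in my specific case the character I have to replace is `*`, but I could preprocess the file to swap it for a character that is easier to handle.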

Here is a very poor solution that assumes the maximum length of the pattern is known, which of course it is not.

cat example_file.txt | sed 's/\*\*\*\*\*\*\*\*/_/g' | sed 's/\*\*\*\*\*\*\*/_/g' | sed 's/\*\*\*\*\*\*/_/g' | sed 's/\*\*\*\*\*/_/g' | sed 's/\*\*\*\*/_/g' | sed 's/\*\*\*/_/g' | sed 's/\*\*/_/g' | sed 's/\*/_/g' > clean_file.txt

`example_file.txt` contains something like this:

>SH1111056.09FU|KC881085_refs|k__Fungi;p__Ascomycota;c__Sordariomycetes;o__Hypocreales;f__Clavicipitaceae;g__Neotyphodium;s__Neotyphodium_siegelii;|foliar_endophyte*litter_saprotroph*class1_clavicipitaceous_endophyte**leaf/fruit/seed**non-aquatic*arthropod-associated*filamentous_mycelium******
>SH1115797.09FU|UDB031565_refs|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Hymenochaetales;f__Hymenochaetaceae;g__Fomitiporia;s__Fomitiporia_hippophaeicola;|plant_pathogen*wood_saprotroph**wood_pathogen*wood*white_rot*non-aquatic**filamentous_mycelium*polyporoid*poroid****
>SH0879139.09FU|KF945456|k__Viridiplantae;p__Anthophyta;c__Eudicotyledonae;o__Lamiales;f__Acanthaceae;g__Ruellia;s__Ruellia_brandbergensis;|ND**************
>SH0991532.09FU|UDB07658019|k__Fungi;p__Ascomycota;c__Dothideomycetes;o__Venturiales;f__Venturiaceae;g__Sympodiella;s__Sympodiella_sp;|litter_saprotroph****leaf/fruit/seed**non-aquatic**filamentous_mycelium******
>SH0991546.09FU|UDB07657573|k__Fungi;p__Ascomycota;c__Dothideomycetes;o__Venturiales;f__Venturiaceae;g__Sympodiella;s__Sympodiella_sp;|litter_saprotroph****leaf/fruit/seed**non-aquatic**filamentous_mycelium******

Edit:

Assuming `*` is replaced with `_`, the expected output would look like this:

>SH1111056.09FU|KC881085_refs|k__Fungi;p__Ascomycota;c__Sordariomycetes;o__Hypocreales;f__Clavicipitaceae;g__Neotyphodium;s__Neotyphodium_siegelii;|foliar_endophyte_litter_saprotroph_class1_clavicipitaceous_endophyte_leaf/fruit/seed_non-aquatic_arthropod-associated_filamentous_mycelium_
>SH1115797.09FU|UDB031565_refs|k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Hymenochaetales;f__Hymenochaetaceae;g__Fomitiporia;s__Fomitiporia_hippophaeicola;|plant_pathogen_wood_saprotroph_wood_pathogen_wood_white_rot_non-aquatic_filamentous_mycelium_polyporoid_poroid_
>SH0879139.09FU|KF945456|k__Viridiplantae;p__Anthophyta;c__Eudicotyledonae;o__Lamiales;f__Acanthaceae;g__Ruellia;s__Ruellia_brandbergensis;|ND_
>SH0991532.09FU|UDB07658019|k__Fungi;p__Ascomycota;c__Dothideomycetes;o__Venturiales;f__Venturiaceae;g__Sympodiella;s__Sympodiella_sp;|litter_saprotroph_leaf/fruit/seed_non-aquatic_filamentous_mycelium_
>SH0991546.09FU|UDB07657573|k__Fungi;p__Ascomycota;c__Dothideomycetes;o__Venturiales;f__Venturiaceae;g__Sympodiella;s__Sympodiella_sp;|litter_saprotroph_leaf/fruit/seed_non-aquatic_filamentous_mycelium_
Tags: bash, sed, substitution

1 Answer

Try this:

cat example_file.txt | tr -s '*' '_' > clean_file.txt

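Output (note that `tr -s` also squeezes runs of the replacement character, so existing double underscores such as `k__Fungi` become `k_Fungi`, which differs from the expected output in the question):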

>SH1111056.09FU|KC881085_refs|k_Fungi;p_Ascomycota;c_Sordariomycetes;o_Hypocreales;f_Clavicipitaceae;g_Neotyphodium;s_Neotyphodium_siegelii;|foliar_endophyte_litter_saprotroph_class1_clavicipitaceous_endophyte_leaf/fruit/seed_non-aquatic_arthropod-associated_filamentous_mycelium_
>SH1115797.09FU|UDB031565_refs|k_Fungi;p_Basidiomycota;c_Agaricomycetes;o_Hymenochaetales;f_Hymenochaetaceae;g_Fomitiporia;s_Fomitiporia_hippophaeicola;|plant_pathogen_wood_saprotroph_wood_pathogen_wood_white_rot_non-aquatic_filamentous_mycelium_polyporoid_poroid_
>SH0879139.09FU|KF945456|k_Viridiplantae;p_Anthophyta;c_Eudicotyledonae;o_Lamiales;f_Acanthaceae;g_Ruellia;s_Ruellia_brandbergensis;|ND_
>SH0991532.09FU|UDB07658019|k_Fungi;p_Ascomycota;c_Dothideomycetes;o_Venturiales;f_Venturiaceae;g_Sympodiella;s_Sympodiella_sp;|litter_saprotroph_leaf/fruit/seed_non-aquatic_filamentous_mycelium_
>SH0991546.09FU|UDB07657573|k_Fungi;p_Ascomycota;c_Dothideomycetes;o_Venturiales;f_Venturiaceae;g_Sympodiella;s_Sympodiella_sp;|litter_saprotroph_leaf/fruit/seed_non-aquatic_filamentous_mycelium_
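
If the double underscores in the taxonomy fields (for example `k__Fungi`) need to survive, as they do in the expected output in the question, one possible alternative is to let `sed` match a run of one or more asterisks instead of using `tr -s`. The following is only a minimal sketch: the function name `collapse_stars` and the sample line are made up for illustration and are not part of the original answer.

```shell
#!/usr/bin/env bash

# Collapse every run of one or more '*' into a single '_',
# leaving underscores that are already in the input untouched.
collapse_stars() {
    # \*      matches a literal asterisk
    # \{1,\}  is a POSIX basic-regex interval meaning "one or more"
    sed 's/\*\{1,\}/_/g'
}

# Shortened sample line modelled on the data in the question
printf '%s\n' 'k__Fungi;|ND**wood*poroid****' | collapse_stars
# prints: k__Fungi;|ND_wood_poroid_
```

For the real file, the same function can be applied as `collapse_stars < example_file.txt > clean_file.txt`. With a `sed` that supports extended regular expressions via `-E`, the expression can be shortened to `sed -E 's/\*+/_/g'`.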