Reading a .txt file containing FASTA data into R


I have a .txt file containing amino acid sequences in FASTA format (there is blank space between the lines):

Protein_A|maker-scaffold10x_349_pilon-augustus-gene-0.9 MHIHIRRFRPFDEYWKDVFASRIDTTKSLVKACEKLKLKFFINASAVGVVLAKDSNFIKSL YPSHRVGLGGALGSGQQWISWVHLDDVVRVVQFLIQPDCPVVSGPVNVTAPHAVRQAALS RCLSEIGAPCLPFGAPPTPSFVPRLLLGPYRATLVLDGQRVIPQKLLDAGFQFKYALLE DALHAIYGRRPSPPA*

Protein_B|maker-scaffold10x_56_pilon-augustus-gene-0.90 MNRLLPVLSSAVLTRCVLHVQLRTIFTARTISRYPFQSLCDLPVRWKSKSKGKVQTTVLA RDQLPSELLEVIRADEMKTSFEATLARFQSALQQKLALKITPQMLADLTIPEARAAKLGQI ASLISQQEKHGASGQVTNQQLLIDLSARPHLVPAARKAVSQLLETTDHGSASLIESAGQ AAFTIRLRTVVTREAREELIHKGQEMLNQVKREMDRIYQTHDKLIPNTPEGKKHHSEDHL FAAREYLRSVVKSNHALAATIWEKKHAELSGP*

I would like to import them into RStudio as a data frame like this:

Name                                                    Sequence
Protein_A|maker-scaffold10x_349_pilon-augustus-gene-0.9 MHIHIRFRPFDEYWKDVFASRIDTTKSLVKACEKLKLKFFINASAVGVVLAKDSNFIKSLYPSHRVGLGGALGSGQQWISWVHLDDVVRVVQFLIQPDCPVVSGPVNVTAPHAVRQAALSRCLSEAIGAPCLPFGAPPTPSFVPRLLLGPYRATLVLDGQRVIPQKLLDAGFQFKYALLEDALHAIYGRRPSPPA*            
Protein_B|maker-scaffold10x_56_pilon-augustus-gene-0.90 MNRLLPVLSSAVLTRCVLHVQLRTIFTARTISRYPFQSLCDLPVRWKSKSKGKVQTTVLARDQLPSELLEVIRADEMKTSFEATLARFQSALQQKLALKITPQMLADLTIPEARAKLGQIASLISQQEKHGASGQVTNQQLLIDLSARPHLVPAARKAVSQLLETTDHGSASSLIESAGQAAFTIRLRTVVTREAREELIHKGQEMLNQVKREMDRIYQTHDKLIPNTPEGKKHHSEDHLFAAREYLRSVVKSNHALAATIWEKKHAELSGP*

Which package can I use?
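
For reference, here is a minimal base-R sketch of one way this could be done without any package. It rests on assumptions that the excerpt above does not confirm: records are separated by blank lines, the first whitespace-delimited token of each record is the name (with or without a leading `>`), and everything after it is sequence. `read_fasta_txt` is a hypothetical helper name, and the demo file holds made-up sequences rather than the ones from the question.

```r
# Hypothetical helper: parse blank-line-separated records into a data frame.
# Assumes the name contains no whitespace and comes first in each record.
read_fasta_txt <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))

  # Every blank line bumps the counter, so each record gets its own id.
  record_id <- cumsum(lines == "")
  keep <- lines != ""
  records <- split(lines[keep], record_id[keep])

  # Join the lines of a record, then split on any whitespace, so it does not
  # matter whether sequence chunks are separated by newlines or by spaces.
  tokens <- lapply(records, function(r) {
    strsplit(paste(r, collapse = " "), "\\s+")[[1]]
  })

  data.frame(
    Name = unname(vapply(tokens, function(tok) sub("^>", "", tok[1]),
                         character(1))),
    Sequence = unname(vapply(tokens, function(tok) paste(tok[-1], collapse = ""),
                             character(1))),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Usage with a small made-up file (not the sequences from the question).
tmp <- tempfile(fileext = ".txt")
writeLines(c("seq_1|demo", "MKT", "AYI*", "", "seq_2|demo MNR LLP*"), tmp)
df <- read_fasta_txt(tmp)
print(df)
```

If the file is standard FASTA with `>` header lines, dedicated readers such as `Biostrings::readAAStringSet()` or `seqinr::read.fasta(..., seqtype = "AA", as.string = TRUE)` are probably a better fit, but whether they accept this particular file as-is is not something the excerpt settles.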

r dataframe txt fasta