I have a txt file like this:
GAACACGAAGGACGC
GAACACGAAGGACGC
GAACACGAAGGACGC
GAACACGAAGGACGC
GAACACGAAGGACGC
TCTAAGTAGTCAAAA
TCTAAGTAGTCAAAA
TCTAAGTAGTCAAAA
TCTAAGTAGTCAAAA
TCTAAGTAGTCAAAA
TCTAAGTAGTCAAAA
ACGGTGGGAATAAGA
ACGGTGGGAATAAGA
ACGGTGGGAATAAGA
ACGGTGGGAATAAGA
ACGGTGGGAATAAGA
GGGGCGATAATTTGC
GGGGCGATAATTTGC
GGGGCGATAATTTGC
GGGGCGATAATTTGC
GGGGCGATAATTTGC
GGGGCGATAATTTGC
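I want to filter out the sequences that are repeated six times and save them in a txt file. How can I do this in Python? Sorry if this is a silly question.
Thanks in advance.
What I have tried: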
ids = open('IDs.txt', 'r')
for id in ids:
    if id is ...
The biggest problem is that the txt file has more than 100k unique sequences, so I can't go through them one by one. That is why I'm stuck here.
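from collections import Counter

lst = []   # every sequence read from the input file
lst6 = []  # sequences that occur exactly six times

# Read all whitespace-separated sequences from the input file
with open(r'data.txt', 'r') as f:
    for l in f:
        lst.extend(l.split())

# Count the occurrences and keep the sequences seen exactly six times
for k, v in Counter(lst).items():
    if v == 6:
        lst6.append(k)

# Write the selected sequences, one per line
with open(r'data6.txt', 'w') as f:
    for x in lst6:
        f.write(x)
        f.write('\n')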
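Since the file has more than 100k unique sequences, it may help to skip the intermediate list and feed the lines straight into the Counter. The following is only a sketch of the same idea wrapped in a function. The function name `sequences_with_count` and the `target` parameter are my own for illustration, and it assumes one sequence per line.

```python
from collections import Counter


def sequences_with_count(in_path, out_path, target=6):
    """Write every sequence that occurs exactly `target` times and return them."""
    with open(in_path, "r") as f:
        # Count stripped, non-empty lines without building an intermediate list.
        counts = Counter(line.strip() for line in f if line.strip())

    # Counter keeps first-seen order (Python 3.7+), so output follows input order.
    selected = [seq for seq, n in counts.items() if n == target]

    with open(out_path, "w") as f:
        f.writelines(seq + "\n" for seq in selected)
    return selected
```

Calling `sequences_with_count('data.txt', 'data6.txt')` should give the same output file as the code above. If you want the opposite, that is, to drop the sequences repeated six times and keep the rest, change `n == target` to `n != target`.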