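I have a list that contains both directories and files. I only want to keep the files, so I wrote the code below to filter it. However, some entries that should have been removed are still in the list, for example 'wangshx/' and the empty string ''.
Based on this result, I suspect something is wrong with my if statement. Can anyone point out where the problem is?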
In [7]: filelist = ['/tmp/test2.pbs', '/tmp/test.pbs', '/public/home/wangshx:',
   ...:     ' ', 'correct_order.txt', 'download/', 'filepaths.RData', 'lib/',
   ...:     'Log.out', 'ncbi_error_report.xml', 'new_hg19.1.bt2',
   ...:     'new_hg19.2.bt2', 'new_hg19.3.bt2', 'new_hg19.4.bt2', 'new_hg19.fa',
   ...:     'new_hg19.rev.1.bt2', 'new_hg19.rev.2.bt2', 'perl5/', 'practice/',
   ...:     'repnames_nfragments.txt', 'soft/', 'songmf/', 'sort.pbs',
   ...:     'test.pbs', 'test.pbs.o1167575', 'test.pbs.o1167590', 'tmp/',
   ...:     'wangshx/', 'workspace/', 'wt/', 'wx/', '']
In [8]: len(filelist)
Out[8]: 32
In [9]: for f in filelist:
   ...:     print(f)
   ...:
/tmp/test2.pbs
/tmp/test.pbs
/public/home/wangshx:
correct_order.txt
download/
filepaths.RData
lib/
Log.out
ncbi_error_report.xml
new_hg19.1.bt2
new_hg19.2.bt2
new_hg19.3.bt2
new_hg19.4.bt2
new_hg19.fa
new_hg19.rev.1.bt2
new_hg19.rev.2.bt2
perl5/
practice/
repnames_nfragments.txt
soft/
songmf/
sort.pbs
test.pbs
test.pbs.o1167575
test.pbs.o1167590
tmp/
wangshx/
workspace/
wt/
wx/
In [10]: for f in filelist:
    ...:     print(f)
    ...:     if f[-1]=='/' or f[-1]==':' or f=='' or f==' ':
    ...:         print("=> Should remove " + f)
    ...:         filelist.remove(f)
    ...:
/tmp/test2.pbs
/tmp/test.pbs
/public/home/wangshx:
=> Should remove /public/home/wangshx:
correct_order.txt
download/
=> Should remove download/
lib/
=> Should remove lib/
ncbi_error_report.xml
new_hg19.1.bt2
new_hg19.2.bt2
new_hg19.3.bt2
new_hg19.4.bt2
new_hg19.fa
new_hg19.rev.1.bt2
new_hg19.rev.2.bt2
perl5/
=> Should remove perl5/
repnames_nfragments.txt
soft/
=> Should remove soft/
sort.pbs
test.pbs
test.pbs.o1167575
test.pbs.o1167590
tmp/
=> Should remove tmp/
workspace/
=> Should remove workspace/
wx/
=> Should remove wx/
In [11]: filelist
Out[11]:
['/tmp/test2.pbs',
'/tmp/test.pbs',
' ',
'correct_order.txt',
'filepaths.RData',
'Log.out',
'ncbi_error_report.xml',
'new_hg19.1.bt2',
'new_hg19.2.bt2',
'new_hg19.3.bt2',
'new_hg19.4.bt2',
'new_hg19.fa',
'new_hg19.rev.1.bt2',
'new_hg19.rev.2.bt2',
'practice/',
'repnames_nfragments.txt',
'songmf/',
'sort.pbs',
'test.pbs',
'test.pbs.o1167575',
'test.pbs.o1167590',
'wangshx/',
'wt/',
'']
Best,
Shixiang
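The problem is that you are modifying the list while iterating over it. Each call to remove() shifts the remaining items one position to the left, so the loop skips the element that immediately follows every removed one. Use a list comprehension instead. It isn't clear exactly what your filter requirements are, but as an example, the following builds a new list that leaves out everything ending with a slash: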
filelist = ['/tmp/test2.pbs', '/tmp/test.pbs', '/public/home/wangshx:', ' ', 'correct_order.txt', 'download/',
'filepaths.RData', 'lib/', 'Log.out', 'ncbi_error_report.xml', 'new_hg19.1.bt2', 'new_hg19.2.bt2',
'new_hg19.3.bt2', 'new_hg19.4.bt2', 'new_hg19.fa', 'new_hg19.rev.1.bt2', 'new_hg19.rev.2.bt2',
'perl5/', 'practice/', 'repnames_nfragments.txt', 'soft/', 'songmf/', 'sort.pbs', 'test.pbs',
'test.pbs.o1167575', 'test.pbs.o1167590', 'tmp/', 'wangshx/', 'workspace/', 'wt/', 'wx/', '']
files = [file for file in filelist if not file.endswith('/')]
print(files)
Output:
['/tmp/test2.pbs', '/tmp/test.pbs', '/public/home/wangshx:', ' ', 'correct_order.txt', 'filepaths.RData', 'Log.out', 'ncbi_error_report.xml', 'new_hg19.1.bt2', 'new_hg19.2.bt2', 'new_hg19.3.bt2', 'new_hg19.4.bt2', 'new_hg19.fa', 'new_hg19.rev.1.bt2', 'new_hg19.rev.2.bt2', 'repnames_nfragments.txt', 'sort.pbs', 'test.pbs', 'test.pbs.o1167575', 'test.pbs.o1167590', '']
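As a rough sketch of how this could be extended to the full condition from the question (trailing '/', trailing ':', empty and whitespace-only strings), the snippet below first reproduces the skipping behaviour on a small made-up list and then filters with a comprehension. The helper name keep_files and the sample entries are assumptions for illustration, not part of the original post. It uses str.endswith with a tuple rather than f[-1], because indexing an empty string with [-1] raises IndexError.

```python
sample = ['a.txt', 'lib/', 'tmp/', 'b.txt', ' ', '']

# Pattern from the question: removing items while iterating over the same list.
# Once 'lib/' is removed, 'tmp/' slides into its slot and the loop moves on
# to the next index, so 'tmp/' is never examined.
buggy = list(sample)
for f in buggy:
    if f.endswith('/'):
        buggy.remove(f)
print(buggy)  # ['a.txt', 'tmp/', 'b.txt', ' ', '']


def keep_files(entries):
    """Return a new list without directory-like or blank entries (hypothetical helper)."""
    return [
        entry for entry in entries
        if entry.strip()                    # drops '' and whitespace-only strings
        and not entry.endswith(('/', ':'))  # drops 'lib/' and '/public/home/wangshx:'
    ]


print(keep_files(sample))  # ['a.txt', 'b.txt']
```

Because the comprehension builds a new list, the original is never mutated during iteration. If the list has to be updated in place, assigning the result back with filelist[:] = keep_files(filelist) is one option.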
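Judging from your list, it looks like it could be as simple as checking whether each string contains the character '.'.
Something like this: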
filelist = ['/tmp/test2.pbs', '/tmp/test.pbs', '/public/home/wangshx:', ' ', 'correct_order.txt', 'download/',
            'filepaths.RData', 'lib/', 'Log.out', 'ncbi_error_report.xml', 'new_hg19.1.bt2', 'new_hg19.2.bt2',
            'new_hg19.3.bt2', 'new_hg19.4.bt2', 'new_hg19.fa', 'new_hg19.rev.1.bt2', 'new_hg19.rev.2.bt2',
            'perl5/', 'practice/', 'repnames_nfragments.txt', 'soft/', 'songmf/', 'sort.pbs', 'test.pbs',
            'test.pbs.o1167575', 'test.pbs.o1167590', 'tmp/', 'wangshx/', 'workspace/', 'wt/', 'wx/', '']

for f in filelist:
    if '.' in f:
        print(f)
    else:
        print("=> Should remove " + f)
This outputs:
/tmp/test2.pbs
/tmp/test.pbs
=> Should remove /public/home/wangshx:
=> Should remove
correct_order.txt
=> Should remove download/
filepaths.RData
=> Should remove lib/
Log.out
ncbi_error_report.xml
new_hg19.1.bt2
new_hg19.2.bt2
new_hg19.3.bt2
new_hg19.4.bt2
new_hg19.fa
new_hg19.rev.1.bt2
new_hg19.rev.2.bt2
=> Should remove perl5/
=> Should remove practice/
repnames_nfragments.txt
=> Should remove soft/
=> Should remove songmf/
sort.pbs
test.pbs
test.pbs.o1167575
test.pbs.o1167590
=> Should remove tmp/
=> Should remove wangshx/
=> Should remove workspace/
=> Should remove wt/
=> Should remove wx/
=> Should remove
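The loop above only prints its decision. A minimal sketch of the same dot check that actually builds the filtered list might look like the following. The helper name keep_dotted and the entries 'README' and 'conf.d/' are hypothetical, added only to show where the heuristic can go wrong: it works for this particular listing, but a file without a dot is dropped and a directory whose name contains a dot is kept.

```python
def keep_dotted(entries):
    """Keep only entries that contain a '.' (hypothetical helper, heuristic only)."""
    return [entry for entry in entries if '.' in entry]


entries = ['/tmp/test.pbs', '/public/home/wangshx:', ' ', 'lib/',
           'Log.out', 'new_hg19.fa', 'wangshx/', '']
files = keep_dotted(entries)
print(files)  # ['/tmp/test.pbs', 'Log.out', 'new_hg19.fa']

# Made-up names that show the limits of the heuristic:
# 'README' is a plausible file name but is dropped,
# 'conf.d/' looks like a directory but is kept.
print(keep_dotted(['README', 'conf.d/']))  # ['conf.d/']
```

If names like these can occur in the listing, checking the trailing '/' or ':' is likely the safer test.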