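I have a folder "A" containing many files (say, 100). I want to open all of these files (they are all text files) and count how many times the phrase "virtual memory" appears across them [either the total, or the count for each file]. I tried the approach below, but could not get it to work.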
import os

path = 'MY_PATH'
count=0
filecount=0
files = []
# r=root, d=directories, f = files
for r, d, f in os.walk(path):
    for file in f:
        files.append(os.path.join(r, file))
        print(files)
        for fileList in files:
            with open(fileList, "r") as f:
                # text = f.read()
                # print(len(text))
                print('OPENING FILE: ',f)
                for word in f:
                    #print(word)
                    if(word == 'virtual memory'):
                        print('WORD FOUND')
                        count+=1
print("COUNT : ", count)
Is there a quick script I can use for this, or do I need to make some corrections to mine? Thanks in advance.
You can easily build the list of files with the os module, like this:
listfiles = os.listdir('path/to/files/')
Then you can loop over that list and read each file's lines in one go, instead of iterating over the open file, like this:
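count = []
for file in os.listdir('path/to/files/'):
    # os.listdir returns bare names, so join each one with the folder path
    with open(os.path.join('path/to/files/', file)) as f:
        lines = f.readlines()
    # add up the occurrences found on each line of this file
    count.append(sum(line.count('virtual memory') for line in lines))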
This way, the list count holds the number of occurrences of the string "virtual memory" in each file.
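If you want to know which count belongs to which file, a dictionary keyed by file name may be easier to read than a plain list. Here is a minimal sketch of that idea. The function name count_per_file and the errors="ignore" setting are my own assumptions, not something from the question, and it only looks at the top level of the folder.

```python
import os


def count_per_file(folder, phrase="virtual memory"):
    """Return {file name: number of times phrase occurs in that file}."""
    counts = {}
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        if not os.path.isfile(full_path):
            continue  # os.listdir also returns sub-directories; skip them
        # Assumption: ignore undecodable bytes instead of raising an error
        with open(full_path, "r", errors="ignore") as f:
            counts[name] = f.read().count(phrase)
    return counts
```

Calling count_per_file('path/to/files/') gives the per-file counts, and sum(...) over the returned dictionary's values gives the overall total.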
The loop you wrote, for word in f, is actually a loop over lines. When you iterate over an open file, you get its lines one at a time, not its words.
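To see why the equality test in the question never succeeds, here is a small illustration. It uses io.StringIO as a stand-in for an opened text file (my choice, so that the example needs no file on disk), and the sample text is made up.

```python
import io

# io.StringIO stands in for an opened text file here
f = io.StringIO("this line mentions virtual memory\nvirtual memory\n")

items = list(f)  # iterating a file object yields whole lines, newline included

# Neither line equals the phrase: the first has extra words,
# and the second still ends with "\n"
exact_matches = sum(item == "virtual memory" for item in items)

# str.count finds the phrase anywhere inside each line
substring_matches = sum(item.count("virtual memory") for item in items)

print(items)
print(exact_matches, substring_matches)
```

The comparison with == finds nothing, while counting the substring inside each line finds both occurrences.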
Use str.count on the contents of each txt file to count the occurrences. Here is a simple implementation that does this:
import os

path = 'MY_PATH'
count = 0
for root, dirs, files in os.walk(path):
    for file in files:
        num = 0
        with open(os.path.join(root, file), "r") as f:
            f_reader = f.read()
            team = 'virtual memory'
            num = f_reader.count(team)
            count += num
        print('OPENING FILE: ', file, ' - Count:', num)
print("COUNT : ", count)
Try this:
import os

path = 'MY_PATH'
count = 0
files = []
for r, d, f in os.walk(path):
    for file in f:
        files.append(os.path.join(r, file))
# print(files)
# moving this for loop outside
# previously you were visiting each file more than once
for fileList in files:
    with open(fileList, "r") as f:
        print('OPENING FILE: ', f)
        lines = []
        for line in f:
            lines.extend(line.strip().split(" "))
    # compare each word with the one that follows it
    for idx in range(len(lines)-1):
        if lines[idx] == 'virtual' and lines[idx+1] == 'memory':
            count += 1
print("COUNT : ", count)
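One caveat with comparing split words exactly: a token such as "memory." (with punctuation) or "Virtual" (capitalized) will not match. If that matters for your files, a regular expression is one possible alternative. This is a different technique from the one above, and whether matching should ignore case is my assumption about what you want. The names PATTERN and count_phrase are just for illustration.

```python
import re

# \s+ matches any run of whitespace, including a newline between the two words
PATTERN = re.compile(r"virtual\s+memory", re.IGNORECASE)


def count_phrase(text):
    """Count the phrase, ignoring case and the kind of whitespace between the words."""
    return len(PATTERN.findall(text))
```

You would call count_phrase(f.read()) for each file and add the results to count.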
Your script fails because word is actually a whole line, not a word. The following works:
with open(fileList, "r") as f:
    for sentence in f:
        count += sentence.count('virtual memory')