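I want to write a Python script that scans a file until it finds a given word, then deletes every line from that word onward until it reaches the next given word, as shown here.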
Line1
Line2
Line3
Key-Word
Line4
Line5
Key-Word2
Line6
Line7
The result should be:
Line1
Line2
Line3
Key-Word2
Line6
Line7
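So far I have managed to get it to detect the keyword, but I don't know how to make it delete all the lines and then carry on from Key-Word2.
Read the file's lines, then write them back, skipping the lines from the start key up to (but not including) the stop key. Here is an example.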
def erase(file_name: str, start_key: str, stop_key: str):
    """
    Delete all lines from the given start_key up to the stop_key
    (start_key included, stop_key excluded).
    """
    try:
        # read the file lines
        with open(file_name, 'r') as fr:
            lines = fr.readlines()
        # write the lines back, except those from start_key until stop_key
        with open(file_name, 'w') as fw:
            # flag that controls deletion
            delete = False
            # iterate over the file lines
            for line in lines:
                # a line equal to start_key turns deletion on
                if line.strip('\n') == start_key:
                    delete = True
                # a line equal to stop_key turns deletion off
                elif line.strip('\n') == stop_key:
                    delete = False
                # write the line back only while delete is False;
                # otherwise the line is skipped
                if not delete:
                    fw.write(line)
    # open() and file I/O failures raise OSError, not RuntimeError
    except OSError as ex:
        print(f"erase error:\n\t{ex}")
Usage:
erase('file.txt', 'Key-Word', 'Key-Word2')
file.txt (input):
Line1
Line2
Line3
Key-Word
Line4
Line5
Key-Word2
Line6
Line7
After running the function:
Line1
Line2
Line3
Key-Word2
Line6
Line7
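The filtering logic can also be separated from the file handling, which makes it easy to check on a plain list of lines. This is only a minimal sketch, assuming each keyword sits alone on its line; the helper name `drop_between` is hypothetical and not part of the answer above.

```python
from typing import Iterable, Iterator


def drop_between(lines: Iterable[str], start_key: str, stop_key: str) -> Iterator[str]:
    """Yield lines, skipping from start_key (inclusive) up to stop_key (exclusive)."""
    deleting = False
    for line in lines:
        # compare without the trailing newline that readlines() keeps
        text = line.rstrip("\r\n")
        if text == start_key:
            deleting = True
        elif text == stop_key:
            deleting = False
        if not deleting:
            yield line


# hypothetical sample data mirroring the example file
sample = ["Line1\n", "Key-Word\n", "Line4\n", "Key-Word2\n", "Line6\n"]
result = list(drop_between(sample, "Key-Word", "Key-Word2"))
print(result)
```

Because the function is a generator, it could also be fed an open file object directly instead of a list.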
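Ideally, you should open the file twice: once to read the lines and once to write them. If you try to read and write at the same time inside your for loop and something fails partway through, you may end up with a partially modified file.
You also need to be careful to strip the trailing newline before comparing, because readlines() keeps the "\n" character at the end of each string.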
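keyword_found = False
# first pass: read every line into memory
with open("line_file.txt", "r") as f:
    lines = f.readlines()
# second pass: write back only the lines outside the keyword range
with open("line_file.txt", "w") as f:
    for raw_line in lines:
        # drop the trailing newline so the comparison matches
        line = raw_line.strip("\n")
        if line == 'you':
            keyword_found = True
        if line == 'friend':
            keyword_found = False
        if not keyword_found:
            f.write(line + "\n")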
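One way to reduce the risk of a partially modified file is to write the result to a temporary file and swap it in only after the write has succeeded. The following is a rough sketch rather than a definitive implementation: the function name `erase_safely` is hypothetical, and it assumes the keywords each occupy a whole line and that the file uses the default text encoding.

```python
import os
import tempfile


def erase_safely(file_name: str, start_key: str, stop_key: str) -> None:
    """Rewrite file_name without the lines from start_key up to (not including) stop_key."""
    directory = os.path.dirname(os.path.abspath(file_name))
    # create the temporary file next to the original so os.replace
    # does not have to cross filesystems
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as dst, open(file_name, "r") as src:
            deleting = False
            for line in src:
                text = line.rstrip("\n")
                if text == start_key:
                    deleting = True
                elif text == stop_key:
                    deleting = False
                if not deleting:
                    dst.write(line)
        # swap in the new file only after it was written completely
        os.replace(tmp_path, file_name)
    except BaseException:
        # remove the temporary file and leave the original untouched
        os.remove(tmp_path)
        raise
```

Note that `tempfile.mkstemp` creates the file with restrictive permissions, so the rewritten file may not keep the original's mode unless you copy it over yourself.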