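I am trying to convert a text file into another, better-organized version of the same file. So far, in the first function, I have been able to add line breaks roughly where I want them. It takes an input file and writes an output file.
In the second function, I use the output of the first function as input. I am trying to use the fileinput library, but every time I run the code the output file is blank. I also tried simply replacing text, and I still get a blank text file. What am I missing here?
Code: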
import re

def setup():
    input = open(r'C:\Users\Phillipos Admasu\Documents\India\india_work.txt', 'r+')
    output = open(r'C:\Users\Phillipos Admasu\Documents\India\india_work_final.txt', 'w+')
    for line in input:
        output.write(re.sub('\s', '\n', line))  # This makes a line break after every Data unit.
    input.close()
    output.close()

def edits():
    input = open(r'C:\Users\Phillipos Admasu\Documents\India\india_work_final.txt', 'w+')
    output2 = open(r'C:\Users\Phillipos Admasu\Documents\India\india_work_final.txt', 'w+')
    for line in input:
        output2.write(re.sub('blue', 'TEST', line))
    input.close()
    output2.close()

if __name__ == "__main__":
    setup()
    edits()
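Welcome to StackOverflow! As Ann Zen said in the comments, I think your problem is that in edits() you are trying to read lines from a file opened in 'w+' mode, which truncates (that is, empties) the file before you can do anything with it.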
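Here is a minimal sketch of that truncation, using a throwaway file in a temporary directory. The file name demo.txt and the helper read_after_open are made up for illustration and are not part of the original code. It writes a line of text, reopens the file with the mode under test, and returns whatever can still be read.

```python
import os
import tempfile


def read_after_open(mode):
    """Write sample text to a scratch file, reopen it with `mode`, and return what can be read."""
    with tempfile.TemporaryDirectory() as scratch_dir:
        path = os.path.join(scratch_dir, "demo.txt")  # hypothetical file name
        with open(path, "w") as handle:
            handle.write("blue sky\n")
        # Reopen with the mode under test; 'w+' truncates the file as soon as it is opened.
        with open(path, mode) as handle:
            return handle.read()


print(repr(read_after_open("r")))   # the text is still there
print(repr(read_after_open("w+")))  # the file was emptied on open, so nothing is read
```

With 'r' (or 'r+') the text survives; with 'w+' the read returns an empty string, which is why the loop in edits() never runs.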
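If you really want to modify this file in two separate functions, I would also suggest using an intermediate file. That would give something like: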
import re

def setup():
    input_file_name = r'C:\Users\Phillipos Admasu\Documents\India\india_work.txt'
    intermediate_file_name = r'C:\Users\Phillipos Admasu\Documents\India\india_work_inter.txt'
    with open(input_file_name) as input_file:
        with open(intermediate_file_name, 'w') as intermediate_file:
            for current_line in input_file:
                intermediate_file.write(re.sub(r'\s', '\n', current_line))

def edits():
    intermediate_file_name = r'C:\Users\Phillipos Admasu\Documents\India\india_work_inter.txt'
    output_file_name = r'C:\Users\Phillipos Admasu\Documents\India\india_work_final.txt'
    with open(intermediate_file_name) as intermediate_file:
        with open(output_file_name, "w") as output_file:
            for current_line in intermediate_file:
                output_file.write(re.sub(r'blue', 'TEST', current_line))

if __name__ == '__main__':
    setup()
    edits()
But instead of a temporary file, you can use io.StringIO to get a temporary in-memory buffer. That would give something like:
import re
import io

temporary_buffer = io.StringIO()

def setup():
    input_file_name = r'C:\Users\Phillipos Admasu\Documents\India\india_work.txt'
    with open(input_file_name) as input_file:
        for current_line in input_file:
            temporary_buffer.write(re.sub(r'\s', '\n', current_line))

def edits():
    output_file_name = r'C:\Users\Phillipos Admasu\Documents\India\india_work_final.txt'
    with open(output_file_name, "w") as output_file:
        output_text = temporary_buffer.getvalue()
        output_text = re.sub('blue', 'TEST', output_text)
        output_file.write(output_text)

if __name__ == '__main__':
    setup()
    edits()
    temporary_buffer.close()
But the solution I prefer is to iterate over the file and, for each line, make all the necessary modifications before writing that line to the output file:
import re

input_file_name = r'C:\Users\Phillipos Admasu\Documents\India\india_work.txt'
output_file_name = r'C:\Users\Phillipos Admasu\Documents\India\india_work_final.txt'

matchers = (
    (re.compile(r'\s'), '\n'),
    (re.compile(r'blue'), 'TEST')
)

with open(input_file_name) as input_file:
    with open(output_file_name, 'w') as output_file:
        for line in input_file:
            output_line = line
            for current_matcher, replacement_string in matchers:
                output_line = current_matcher.sub(replacement_string, output_line)
            output_file.write(output_line)
By the way, as Karl Knechtel said in the comments, do not assign a value to the name input, because it is already bound to the built-in input function. In the scope where you assign it, you lose access to that function, and it will also confuse other people who read your code.
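As a rough illustration of that last point, the sketch below rebinds input inside a function and then tries to call it as if it were still the built-in. The function name call_shadowed_input is hypothetical, and an io.StringIO object stands in for the opened file so the example does not need anything on disk.

```python
import io


def call_shadowed_input():
    """Rebind `input` locally, then try to use it as the built-in prompt function."""
    input = io.StringIO("blue sky\n")  # stands in for open(...); shadows the built-in here
    try:
        input("Press Enter to continue")  # this now calls the StringIO object, not the built-in
    except TypeError as error:
        return type(error).__name__
    finally:
        input.close()
    return "no error"


print(call_shadowed_input())  # prints TypeError
```

Picking a different name, such as input_file in the examples above, avoids the problem entirely.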