Python reading and writing a file adds unexpected characters when the script is interrupted

Question

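Here is my problem. I have a script with many steps; basically it opens a file, reads it, and after reading writes back to the same file. When the script runs to completion, everything is fine. The problem appears when some kind of exception occurs or the script is interrupted. I open the file in 'r+' mode because if I open it in 'w' mode the file becomes blank immediately, and if the script is interrupted it stays blank, whereas I want it to keep the previous value. Below is an example, though not the exact script I am running. If the script is interrupted (or an exception occurs, even one that is handled), the value in test.txt will be "myVar = 13e" or "myVar = 13ne". Not always, but often. Why does this happen, and how can I avoid it?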

import time
from test import myVar

file_path = "./test.py"
with open(file_path, 'r+', encoding='utf-8') as f:
    # read the file content, which is for example "myVar=11"
    # do calculations with myVar
    # str_to_oc = "myVar=" + str(row[0])  # row[0] is fetched from the database; it is the record's ID, an integer
    str_to_oc = "myVar=" + str(13)  # 13 is hardcoded here instead of the database row[0]
    time.sleep(3)  # just adding a delay so you can interrupt easily
    # write back the string "myVar=13", which holds the new value of 13
    f.write(str_to_oc)

Edited the code example to make it easier to test.

Answer 1

You are seeing the effects of buffering.

You can reduce the effect by adding a flush:

    f.write(str_to_oc)
    f.flush()

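CTRL/C arrives asynchronously, so this will not fully fix it. Also, if you insert or delete data so that the size of individual records and of the whole file changes, you will be unhappy with the misalignment of old and new records.

Behind the scenes, io.BufferedWriter occasionally requests a raw write, which becomes an OS-level syscall. You said that a CTRL/C or a fatal stack trace causes the program to terminate early. In that case the whole python interpreter process exits, which leads to an implicit close(), and that can leave a mix of old and new bytes to be read from your file. Note that a multi-byte UTF-8 code point can span disk blocks, which may lead to unhappiness.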
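One plausible way to end up with a mix of old and new bytes is that 'r+' overwrites in place and never shortens the file. The sketch below assumes the previous content was longer than the new one (the starting value `myVar=None` and the helper names are hypothetical, not from the question) and shows the effect of calling `truncate()` after the write.

```python
import os
import tempfile

def overwrite_in_place(path, new_text, truncate):
    """Overwrite the start of `path` with `new_text` using 'r+' mode."""
    with open(path, 'r+', encoding='utf-8') as f:
        f.write(new_text)      # writes at offset 0, over the old bytes
        if truncate:
            f.truncate()       # cut the file off at the current position
    with open(path, encoding='utf-8') as f:
        return f.read()

def demo(truncate):
    # Hypothetical starting content, longer than the replacement.
    fd, path = tempfile.mkstemp(suffix='.txt')
    with os.fdopen(fd, 'w', encoding='utf-8') as f:
        f.write('myVar=None')
    try:
        return overwrite_in_place(path, 'myVar=13', truncate)
    finally:
        os.remove(path)

print(demo(truncate=False))  # myVar=13ne
print(demo(truncate=True))   # myVar=13
```

Truncating removes the stale tail, but it does not make the update atomic: an interruption between the write and the truncate can still leave a partial result.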

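Given the observed reliability of the program, it sounds like you would do well to leave the original untouched until processing has completed successfully: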

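import os

tmp_path = file_path + '.tmp'
with open(file_path) as fin:
    with open(tmp_path, 'w') as fout:
        for line in fin:
            # (do stuff, compute output)
            out_line = line.rstrip('\n')  # placeholder: replace with the computed output
            fout.write(out_line + '\n')

# atomic on POSIX, all-or-nothing; on Windows os.rename fails if the
# target exists, so use os.replace there
os.rename(tmp_path, file_path)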
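The same idea can be wrapped in a small helper. This is only a sketch under a few assumptions: the name `atomic_write_text` is made up for illustration, the whole new content fits in memory, and the temporary file is created next to the target so that the final rename stays on one filesystem.

```python
import os
import tempfile

def atomic_write_text(path, text, encoding='utf-8'):
    """Write `text` to `path` so readers see the old or the new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target's directory (same filesystem).
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding=encoding) as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())    # push the data to disk before renaming
        os.replace(tmp_path, path)  # overwrites an existing target, also on Windows
    except BaseException:
        # Interrupted or failed: drop the temp file, leave the original alone.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling `atomic_write_text(file_path, "myVar=13")` would then either replace the file completely or leave the previous content in place. Catching `BaseException` is deliberate here so that a `KeyboardInterrupt` also triggers the cleanup before being re-raised.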

Answer 2

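A very naive solution is simply to read the file into memory, assuming it is as short as your example suggests, and rewrite the original content if an exception occurs. You could also use a temporary file to avoid clobbering the original, and only write to the real file on success.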
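A minimal sketch of the in-memory approach might look like the following. The function name `update_with_restore` and the `compute_new_text` callback are hypothetical, and the file is assumed to be small enough to hold in memory.

```python
def update_with_restore(path, compute_new_text, encoding='utf-8'):
    # Keep the original content in memory (assumes the file is small).
    with open(path, encoding=encoding) as f:
        original = f.read()
    try:
        new_text = compute_new_text(original)  # may raise or be interrupted
        with open(path, 'w', encoding=encoding) as f:
            f.write(new_text)
    except BaseException:
        # Put the old content back, then let the error propagate.
        with open(path, 'w', encoding=encoding) as f:
            f.write(original)
        raise
```

The new text is computed before the file is reopened in 'w' mode, which keeps the window in which the file is empty short. This only helps for errors Python can catch; a hard kill of the process gives the `except` block no chance to run.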

Answer 3

Just for anyone interested, I did something hack-ish and appended a comment marker to the string I write to the file:

str_to_oc = "myVar=" + str(13) + "#"
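The likely reason this works is that the file is imported as Python source, so any leftover characters after the `#` are treated as a comment. The sketch below illustrates that with a hypothetical helper, `parse_my_var`, and made-up leftover strings; it hides the stale bytes without removing them.

```python
def parse_my_var(source):
    """Execute a one-line assignment and return the value bound to myVar."""
    namespace = {}
    exec(source, namespace)
    return namespace['myVar']

# Hypothetical leftovers after overwriting a longer old line in place.
print(parse_my_var('myVar=13#ne'))   # 13: the stray 'ne' is inside a comment
try:
    parse_my_var('myVar=13ne')
except SyntaxError as exc:
    print('without the marker:', type(exc).__name__)
```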
