我有一个正在处理的多线程 Python 应用程序(在 Windows 10 x64、21H2 和 Python 3.11.1 上进行测试),我在日志中看到重复的消息。起初我以为它有一个守护线程没有被清理但是使用非守护线程并显式关闭我的日志记录处理程序并没有解决它。在没有任何变化地产生新线程之前,我尝试了
.join()
ing 线程。
消息看起来像这样:
Hello
thread1:23460:18652 2023-04-27 11:18:25,569 - threadTest - WARNING: Hello
thread1 is dead, remove from threads = {'thread1': <Thread(thread1, stopped 23460)>}
Hello
thread2:21108:18652 2023-04-27 11:18:30,580 - threadTest - WARNING: Hello
thread2:21108:18652 2023-04-27 11:18:30,580 - threadTest - WARNING: Hello
thread2 is dead, remove from threads = {'thread2': <Thread(thread2, stopped 21108)>}
...(and so on with three entries for thread3 etc)...
重现消息的代码如下。如果我添加一个
fileHandler
(即日志文件中每条消息的多个副本),我会看到相同的行为:
import time
import threading
import logging
import logging.config
def thread_worker():
name = "threadTest"
logger = logging.getLogger(name)
log_formatter = logging.Formatter(fmt='%(threadName)s:%(thread)d:%(process)d %(asctime)s - %(name)s - %(levelname)s: %(message)s')
console_handler = logging.StreamHandler()
console_handler.setFormatter(log_formatter)
logger.addHandler(console_handler)
print("Hello")
logger.warning("Hello")
time.sleep(2)
for handler in logger.handlers: # Maybe a duplicate handler? Let's close
handler.close()
return
def main():
threads: dict[str, threading.Thread] = dict()
i = 1
while True:
thread_test = threading.Thread(target = thread_worker, name = f"thread{str(i)}")
threads[f"thread{str(i)}"] = thread_test
thread_test.start()
time.sleep(5)
dead_threads: list[str] = list()
for thread_name, thread in threads.items():
if not thread.is_alive():
print(f"{thread_name} is dead, remove from {threads = }")
thread.join()
dead_threads.append(thread_name)
for thread_name in dead_threads:
_ = threads.pop(thread_name)
i += 1
if __name__ == "__main__":
main()
我希望有人能看到我确定是我遗漏的明显问题!
logging
模块有一个记录器层次结构,这样错误就可以从库/API/等传播出去(或者不使用 propagate
属性)和所以有一个跨线程的根记录器。任何记录器也可以附加任意数量的给定处理程序类型。结合这两个,我的代码尽职尽责地获取名为threadTest
的记录器,然后向其添加另一个处理程序(正如@TimRoberts在我写这个答案时在评论中指出的那样)。
更改
for handler in logger.handlers:
部分以打印出每个处理程序:
for handler in logger.handlers:
print(f"{handler = }, {type(handler)}")
我们受到欢迎:
Hello
thread1:28572:25364 2023-04-27 13:41:06,847 - threadTest - WARNING: Hello
handler = <StreamHandler <stderr> (NOTSET)>
thread1 is dead, remove from threads = {'thread1': <Thread(thread1, stopped 28572)>}
Hello
thread2:26680:25364 2023-04-27 13:41:11,863 - threadTest - WARNING: Hello
thread2:26680:25364 2023-04-27 13:41:11,863 - threadTest - WARNING: Hello
handler = <StreamHandler <stderr> (NOTSET)>
handler = <StreamHandler <stderr> (NOTSET)>
🤦u200d♂️ 实际的应用程序被分成很多文件,日志记录作为一个模块引入,所以我没有考虑
logging
是如何工作的。最小的例子让它变得更加清晰,那是我查阅logging
文档的时候。
我的最终修复是确保在线程中运行的模块清理时删除处理程序或检查给定的处理程序类型是否已经存在(我可以摆脱这个,因为架构设计为每种类型只有一个处理程序) .