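I want to apply a filter to the urllib3 logger used by the requests module so that sensitive information is redacted from every log string. For some reason, my filter is not applied to the urllib3.connectionpool logger when the logging is triggered by requests.get().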
    import logging
    import re
    import requests

    class Redactor(logging.Filter):
        """Filter subclass to redact patterns from logs."""
        redact_replacement_string = "<REDACTED_INFO>"

        def __init__(self, patterns: list[re.Pattern] = None):
            super().__init__()
            self.patterns = patterns or list()

        def filter(self, record: logging.LogRecord) -> bool:
            """
            Overriding the original filter method to redact, rather than filter.
            :return: Always true - i.e. always apply filter
            """
            for pattern in self.patterns:
                record.msg = pattern.sub(self.redact_replacement_string, record.msg)
            return True

    # Set log level
    urllib_logger = logging.getLogger("urllib3.connectionpool")
    urllib_logger.setLevel("DEBUG")

    # Add handler
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("logger name: {name} | message: {message}", style="{"))
    urllib_logger.addHandler(handler)

    # Add filter
    urllib_logger.info("Sensitive string before applying filter: www.google.com")
    sensitive_patterns = [re.compile(r"google")]
    redact_filter = Redactor(sensitive_patterns)
    urllib_logger.addFilter(redact_filter)
    urllib_logger.info("Sensitive string after applying filter: www.google.com")

    # Perform a request that's supposed to use the filtered logger
    requests.get("https://www.google.com")

    # Check if the logger has been reconfigured
    urllib_logger.info("Sensitive string after request: www.google.com")
With this code, the handler is applied to all log strings, but the filter is not applied to the log strings emitted by requests.get():
logger name: urllib3.connectionpool | message: Sensitive string before applying filter: www.google.com
logger name: urllib3.connectionpool | message: Sensitive string after applying filter: www.<REDACTED_INFO>.com
logger name: urllib3.connectionpool | message: Starting new HTTPS connection (1): www.google.com:443
logger name: urllib3.connectionpool | message: https://www.google.com:443 "GET / HTTP/1.1" 200 None
logger name: urllib3.connectionpool | message: Sensitive string after request: www.<REDACTED_INFO>.com
I expected the sensitive pattern ("google") to be redacted everywhere:
logger name: urllib3.connectionpool | message: Sensitive string before applying filter: www.google.com
logger name: urllib3.connectionpool | message: Sensitive string after applying filter: www.<REDACTED_INFO>.com
logger name: urllib3.connectionpool | message: Starting new HTTPS connection (1): www.<REDACTED_INFO>.com:443
logger name: urllib3.connectionpool | message: https://www.<REDACTED_INFO>.com:443 "GET / HTTP/1.1" 200 None
logger name: urllib3.connectionpool | message: Sensitive string after request: www.<REDACTED_INFO>.com
I tried adding the filter to every registered logger:

    all_loggers = [logger for logger in logging.root.manager.loggerDict.values()
                   if not isinstance(logger, logging.PlaceHolder)]
    for logger in all_loggers:
        logger.addFilter(redact_filter)
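I also tried applying the filter to the handler rather than the logger, since the handler does seem to be applied to all log strings. Still no luck.
I know I could subclass Formatter and do the redaction there, but I think formatting and redacting are two separate concerns and I would like to keep them apart. It would also be nice to understand the logic in the logging module that produces the result I am getting.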
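That is because the record passed to your filter function has not been formatted yet. At that point record.msg holds only the format string; the URL you want to redact is in record.args.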
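To make this concrete, here is a small sketch that builds a LogRecord by hand and prints its parts. The format string is modeled on the output shown in the question; the exact template urllib3 passes to the logger is an assumption, as are the logger name "demo" and the path "example.py".

```python
import logging

# Build a record the way a call such as
# logger.debug("... (%d): %s:%s", 1, host, port) would.
# The format string below is an assumed stand-in for what urllib3 uses.
record = logging.LogRecord(
    name="demo",
    level=logging.DEBUG,
    pathname="example.py",
    lineno=1,
    msg="Starting new HTTPS connection (%d): %s:%s",
    args=(1, "www.google.com", 443),
    exc_info=None,
)

print(record.msg)           # the unformatted template, no host name in it
print(record.args)          # the values, including the host name
print(record.getMessage())  # template and args merged into the final text
```

A pattern applied to record.msg alone never sees the host name, which matches the behaviour described in the question. The hand-written info() calls were redacted only because their whole text was passed as the message with no arguments.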
You need to apply the redaction after the final message has been built.
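One possible way to do that while keeping the redaction in a Filter is to build the final message inside filter() with record.getMessage(), redact it, and then store it back with the arguments cleared. This is a minimal sketch rather than a definitive implementation; the logger name "redactor_demo", the in-memory stream, and the sample log call are assumptions made for illustration.

```python
import io
import logging
import re


class Redactor(logging.Filter):
    """Redact patterns from the fully built log message."""

    redact_replacement_string = "<REDACTED_INFO>"

    def __init__(self, patterns=None):
        super().__init__()
        self.patterns = patterns or []

    def filter(self, record):
        # getMessage() merges record.msg with record.args.
        message = record.getMessage()
        for pattern in self.patterns:
            message = pattern.sub(self.redact_replacement_string, message)
        # Store the redacted text and clear args so that later
        # formatting does not try to merge the arguments again.
        record.msg = message
        record.args = ()
        return True  # never drop the record, only rewrite it


# Demo on a hypothetical logger, capturing the output in memory.
stream = io.StringIO()
demo_logger = logging.getLogger("redactor_demo")
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False
demo_handler = logging.StreamHandler(stream)
demo_handler.setFormatter(logging.Formatter("{message}", style="{"))
demo_logger.addHandler(demo_handler)
demo_logger.addFilter(Redactor([re.compile(r"google")]))

# The host name is passed as an argument, as a library would do.
demo_logger.debug("Starting new HTTPS connection (%d): %s:%s", 1, "www.google.com", 443)
output = stream.getvalue()
print(output)
```

One trade-off of this approach is that the record is modified in place, so any handler that processes it afterwards sees the redacted text and no longer has access to the original arguments.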