Redacting sensitive information from the urllib3 logger


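I want to apply a filter to the urllib3 loggers used by the requests module so that sensitive information is redacted from all log strings. For some reason, my filter is not applied to the urllib3.connectionpool logger when requests.get() is called.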

Reproducible example

import logging
import re
import requests


class Redactor(logging.Filter):
    """Filter subclass to redact patterns from logs."""
    redact_replacement_string = "<REDACTED_INFO>"

    def __init__(self, patterns: list[re.Pattern] = None):
        super().__init__()
        self.patterns = patterns or list()

    def filter(self, record: logging.LogRecord) -> bool:
        """
        Overriding the original filter method to redact, rather than filter.
        :return: Always true - i.e. always apply filter
        """
        for pattern in self.patterns:
            record.msg = pattern.sub(self.redact_replacement_string, record.msg)
        return True

# Set log level
urllib_logger = logging.getLogger("urllib3.connectionpool")
urllib_logger.setLevel("DEBUG")

# Add handler
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("logger name: {name} | message: {message}", style="{"))
urllib_logger.addHandler(handler)

# Add filter
urllib_logger.info("Sensitive string before applying filter: www.google.com")
sensitive_patterns = [re.compile(r"google")]
redact_filter = Redactor(sensitive_patterns)
urllib_logger.addFilter(redact_filter)
urllib_logger.info("Sensitive string after applying filter: www.google.com")

# Perform a request that's supposed to use the filtered logger
requests.get("https://www.google.com")

# Check if the logger has been reconfigured
urllib_logger.info("Sensitive string after request: www.google.com")

The result of this code is that the handler is applied to all log strings, but the filter is not applied to the log strings emitted by requests.get():

logger name: urllib3.connectionpool | message: Sensitive string before applying filter: www.google.com
logger name: urllib3.connectionpool | message: Sensitive string after applying filter: www.<REDACTED_INFO>.com
logger name: urllib3.connectionpool | message: Starting new HTTPS connection (1): www.google.com:443
logger name: urllib3.connectionpool | message: https://www.google.com:443 "GET / HTTP/1.1" 200 None
logger name: urllib3.connectionpool | message: Sensitive string after request: www.<REDACTED_INFO>.com

What I expected

I expected the sensitive pattern ("google") to be redacted everywhere:

logger name: urllib3.connectionpool | message: Sensitive string before applying filter: www.google.com
logger name: urllib3.connectionpool | message: Sensitive string after applying filter: www.<REDACTED_INFO>.com
logger name: urllib3.connectionpool | message: Starting new HTTPS connection (1): www.<REDACTED_INFO>.com:443
logger name: urllib3.connectionpool | message: https://www.<REDACTED_INFO>.com:443 "GET / HTTP/1.1" 200 None
logger name: urllib3.connectionpool | message: Sensitive string after request: www.<REDACTED_INFO>.com

What I tried

  1. I tried applying the same filter to the root logger, the "urllib3" logger, and all existing loggers, and got the same result (like this):
all_loggers = [logger for logger in logging.root.manager.loggerDict.values()
               if not isinstance(logger, logging.PlaceHolder)]

for logger in all_loggers:
    logger.addFilter(redact_filter)
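  2. I tried applying the filter to the handler instead of the logger, since the handler seems to be applied to all log strings. Still no luck.

  3. I know I could subclass Formatter and do the redaction there, but I consider formatting and redaction to be two separate concerns and want to keep them apart. It would also be good to understand the logic in the logging module that produces the result I am getting.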

python python-requests urllib3 python-logging redaction
1 Answer

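That is because the record passed to your filter has not been formatted yet. urllib3 logs with a format string plus arguments, so record.msg holds only the template with its %s placeholders, and the URL you want to redact is in record.args. Your own logger.info() calls were redacted because the whole string was passed as the message, with no arguments. You need to apply the redaction after the final message has been built.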
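One way to do this while keeping redaction in a filter is to build the final message inside the filter with record.getMessage(), redact that, and write it back. The following is a minimal sketch under that assumption, not the only possible fix. The class name FormattedRedactor and the StringIO stream are illustrative choices, and the debug call only imitates the style of urllib3's logging (a template plus arguments) so that the example runs without a network request.

```python
import io
import logging
import re


class FormattedRedactor(logging.Filter):
    """Redact patterns from the fully formatted message (msg merged with args)."""

    redact_replacement_string = "<REDACTED_INFO>"

    def __init__(self, patterns=None):
        super().__init__()
        self.patterns = list(patterns or [])

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() merges record.args into record.msg, so values passed
        # as arguments (such as a host name) become visible to the patterns.
        message = record.getMessage()
        for pattern in self.patterns:
            message = pattern.sub(self.redact_replacement_string, message)
        # Store the redacted text and clear args so it is not formatted again.
        record.msg = message
        record.args = ()
        return True  # never drop the record, only rewrite it


# Capture output in memory so the result can be inspected (illustrative only).
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("logger name: {name} | message: {message}", style="{")
)

urllib_logger = logging.getLogger("urllib3.connectionpool")
urllib_logger.setLevel(logging.DEBUG)
urllib_logger.addHandler(handler)
urllib_logger.addFilter(FormattedRedactor([re.compile(r"google")]))

# Imitates a library call that passes the host as an argument, not in the template.
urllib_logger.debug(
    "Starting new HTTPS connection (%d): %s:%s", 1, "www.google.com", 443
)

print(stream.getvalue(), end="")
```

Note that a filter attached to a logger only sees records logged directly on that logger, not records that propagate up from child loggers. To cover several loggers with one filter, it can be attached to the handler instead with handler.addFilter(). Because the filter mutates the record, every handler that processes the record afterwards sees the redacted text.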
