I plan to run Flask behind Gunicorn on Kubernetes. For proper logging I want all logs emitted in JSON format.
Currently I am testing with minikube and https://github.com/inovex/kubernetes-logging to collect the logs with fluentd.
I managed to get the error logs (tracebacks) formatted correctly, thanks to: JSON-formatted logging with Flask and Gunicorn.
I am still struggling with the access log format. I specified the following Gunicorn access log format:
access_log_format = '{"remote_ip":"%(h)s","request_id":"%({X-Request-Id}i)s","response_code":"%(s)s","request_method":"%(m)s","request_path":"%(U)s","request_querystring":"%(q)s","request_timetaken":"%(D)s","response_length":"%(B)s"}'
The resulting log is JSON, but the message part (formatted per access_log_format) now contains escaped double quotes and is not parsed into its separate fields by Fluentd / ELK:
{"tags": [], "timestamp": "2017-12-07T11:50:20.362559Z", "level": "INFO", "host": "ubuntu", "path": "/usr/local/lib/python2.7/dist-packages/gunicorn/glogging.py", "message": "{\"remote_ip\":\"127.0.0.1\",\"request_id\":\"-\",\"response_code\":\"200\",\"request_method\":\"GET\",\"request_path\":\"/v1/records\",\"request_querystring\":\"\",\"request_timetaken\":\"19040\",\"response_length\":\"20\"}", "logger": "gunicorn.access"}
Thanks, Jpw
The simplest solution is to swap the outer single quotes for double quotes and the inner double quotes for single quotes, as shown below.
--access-logformat "{'remote_ip':'%(h)s','request_id':'%({X-Request-Id}i)s','response_code':'%(s)s','request_method':'%(m)s','request_path':'%(U)s','request_querystring':'%(q)s','request_timetaken':'%(D)s','response_length':'%(B)s'}"
Here are some sample log lines:
{'remote_ip':'127.0.0.1','request_id':'-','response_code':'404','request_method':'GET','request_path':'/test','request_querystring':'','request_timetaken':'6642','response_length':'233'}
{'remote_ip':'127.0.0.1','request_id':'-','response_code':'200','request_method':'GET','request_path':'/','request_querystring':'','request_timetaken':'881','response_length':'20'}
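A caveat worth noting: with the quotes swapped, each line is a Python dict literal rather than valid JSON, so a strict JSON parser (e.g. fluentd's json parser) will still reject it. A stdlib-only sketch of one way to recover such lines downstream:

```python
import ast
import json

# Single-quoted output is Python dict-literal syntax, not JSON.
# ast.literal_eval() can parse it safely; json.dumps() re-emits real JSON.
line = "{'remote_ip':'127.0.0.1','request_id':'-','response_code':'404'}"
record = ast.literal_eval(line)
print(json.dumps(record))
```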
I am looking for something that works in a logging config file. Also, I would rather not build the JSON format by hand.
Solution:
All of Gunicorn's logging atoms are available in the log record's args, so let's take the fields from there and let pythonjsonlogger do the rest for us.
Formatter class
from pythonjsonlogger.jsonlogger import JsonFormatter


class GunicornLogFormatter(JsonFormatter):
    def add_fields(self, log_record, record, message_dict):
        """
        This method allows us to inject gunicorn's args as fields for the formatter
        """
        super(GunicornLogFormatter, self).add_fields(log_record, record, message_dict)
        for field in self._required_fields:
            if field in self.rename_fields:
                log_record[self.rename_fields[field]] = record.args.get(field)
            else:
                log_record[field] = record.args.get(field)
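The mechanism this relies on can be sketched with the stdlib alone (JsonAccessFormatter and its field map below are hypothetical names, not part of pythonjsonlogger): Gunicorn logs the access line with the atoms dict as the record's single argument, so any formatter can read the fields straight out of record.args.

```python
import json
import logging

class JsonAccessFormatter(logging.Formatter):
    """Stdlib-only sketch: read gunicorn's atoms out of record.args."""

    # Hypothetical field selection; map any gunicorn atoms to output names.
    fields = {"h": "remote_ip", "s": "response_code", "U": "request_path"}

    def format(self, record):
        atoms = record.args if isinstance(record.args, dict) else {}
        return json.dumps({name: atoms.get(atom) for atom, name in self.fields.items()})

# logging unwraps a single mapping argument into record.args
record = logging.LogRecord(
    "gunicorn.access", logging.INFO, __file__, 0,
    "ignored format string",
    ({"h": "127.0.0.1", "s": "200", "U": "/v1/records"},), None)
print(JsonAccessFormatter().format(record))
```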
Example logging config file
{
"version": 1,
"disable_existing_loggers": false,
"formatters": {
"gunicorn_json": {
"()": "GunicornLogFormatter",
"format": "%(h)s %(r)s %({x-request-id}i)s",
"datefmt": "%Y-%m-%dT%H:%M:%S%z",
"rename_fields": {
"{x-request-id}i": "request_id",
"r": "request"
}
}
},
"handlers": {
"json-gunicorn-console": {
"class": "logging.StreamHandler",
"level": "INFO",
"formatter": "gunicorn_json",
"stream": "ext://sys.stdout"
}
},
"loggers": {
"gunicorn.access": {
"level": "INFO",
"handlers": [
"json-gunicorn-console"
]
}
}
}
Sample logs
{"h": "127.0.0.1", "request": "GET /login?next=/ HTTP/1.1", "request_id": null}
{"h": "127.0.0.1", "request": "GET /static/css/style.css HTTP/1.1", "request_id": null}
{"h": "127.0.0.1", "request": "GET /some/random/path HTTP/1.1", "request_id": null}
{"h": "127.0.0.1", "request": "GET /some/random/path HTTP/1.1", "request_id": "123123123123123123"}
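To point Gunicorn at this config, one option is a sketch using Gunicorn's logconfig_dict setting; it assumes the JSON above is saved as log_config.json next to gunicorn.conf.py, and that GunicornLogFormatter is importable under the bare name given in the "()" key (a dotted path such as myapp.logging.GunicornLogFormatter is the usual form):

```python
# gunicorn.conf.py -- sketch; file name and paths are assumptions
import json

with open("log_config.json") as f:
    logconfig_dict = json.load(f)

bind = "0.0.0.0:5000"
accesslog = "-"  # keep access logging enabled so gunicorn.access emits records
```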
Alternatively, escape the double quotes (\") in the value of --access-logformat to keep the log lines valid JSON. So if you run Gunicorn in a Docker container, your Dockerfile might end with something like:
CMD ["gunicorn", \
"-b", "0.0.0.0:5000", \
"--access-logfile", "-",\
"--access-logformat", "{\"remote_ip\":\"%(h)s\",\"request_id\":\"%({X-Request-Id}i)s\",\"response_code\":\"%(s)s\",\"request_method\":\"%(m)s\",\"request_path\":\"%(U)s\",\"request_querystring\":\"%(q)s\",\"request_timetaken\":\"%(D)s\",\"response_length\":\"%(B)s\"}", \
"app:create_app()"]
When using your example in the gunicorn config file:
access_log_format = '{"remote_ip":"%(h)s","request_id":"%({X-Request-Id}i)s","response_code":"%(s)s","request_method":"%(m)s","request_path":"%(U)s","request_querystring":"%(q)s","request_timetaken":"%(D)s","response_length":"%(B)s"}'
I get the desired behaviour of the message being read as JSON and merged with the fluentd JSON data. However, the gunicorn fields are not populated:
{"tags": [], "level": "INFO", "host": "ubuntu", "logger": "gunicorn.access", "remote_ip":"%(h)s","request_id":"%({X-Request-Id}i)s","response_code":"%(s)s","request_method":"%(m)s","request_path":"%(U)s","request_querystring":"%(q)s","request_timetaken":"%(D)s","response_length":"%(B)s"}
It looks like the cause is that Gunicorn passes access_log_format as the message to the logger, with all the atoms (safe_atoms) as an additional argument, e.g. in gunicorn/glogging.py:
safe_atoms = self.atoms_wrapper_class(
    self.atoms(resp, req, environ, request_time)
)
try:
    # safe_atoms = {"s": "200", "m": "GET", ...}
    self.access_log.info(self.cfg.access_log_format, safe_atoms)
However, if FluentRecordFormatter sees the string as valid JSON, it reads it with json.loads and ignores any args that were passed, in fluent/handler.py:
def _format_msg_json(self, record, msg):
    try:
        json_msg = json.loads(str(msg))  # <------- doesn't merge params
        if isinstance(json_msg, dict):
            return json_msg
        else:
            return self._format_msg_default(record, msg)
    except ValueError:
        return self._format_msg_default(record, msg)
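This can be seen directly: the unexpanded format string is itself valid JSON, so json.loads() succeeds and returns the template with the %(h)s placeholders still in place, which is exactly the unpopulated output shown above.

```python
import json

# The raw access_log_format parses as JSON before any args are merged in,
# so the placeholders survive unexpanded.
msg = '{"remote_ip":"%(h)s","response_code":"%(s)s"}'
parsed = json.loads(msg)
print(parsed)
```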
Compare this to the default Python formatter, which calls record.message = record.getMessage(), which in turn merges the args into the message, in logging/__init__.py:
def getMessage(self):
    """
    Return the message for this LogRecord.

    Return the message for this LogRecord after merging any user-supplied
    arguments with the message.
    """
    msg = str(self.msg)
    if self.args:
        msg = msg % self.args  # <------ args get merged in
    return msg
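In a nutshell: the access log format is the msg, the atoms dict is the single mapping argument, and %-formatting does the merge.

```python
# What getMessage() effectively does for a gunicorn access record:
msg = '{"remote_ip":"%(h)s","response_code":"%(s)s"}'
atoms = {"h": "127.0.0.1", "s": "200"}
merged = msg % atoms
print(merged)
```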
I have logged an issue against the fluent-logger-python project.
Workaround 1: perform the merge before the record is passed to FluentRecordFormatter:
logger = logging.getLogger('fluent.test')


class ContextFilter(logging.Filter):
    def filter(self, record):
        record.msg = record.msg % record.args
        return True


fluent_handler = handler.FluentHandler('app.follow', host='localhost', port=24224)
formatter = handler.FluentRecordFormatter()
fluent_handler.setFormatter(formatter)
merge_filter = ContextFilter()
fluent_handler.addFilter(merge_filter)
logger.addHandler(fluent_handler)
Edit: the log filter does not work:
ValueError: unsupported format character ';' (0x3b) at index 166
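The failure is reproducible in isolation: once the args have been merged, the text can contain literal % sequences (URL-encoded query strings, for example), and a second %-formatting pass then chokes on them.

```python
# A merged message containing a literal % sequence, as a querystring might.
merged = "GET /search?q=a%;b"
try:
    merged % {}  # the second %-formatting pass attempted by the formatter
    err = None
except ValueError as exc:
    err = str(exc)
print(err)
```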
It turns out FluentRecordFormatter does call the base getMessage implementation, merging the args into the message:

def format(self, record):
    # Compute attributes handled by parent class.
    super(FluentRecordFormatter, self).format(record)  # <------ record.message = record.msg % record.args
    # Add ours
    record.hostname = self.hostname
    # Apply format
    data = self._formatter(record)
    self._structuring(data, record)
    return data
The problem is that _format_msg_json(self, record, msg) uses the record.msg attribute, which is the unmerged data, whereas record.message holds the merged data. This creates a problem: my log filter merges/formats the data, but the log formatter then tries to do the same thing again and occasionally hits invalid format syntax.
Workaround 2: don't use JSON
<filter *.gunicorn.access>
@type parser
key_name message
reserve_time true
reserve_data true
remove_key_name_field true
hash_value_field access_log
<parse>
@type regexp
expression /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"$/
time_format %d/%b/%Y:%H:%M:%S %z
</parse>
</filter>
You can read about what these options do here: https://docs.fluentd.org/filter/parser
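As a quick sanity check of the expression (note it is Ruby regexp syntax: fluentd's (?&lt;name&gt;...) named groups become (?P&lt;name&gt;...) in Python's re module), it matches a line in Gunicorn's default access log format; the sample values below are assumptions for illustration:

```python
import re

# Python translation of the fluentd <parse> expression above.
pattern = re.compile(
    r'^(?P<host>[^ ]*) [^ ]* (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^ ]*) +\S*)?" (?P<code>[^ ]*) (?P<size>[^ ]*) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$')

# A line in gunicorn's default (Apache-style) access log format.
line = '127.0.0.1 - - [07/Dec/2017:11:50:20 +0000] "GET /v1/records HTTP/1.1" 200 20 "-" "curl/7.47.0"'
m = pattern.match(line)
print(m.group("method"), m.group("path"), m.group("code"))
```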