I want a regular expression for Fluentd that parses the nginx error log.
A sample line is:
2024/04/15 09:06:29 [error] 3443790#3443790: *176070165 limiting requests, excess: 2.957 by zone "RequestLimitForCommonApi", client: 77.81.151.129, server: test.com, request: "POST /capi/session/forgot HTTP/1.1", host: "test.com", referrer: "https://test.com/"
I use the following format to match the log fields:
format1 /^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<log_level>\w+)\] (?<pid>\d+).(?<tid>\d+): (?<error>.*), (?<client>.*), (?<server>.*), (?<request>.*), (?<host>.*), (?<referrer>.*)/
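However, some log lines contain an "upstream" field and others do not.
What regular expression should I use to capture the "upstream" value when it is present?
An example log line that contains "upstream":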
2024/04/15 02:01:32 [error] 3443790#3443790: *172976982 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 86.55.16.251, server: test.com, request: "POST /api/test HTTP/1.1", upstream: "http://127.0.0.1:30110/api/test", host: "test.com", referrer: "https://test.com/"
You can change your pattern like this:
/^(?<timestamp>\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2}) \[(?<log_level>\w+)\] (?<pid>\d+)#(?<tid>\d+): \*(?<connection>\d+) (?<message>.+?), client: (?<client_ip>\d+\.\d+\.\d+\.\d+), server: (?<server>\S+), request: "(?<request_method>\w+) (?<request>[^"]+) HTTP\/(?<http_version>\d\.\d)"(?:, upstream: "(?<upstream>[^"]+)")?, host: "(?<host>[^"]+)", referrer: "(?<referrer>[^"]+)"/
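Try this pattern. If the optional field is present, its capture group is filled; if it is absent, the group is left empty.
As another example, the pattern below matches the client and request fields and tolerates an extra segment such as "test" between them.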
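One way to check an optional-group pattern before putting it into the Fluentd config is to run it against both sample lines in a short script. The sketch below is in Python, so named groups are written `(?P<name>...)` rather than `(?<name>...)`. It assumes the optional field is the `upstream: "..."` segment that sits between `request` and `host` in the second sample line; the names `NGINX_ERROR` and `parse` are made up for this illustration.

```python
import re

# Python spells named groups (?P<name>...); the structure otherwise mirrors
# a Fluentd-style pattern. The upstream segment is wrapped in an optional
# non-capturing group, so lines without it still match.
NGINX_ERROR = re.compile(
    r'^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) '
    r'\[(?P<log_level>\w+)\] (?P<pid>\d+)#(?P<tid>\d+): '
    r'\*(?P<connection>\d+) (?P<message>.+?), '
    r'client: (?P<client_ip>\d+\.\d+\.\d+\.\d+), server: (?P<server>\S+), '
    r'request: "(?P<request_method>\w+) (?P<request>[^"]+) '
    r'HTTP/(?P<http_version>\d\.\d)"'
    r'(?:, upstream: "(?P<upstream>[^"]+)")?'
    r', host: "(?P<host>[^"]+)", referrer: "(?P<referrer>[^"]+)"'
)

# The two sample lines from the question.
WITHOUT_UPSTREAM = (
    '2024/04/15 09:06:29 [error] 3443790#3443790: *176070165 limiting requests, '
    'excess: 2.957 by zone "RequestLimitForCommonApi", client: 77.81.151.129, '
    'server: test.com, request: "POST /capi/session/forgot HTTP/1.1", '
    'host: "test.com", referrer: "https://test.com/"'
)
WITH_UPSTREAM = (
    '2024/04/15 02:01:32 [error] 3443790#3443790: *172976982 upstream timed out '
    '(110: Connection timed out) while reading response header from upstream, '
    'client: 86.55.16.251, server: test.com, request: "POST /api/test HTTP/1.1", '
    'upstream: "http://127.0.0.1:30110/api/test", host: "test.com", '
    'referrer: "https://test.com/"'
)


def parse(line):
    """Return a dict of named groups, or None if the line does not match."""
    m = NGINX_ERROR.match(line)
    return m.groupdict() if m else None


a = parse(WITHOUT_UPSTREAM)
b = parse(WITH_UPSTREAM)
print(a["upstream"])  # None: the optional group did not participate
print(b["upstream"])  # http://127.0.0.1:30110/api/test
```

The lazy `.+?` in `message` matters here: the first sample has a comma inside the message (`limiting requests, excess: ...`), so the message has to grow until `, client: <ip>` can match rather than stopping at the first comma.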
client: (?<client>[^,]+),( test: )?((?<test>[^,]+),)? request: (?<request>[^,]+)
As you can see, "test:" and whatever follows it are optional.
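The optional-segment idea can also be tried on its own. This is a minimal Python sketch, assuming a made-up `test:` key between `client` and `request`; the group names and the two input strings are illustrative only, and the key and its value are folded into a single optional group for brevity.

```python
import re

# The whole " test: <value>," segment is optional as one unit.
OPTIONAL_SEGMENT = re.compile(
    r'client: (?P<client>[^,]+),'
    r'(?: test: (?P<test>[^,]+),)?'
    r' request: (?P<request>[^,]+)'
)

with_test = OPTIONAL_SEGMENT.search(
    'client: 10.0.0.1, test: abc, request: "GET / HTTP/1.1"'
)
without_test = OPTIONAL_SEGMENT.search(
    'client: 10.0.0.1, request: "GET / HTTP/1.1"'
)

print(with_test.group("test"))     # abc
print(without_test.group("test"))  # None
```

Making the key and its value optional together avoids the case where the key is skipped but the value group still tries to consume the next field.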