Below is my Nginx log format:
log_format timed_combined '$http_x_forwarded_for - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" ' '$request_time $upstream_response_time $pipe';
Below is a sample Nginx log entry (for reference):
- - test.user [26/May/2017:21:54:26 +0000] "POST /elasticsearch/_msearch HTTP/1.1" 200 263 "https://myserver.com/app/kibana" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 0.020 0.008 .
Below are my Logstash grok patterns:
NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip} - - \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time} %{NUMBER:upstream_time}
Error found in the Logstash log:
"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"26/May/2017:19:28:14 -0400\" is malformed at \"/May/2017:19:28:14 -0400\"
Issue: - Nginx logs are not getting grokked.
Requirement: - Timestamp should be filtered into a particular field.
What is wrong with my configuration? How do I fix this error?
Here is a filter configuration for the NGINX access.log and error.log files.
filter {
  ############################# NGINX ##############################
  if [event][module] == "nginx" {
    ########## access.log ##########
    if [fileset][name] == "access" {
      grok {
        match => { "message" => ["%{IPORHOST:ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] \"%{WORD:http_method} %{DATA:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent_bytes} \"%{DATA:referrer}\" \"%{DATA:agent}\""] }
        remove_field => "message"
      }
      # Parse the access-log time (e.g. 26/May/2017:21:54:26 +0000) into @timestamp
      date {
        match => ["time", "dd/MMM/YYYY:HH:mm:ss Z"]
        target => "@timestamp"
        remove_field => "time"
      }
      useragent {
        source => "agent"
        target => "user_agent"
        remove_field => "agent"
      }
      geoip {
        source => "ip"
        target => "geoip"
      }
    }
    ########## error.log ##########
    else if [fileset][name] == "error" {
      grok {
        match => { "message" => ["%{DATA:time} \[%{DATA:log_level}\] %{NUMBER:pid}#%{NUMBER:tid}: (\*%{NUMBER:connection_id} )?%{GREEDYDATA:messageTmp}"] }
        remove_field => "message"
      }
      date {
        match => ["time", "YYYY/MM/dd HH:mm:ss"]
        target => "@timestamp"
        remove_field => "time"
      }
      mutate {
        rename => {"messageTmp" => "message"}
      }
    }
    # mutate rather than a grok without "match", so the removal does not depend on a grok match
    mutate {
      remove_field => ["[event]"]
    }
    mutate {
      add_field => {"serviceName" => "nginx"}
    }
  }
}
A similar configuration for Tomcat: https://gist.github.com/petrov9/4740c61459a5dcedcef2f27c7c2900fd
The log line you provided does not match the default NGINXACCESS grok pattern because of two differences:
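1. The pattern expects an IP address or hostname as the first element, but in your log line a dash (-) is the first element.
2. The third element in your log line is a username, but the pattern expects a dash (-).
So there are two ways to resolve this: change the Nginx log format so that the log lines match the default pattern, or change the grok pattern to something like this: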
NGINXACCESS - - %{USERNAME:username} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time} %{NUMBER:upstream_time}
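To show how the adjusted pattern could be wired in, here is a minimal filter sketch. It assumes the NGINXACCESS line above is saved in a pattern file under a directory such as /etc/logstash/patterns (a hypothetical path), and that the mapper_parsing_exception comes from Elasticsearch receiving the raw `timestamp` string in a date-mapped field. The date filter parses that string and then drops it; `target` can name any field you want the parsed time to land in.

```
filter {
  grok {
    # Hypothetical directory; it should hold a file containing the NGINXACCESS line above
    patterns_dir => ["/etc/logstash/patterns"]
    match => { "message" => "%{NGINXACCESS}" }
  }
  date {
    # HTTPDATE values look like 26/May/2017:21:54:26 +0000
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
    # "@timestamp" is the default target; replace it with another field name if required
    target => "@timestamp"
    # Remove the raw string after a successful parse so it is not indexed as-is
    remove_field => ["timestamp"]
  }
}
```

If the date filter cannot parse the value, it tags the event with `_dateparsefailure` and leaves the `timestamp` field in place, which makes mismatches easy to spot.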
I suggest using the Grok Debugger to develop and debug grok patterns. It lets you build and test them incrementally.
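The same incremental approach can also be tried locally with a throwaway pipeline that reads lines from stdin and prints the parsed event. This is only a sketch and not part of the original suggestion: the file name test.conf is hypothetical, and the pattern is deliberately just the leading part of the full one, to be extended one element at a time.

```
# test.conf (hypothetical file name); run with: bin/logstash -f test.conf
input {
  stdin { }
}
filter {
  grok {
    # Start with the first few elements, confirm they match, then append the next one
    match => { "message" => "- - %{USERNAME:username} \[%{HTTPDATE:timestamp}\]" }
  }
}
output {
  # Prints each event in a readable form; a "_grokparsefailure" tag means the pattern did not match
  stdout { codec => rubydebug }
}
```

Pasting the sample log line into the terminal should then show which fields were extracted at each step.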