Nginx grok pattern for Logstash


Below is my Nginx log format:

log_format timed_combined '$http_x_forwarded_for - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" ' '$request_time $upstream_response_time $pipe';

Below is a sample Nginx log entry (for reference):

- - test.user [26/May/2017:21:54:26 +0000] "POST /elasticsearch/_msearch HTTP/1.1" 200 263 "https://myserver.com/app/kibana" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 0.020 0.008 .

Below are my Logstash grok patterns:

NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip} - - \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time} %{NUMBER:upstream_time}

Error found in the Logstash log:

"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"26/May/2017:19:28:14 -0400\" is malformed at \"/May/2017:19:28:14 -0400\"

Issue: Nginx logs are not being parsed by grok.
Requirement: The timestamp should be extracted into its own field.

What is wrong with my configuration, and how do I fix this error?

nginx logstash elastic-stack logstash-grok logstash-configuration
2 Answers

2 votes

Here are patterns for the NGINX access.log and error.log files.

filter {

############################# NGINX ##############################
  if [event][module] == "nginx" {

########## access.log ##########
    if [fileset][name] == "access" {
      grok {
        match => { "message" => ["%{IPORHOST:ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] \"%{WORD:http_method} %{DATA:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent_bytes} \"%{DATA:referrer}\" \"%{DATA:agent}\""] }
        remove_field => "message"
      }
      date {
        match => ["time", "dd/MMM/YYYY:HH:mm:ss Z"]
        target => "@timestamp"
        remove_field => "time"
      }
      useragent {
        source => "agent"
        target => "user_agent"
        remove_field => "agent"
      }
      geoip {
        source => "ip"
        target => "geoip"
      }
    }

########## error.log ##########
    else if [fileset][name] == "error" {
      grok {
        match => { "message" => ["%{DATA:time} \[%{DATA:log_level}\] %{NUMBER:pid}#%{NUMBER:tid}: (\*%{NUMBER:connection_id} )?%{GREEDYDATA:messageTmp}"] }
        remove_field => "message"
      }
      date {
        match => ["time", "YYYY/MM/dd HH:mm:ss"]
        target => "@timestamp"
        remove_field => "time"
      }

      mutate {
        rename => {"messageTmp" => "message"}
      }
    }

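    # remove_field only runs when a filter succeeds, and a grok block with
    # no match never does, so drop [event] in the mutate filter instead.
    mutate {
      remove_field => ["[event]"]
      add_field => {"serviceName" => "nginx"}
    }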
  }
}
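The date filter in the access.log branch is what turns the raw Nginx time string into a real timestamp. The mapping error in the question most likely arises because the unparsed string reached Elasticsearch as-is. As a rough illustration outside Logstash, the sketch below parses the same string in Python; the format string is a strptime approximation of the Joda-style pattern `dd/MMM/YYYY:HH:mm:ss Z`, and `parse_nginx_time` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# strptime approximation of the Joda-style "dd/MMM/YYYY:HH:mm:ss Z" pattern.
# Note: %b matches English month abbreviations only under an English/C locale.
NGINX_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

def parse_nginx_time(value):
    """Parse an Nginx $time_local string into a timezone-aware datetime."""
    return datetime.strptime(value, NGINX_TIME_FORMAT)

# The timestamp from the error message in the question.
local = parse_nginx_time("26/May/2017:19:28:14 -0400")

# Normalise to UTC, which is roughly what ends up in @timestamp.
utc = local.astimezone(timezone.utc)
print(utc.isoformat())
```

Once the string is parsed like this, the value stored in the event is an ISO 8601 timestamp rather than the slash-separated Nginx form that Elasticsearch rejected.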

The same approach also works for Tomcat: https://gist.github.com/petrov9/4740c61459a5dcedcef2f27c7c2900fd


1 vote

The log line you provided does not match the default NGINXACCESS grok pattern because of two differences:

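  1. The pattern expects an IP address or hostname as the first element of the log line, but in your log line the first element is a dash (-).
  2. The third element of your log line is a username, but the grok pattern expects a dash (-) there.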
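The two differences above can be reproduced with plain regular expressions. This is only a sketch: `ORIGINAL_PREFIX` and `ADJUSTED_PREFIX` are hypothetical names, and the character classes are simplified stand-ins for grok's IPORHOST and USERNAME, not their real definitions.

```python
import re

# Start of the sample log line from the question.
LINE_START = "- - test.user [26/May/2017:21:54:26 +0000]"

# Shape the original pattern expects: <ip-or-host> - - [timestamp
# (simplified stand-in for %{IPORHOST:clientip} - - \[).
ORIGINAL_PREFIX = re.compile(r"(?P<clientip>[A-Za-z0-9][A-Za-z0-9.:-]*) - - \[")

# Shape the log line actually has: - - <username> [timestamp
# (simplified stand-in for - - %{USERNAME:username} \[).
ADJUSTED_PREFIX = re.compile(r"- - (?P<username>[A-Za-z0-9._-]+) \[")

original_match = ORIGINAL_PREFIX.search(LINE_START)
adjusted_match = ADJUSTED_PREFIX.search(LINE_START)

print("original prefix matches:", original_match is not None)
print("adjusted prefix matches:", adjusted_match is not None)
```

The original shape finds no host followed by two dashes, while the adjusted shape captures the username after them.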

So there are two ways to fix this:

  1. Make sure your log lines match the default pattern.
  2. Change the grok pattern to something like the following:

NGINXACCESS - - %{USERNAME:username} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time} %{NUMBER:upstream_time}
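To sanity-check the adjusted pattern above against the sample log line without running Logstash, a rough Python approximation can help. The regex below is hand-written: each sub-expression is a simplified assumption standing in for grok's USERNAME, HTTPDATE, WORD, NOTSPACE, NUMBER and QS. Unlike QS, it does not keep the surrounding quotes in the captured value. `parse_line` is a hypothetical helper.

```python
import re

# Hand-written approximation of the adjusted NGINXACCESS pattern.
# These sub-expressions are assumptions, not the real grok definitions.
NGINX_ACCESS = re.compile(
    r'- - (?P<username>[a-zA-Z0-9._-]+) '
    r'\[(?P<timestamp>\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\] '
    r'"(?P<verb>\w+) (?P<request>\S+)(?: HTTP/(?P<httpversion>[\d.]+))?" '
    r'(?P<response>\d+) (?:(?P<bytes>\d+)|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)" '
    r'(?P<request_time>[\d.]+) (?P<upstream_time>[\d.]+)'
)

# Sample log entry from the question.
SAMPLE = (
    '- - test.user [26/May/2017:21:54:26 +0000] '
    '"POST /elasticsearch/_msearch HTTP/1.1" 200 263 '
    '"https://myserver.com/app/kibana" '
    '"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 0.020 0.008 .'
)

def parse_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = NGINX_ACCESS.search(line)
    return match.groupdict() if match else None

fields = parse_line(SAMPLE)
if fields is not None:
    for name, value in fields.items():
        print(f"{name}: {value}")
```

Like grok, the search is unanchored, so the trailing `$pipe` value (`.`) at the end of the line is simply ignored.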

I recommend using the Grok Debugger to develop and debug grok patterns. It lets you build and test them incrementally.
