Fluent Bit does not concatenate stack-trace logs for all pods running on a K8s node; it works for only a single pod on the node


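We have been trying to configure Fluent Bit for log aggregation on a k8s cluster, using the newRelic bundle Helm charts. The newRelic bundle creates one pod on each node of the cluster and processes logs according to the defined configuration. Everything seems to work fine except for stack-trace concatenation. The problem is as follows: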

We run 4 pods on a single node, and at runtime they create the following log files under the "/var/log/containers" directory.

myapp-svc1-<pod-id>.log
myapp-svc2-<pod-id>.log
myapp-svc3-<pod-id>.log
myapp-svc4-<pod-id>.log

Here is the configuration:

  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     ${LOG_LEVEL}
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020
    
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              ${PATH}
        Parser            ${LOG_PARSER}
        DB                ${FB_DB}
        Mem_Buf_Limit     7MB
        Skip_Long_Lines   On
        Refresh_Interval  10

    [FILTER]
        Name                      multiline
        Match                     *
        multiline.key_content     log
        multiline.parser          multiline-regex-error-trace

    [FILTER]
        Name           kubernetes
        Match          kube.*
        # We need the full DNS suffix as Windows only supports resolving names with this suffix
        # See: https://kubernetes.io/docs/setup/production-environment/windows/intro-windows-in-kubernetes/#dns-limitations
        Kube_URL       https://kubernetes.default.svc.cluster.local:443
        Buffer_Size    ${K8S_BUFFER_SIZE}
        K8S-Logging.Exclude ${K8S_LOGGING_EXCLUDE}
    
    [FILTER]
        Name           record_modifier
        Match          *
        Record         cluster_name ${CLUSTER_NAME}
        Allowlist_key  container_name
        Allowlist_key  namespace_name
        Allowlist_key  pod_name
        Allowlist_key  stream
        Allowlist_key  message
        Allowlist_key  log
        Allowlist_key  kubernetes
   
    [OUTPUT]
        Name           newrelic
        Match          *
        licenseKey     ${LICENSE_KEY}
        endpoint       ${ENDPOINT}
        lowDataMode    ${LOW_DATA_MODE}

Here is the parser configuration we use:

  parsers.conf: |

    [MULTILINE_PARSER]
        name          multiline-regex-error-trace
        type          regex
        flush_timeout 1000
        rule      "start_state"   "/([0-9]{2,4}\-[0-9]{1,2}\-[0-9]{1,2} [0-9]{1,2}\:[0-9]{1,2}\:[0-9]{1,2}\,[0-9]{2,4}) (.*)/"    "stacktraceline2"
        rule      "stacktraceline2"          "/^([a-z]{1,10})\.(.*)/"                                                              "stacktraceline3"
        rule      "stacktraceline3"          "/^\s+at.*/"                                                                          "stacktraceline3"
    
    [PARSER]
        Name         docker
        Format       json
        Time_Key     time
        Time_Format  %Y-%m-%dT%H:%M:%S.%L
        Time_Keep    On
    
    [PARSER]
        Name cri
        Format regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
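
One detail that may be worth checking is which record key ends up holding the log text. The post does not show what `${LOG_PARSER}` resolves to. If it is the `cri` parser above, its regex stores the text under `message`, whereas the multiline filter is configured with `multiline.key_content log`. Docker's json-file format, by contrast, uses a `log` key. The sketch below runs the `cri` regex in Python on a hypothetical CRI-formatted line (the timestamp and the `F` tag are made up for illustration) to show the resulting keys; it is only a model of the parser, not Fluent Bit itself.

```python
import re

# The Regex from the [PARSER] "cri" section, with the (?<name>...) groups
# rewritten as Python's (?P<name>...); the pattern is otherwise unchanged.
CRI = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<message>.*)$"
)

# Hypothetical CRI-formatted line, not taken from the post.
line = "2022-09-02T18:46:53.206000000Z stdout F java.lang.Exception: Custom error"

record = CRI.match(line).groupdict()
print(sorted(record))     # the keys a record parsed this way would carry
print(record["message"])  # the log text lands under "message", not "log"
```

If the records really carry `message` rather than `log`, the key named in `multiline.key_content` would need to match it; whether that applies here depends on the value of `${LOG_PARSER}`.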

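If we set the "Path" key under the "INPUT" section to "/var/log/containers/*.log", Fluent Bit collects all logs from that directory and processes them, and that part works perfectly. In this case, however, the custom multiline parser "multiline-regex-error-trace" does not seem to be applied to every pod's logs. Stack traces are concatenated for only one of the pods; for all the remaining pods, each stack-trace line is pushed as a separate record.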

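To make the parser work for a particular pod, we have to define the path as "/var/log/containers/myapp-svc1-.log" or "/var/log/containers/myapp-svc2-.log", i.e. depending on the pod name.
That is not the configuration we want, though: a pod-specific path restricts log collection to that one pod on the node, and we need the logs of all pods.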
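
As background for the wildcard case: according to the Fluent Bit tail documentation, an asterisk in `Tag` is replaced with the absolute path of the monitored file, with slashes turned into dots. The following Python sketch models that expansion for the four files. The pod ids are hypothetical (the post only shows the placeholder `<pod-id>`), and `expand_tag` is an illustrative helper, not a Fluent Bit API.

```python
def expand_tag(tag_pattern: str, path: str) -> str:
    """Model of tail's tag expansion: '*' becomes the file path with '/' turned into '.'."""
    dotted = path.lstrip("/").replace("/", ".")
    return tag_pattern.replace("*", dotted, 1)


# Hypothetical pod ids standing in for <pod-id>.
paths = [f"/var/log/containers/myapp-svc{i}-pod{i}.log" for i in range(1, 5)]
tags = [expand_tag("kube.*", p) for p in paths]

for tag in tags:
    print(tag)
```

Under this model each file's records carry their own tag, while the filters with `Match *` or `Match kube.*` still receive the records of all four pods.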

Here is a sample of the log file:

2022-09-02 18:46:53,206 ERROR 5d9073b1-9f90-42c1-b7a2-d5a6c13f2669 [http-nio-9002-exec-3] i.f.m.c.c.ContentController: Exception recevied from the service
    java.lang.Exception: Custom error
        at myapp.content.controller.ContentController.throwError(ContentController.java:46)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.base/java.lang.reflect.Method.invoke(Unknown Source)
        at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:197)
        at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:141)
        at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:894)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
        at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1060)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:962)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:626)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at myapp.content.filter.RequestIdAddingFilter.doFilterInternal(RequestIdAddingFilter.java:51)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:93)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:542)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
        at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:764)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:346)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:887)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1684)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.base/java.lang.Thread.run(Unknown Source)

We also tried Fluent Bit's built-in multiline parser "java", but it has a different problem. Our stack traces contain 3 types of log lines.

**Line Type 1:**   2022-09-02 18:46:53,206 ERROR 5d9073b1-9f90-42c1-b7a2-d5a6c13f2669 [http-nio-9002-exec-3] i.f.m.c.c.ContentController: Exception recevied from the service

**Line Type 2:**   java.lang.Exception: Custom error

**Line Type 3:**

    at myapp.content.controller.ContentController.throwError(ContentController.java:46)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
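
For comparison, the three rules of the custom "multiline-regex-error-trace" parser can be checked against these line types offline. The sketch below is a simplified Python model of a regex state machine (a line either matches the rule for the expected state and joins the open group, or it closes the group). It is an assumption about how the rules chain and is not Fluent Bit's actual implementation. `group_lines` and `SAMPLE` are illustrative names.

```python
import re

# Rules copied from the [MULTILINE_PARSER] section: state -> (regex, next state).
RULES = {
    "start_state": (
        r"([0-9]{2,4}\-[0-9]{1,2}\-[0-9]{1,2} [0-9]{1,2}\:[0-9]{1,2}\:[0-9]{1,2}\,[0-9]{2,4}) (.*)",
        "stacktraceline2",
    ),
    "stacktraceline2": (r"^([a-z]{1,10})\.(.*)", "stacktraceline3"),
    "stacktraceline3": (r"^\s+at.*", "stacktraceline3"),
}


def group_lines(lines):
    """Group lines into records using the simplified state-machine model."""
    groups, current, expected = [], [], None
    for line in lines:
        # The line continues the open group if it matches the expected state's rule.
        if expected and re.search(RULES[expected][0], line):
            current.append(line)
            expected = RULES[expected][1]
            continue
        # Otherwise close the open group and start a new one with this line.
        if current:
            groups.append(current)
        current = [line]
        # Only a start_state match opens a group that can take continuation lines.
        starts = re.search(RULES["start_state"][0], line)
        expected = RULES["start_state"][1] if starts else None
    if current:
        groups.append(current)
    return groups


SAMPLE = [
    "2022-09-02 18:46:53,206 ERROR 5d9073b1-9f90-42c1-b7a2-d5a6c13f2669 "
    "[http-nio-9002-exec-3] i.f.m.c.c.ContentController: Exception recevied from the service",
    "java.lang.Exception: Custom error",
    "    at myapp.content.controller.ContentController.throwError(ContentController.java:46)",
    "    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)",
]

print(len(group_lines(SAMPLE)))  # number of records with an unindented exception line

# Same lines, but with the exception line indented as in the log sample above.
INDENTED = [SAMPLE[0], "    " + SAMPLE[1]] + SAMPLE[2:]
print(len(group_lines(INDENTED)))  # number of records when rule 2 cannot match
```

In this model the unindented variant yields one combined record, while the indented variant yields one record per line, because the "stacktraceline2" rule is anchored with `^` directly before `[a-z]`. Whether the real log lines carry that leading whitespace cannot be told from the post.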

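The built-in "java" parser seems to combine only the last two types, "Line Type 2" and "Line Type 3", and leaves out the first line. The newrelic bundle uses Fluent Bit 1.9.4 internally, which is the latest version as of today. To reproduce the problem, run 2 pods on a k8s node with Fluent Bit 1.9.4 and check whether the stack traces are concatenated for each pod. Make sure the log sample looks like the one above. Can someone help us with this? I have been banging my head against the wall for the past 7 days.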

kubernetes newrelic fluent-bit
2 Answers
0
votes

Did you get this solved? I have a similar problem, and I have updated Fluent Bit to the latest version.


-1
votes

Is it in the wrong section? In the docs, `multiline.parser` is added to the `[INPUT]` section rather than the `[FILTER]` section. Something like:

    [INPUT]
        Name              tail
        Tag               kube.*
        Path              ${PATH}
        Parser            ${LOG_PARSER}
        DB                ${FB_DB}
        Mem_Buf_Limit     7MB
        Skip_Long_Lines   On
        Refresh_Interval  10
        multiline.parser  multiline-regex-error-trace