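I have a huge application-specific log file that is easy to parse line by line. There are two types of log lines that I want to tail with fluent-bit and extract for further processing in a time-series database, Elastic, etc.: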
2023-03-25 00:33:17 < 43084:X@540435747125056> RECEIVED /req:10.45.3.24(10.45.3.24):user9458:service8457:
2023-03-25 00:33:17 < 43084:X@540435747125056> DURATION TO RESPONSE 15.178 ms::user9458:service8457:
The majority of lines, like the following one, are of no interest for further processing at this point and should be omitted:
2023-03-25 00:33:17 < 43084:X@540435747125056> <X: > request_details_of_no_interest
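To start with, I just want to dump the output into two separate files, and I expect the following results for the two sample lines:
incoming: [1679700797.000000000, {"ip":"10.45.3.24","user":"user9458","service":"service8457"}]
done: [1679700797.000000000, {"duration_ms":"15.178","user":"user9458","service":"service8457"}]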
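Ideally, this is done in a stacked fashion, handling each of the two (or more) line types with its own, simpler regex. So, if possible, not with one "more complex" regex that matches both line types at once and provides all capture groups as keys.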
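To make the "stacked" idea concrete outside of fluent-bit, here is a minimal Python sketch: a list of simple patterns is tried in order, the first match decides the category, and lines that match nothing are dropped. The category names, the `classify` helper and the simplified patterns are assumptions for illustration only. Python's `re` module uses `(?P<name>...)` for named groups and is not the regex engine fluent-bit uses.

```python
import re

# Hypothetical stack of (category, pattern) pairs, tried top to bottom.
# Each pattern only has to recognise its own line type.
STACK = [
    ("incoming", re.compile(
        r"RECEIVED .*\((?P<ip>\d+\.\d+\.\d+\.\d+)\):(?P<user>[^:]+):(?P<service>[^:]+):")),
    ("done", re.compile(
        r"DURATION TO RESPONSE\s+(?P<duration_ms>[\d.]+) ms:.*?:(?P<user>[^:]+):(?P<service>[^:]+):")),
]


def classify(line):
    """Return (category, fields) for the first matching pattern, else None."""
    for category, pattern in STACK:
        match = pattern.search(line)
        if match:
            return category, match.groupdict()
    return None  # line is of no interest and would be omitted


if __name__ == "__main__":
    samples = [
        "2023-03-25 00:33:17 < 43084:X@540435747125056> RECEIVED /req:10.45.3.24(10.45.3.24):user9458:service8457:",
        "2023-03-25 00:33:17 < 43084:X@540435747125056> DURATION TO RESPONSE 15.178 ms::user9458:service8457:",
        "2023-03-25 00:33:17 < 43084:X@540435747125056> <X: > request_details_of_no_interest",
    ]
    for sample in samples:
        print(classify(sample))
```

Adding a third line type would then only mean appending one more entry to `STACK`, which is the property I am after in the fluent-bit configuration as well.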
Additional requirement: the tailed log file /opt/app/logs/http.log is rotated daily by moving it to the subfolder /opt/app/logs/archived and then gzipping it, while a new http.log is created.
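For clarity, the rotation scheme described above behaves roughly like the following Python sketch. The `rotate` function, the date-stamp suffix on the archived file and the exact order of "create new file" versus "gzip" are assumptions; the real rotation is done by the application, not by this code.

```python
import gzip
import shutil
from pathlib import Path


def rotate(log_dir, stamp, name="http.log"):
    """Move the live log into archived/, gzip it there, start a new empty log."""
    log_dir = Path(log_dir)
    live = log_dir / name
    archive_dir = log_dir / "archived"
    archive_dir.mkdir(exist_ok=True)

    # Step 1: move the file out of the tailed folder (archive name is assumed).
    moved = archive_dir / f"{name}.{stamp}"
    shutil.move(str(live), str(moved))

    # Step 2: a fresh, empty http.log appears at the original path.
    live.touch()

    # Step 3: compress the archived copy and remove the uncompressed file.
    compressed = moved.with_name(moved.name + ".gz")
    with open(moved, "rb") as src, gzip.open(compressed, "wb") as dst:
        shutil.copyfileobj(src, dst)
    moved.unlink()
    return compressed
```

The point relevant to tailing is that the path http.log stays the same while the underlying file is replaced once a day.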
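So far, I either get timestamp + log message in both output files, based on the parser "my_basic" below, or I succeed for category 1 but get no output at all for category 2 :-(
Please find my patterns.conf: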
[PARSER]
Name my_basic
Format regex
Regex ^(?<time>\d+-\d+-\d+ \d+:\d+:\d+)\ (?<message>.*)$
Time_Key time
Time_Format %Y-%m-%d %H:%M:%S
[PARSER]
Name my_incoming
Format regex
Regex ^(?<time>\d+-\d+-\d+ \d+:\d+:\d+)\ .*\((?<ip>\d+\.\d+\.\d+\.\d+)\):(?<user>[^:]+):(?<service>[^:]+):.*$
Time_Key time
Time_Format %Y-%m-%d %H:%M:%S
[PARSER]
Name my_done
Format regex
Regex ^(?<time>\d+-\d+-\d+\ \d+:\d+:\d+)\ .*DURATION\ TO\ RESPONSE\s+(?<duration>[\d\.]+)\ ms:.*:(?<user>[^:]+):(?<service>[^:]+):.*$
Time_Key time
Time_Format %Y-%m-%d %H:%M:%S
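As a sanity check of the patterns themselves, independent of the fluent-bit pipeline, the regexes of my_incoming and my_done can be tried against the sample lines in Python. This is only an approximation: the `to_python` helper is hypothetical and merely rewrites `(?<name>` into Python's `(?P<name>`, and Python's `re` is a different engine from the one fluent-bit uses. The UTC+1 offset used for the epoch value is an assumption inferred from the expected output above.

```python
import re
from datetime import datetime, timedelta, timezone

MY_INCOMING = r"^(?<time>\d+-\d+-\d+ \d+:\d+:\d+)\ .*\((?<ip>\d+\.\d+\.\d+\.\d+)\):(?<user>[^:]+):(?<service>[^:]+):.*$"
MY_DONE = r"^(?<time>\d+-\d+-\d+\ \d+:\d+:\d+)\ .*DURATION\ TO\ RESPONSE\s+(?<duration>[\d\.]+)\ ms:.*:(?<user>[^:]+):(?<service>[^:]+):.*$"
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # same value as Time_Format in the parsers


def to_python(regex):
    """Rewrite (?<name>...) named groups into Python's (?P<name>...) syntax."""
    return re.sub(r"\(\?<([A-Za-z_]\w*)>", r"(?P<\1>", regex)


def parse(regex, line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = re.match(to_python(regex), line)
    return match.groupdict() if match else None


def to_epoch(time_text, utc_offset_hours=1):
    """Convert the captured time to epoch seconds (offset is an assumption)."""
    local = datetime.strptime(time_text, TIME_FORMAT)
    tz = timezone(timedelta(hours=utc_offset_hours))
    return local.replace(tzinfo=tz).timestamp()


if __name__ == "__main__":
    incoming = "2023-03-25 00:33:17 < 43084:X@540435747125056> RECEIVED /req:10.45.3.24(10.45.3.24):user9458:service8457:"
    done = "2023-03-25 00:33:17 < 43084:X@540435747125056> DURATION TO RESPONSE 15.178 ms::user9458:service8457:"
    print(parse(MY_INCOMING, incoming))
    print(parse(MY_DONE, done))
    print(parse(MY_INCOMING, done))  # None: my_incoming does not match the DURATION line
    print(to_epoch("2023-03-25 00:33:17"))
```

In this check each pattern matches its own sample line and rejects the other one, so each regex on its own extracts the fields I expect.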
And the relevant part of fluent-bit.conf:
[INPUT]
name tail
path /opt/app/logs/http.log*
Tag node1.market1.app
#Parser my_basic
Parser my_incoming
DB /opt/app/logs/file_status.db
# Read interval (sec) Default: 1
Refresh_Interval 10
# I would like to know if such a grep section would help in terms of performance or consuming less resources, but currently I have more basic problems ...
#[FILTER]
# name grep
# match *
# regex message RECEIVED|DURATION\ TO\ RESPONSE
[FILTER]
Name rewrite_tag
Match *.app
#Rule $message RECEIVED my_incoming true
Rule $ip \d+\.\d+\.\d+\.\d+ my_incoming true
Emitter_Name my_incoming
# Evidence of one of my dozen failed attempts so far
#[FILTER]
# Name Parser
# match my_incoming
# Key_Name message
# Parser my_incoming
[FILTER]
Name rewrite_tag
match *.app
Rule $ip ^$ my_done false
Emitter_Name my_done
[FILTER]
Name Parser
match my_done
Key_Name duration
Parser my_done
[OUTPUT]
name file
match my_incoming
path /opt/app/logs/extracted_incoming_reqs
[OUTPUT]
name file
match my_done
path /opt/app/logs/extracted_incoming_done