Awk is producing a number that does not exist in the log file I am working with


I am developing a log analyzer that should grab the last two hours of logs and put them into a file at run time.

I found the following example in a different Stack Exchange answer, which dealt with this log type:

172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET

The example used to handle this particular log type was:

# this variable you could customize, important is convert to seconds. 
# e.g 5days=$((5*24*3600))
x=$((5*60))   #here we take 5 mins as example

# this line get the timestamp in seconds of last line of your logfile
last=$(tail -n1 logFile|awk -F'[][]' '{ gsub(/\//," ",$2); sub(/:/," ",$2); "date +%s -d \""$2"\""|getline d; print d;}' )

#this awk will give you lines you needs:
awk -F'[][]' -v last=$last -v x=$x '{ gsub(/\//," ",$2); sub(/:/," ",$2); "date +%s -d \""$2"\""|getline d; if (last-d<=x)print $0 }' logFile  

I then modified it to fit my time format, which is:

10:51:53.762450 IP 151.101.193.69.https > term-IdeaPad-Flex.52876: Flags [.], ack 507, win 324, length 0

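After some manipulation I removed the awk code that, as far as I understood, was unnecessary, such as the handling of [] and /, and I changed $2 to $1 when I noticed that the output was

date: invalid date ‘IP’

The problem is that when I run the code, I get the following output: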

awk: warning: escape sequence `\.' treated as plain `.'
pass
1651417024
awk: warning: escape sequence `\.' treated as plain `.'

This is with the sample log snippet, but I cannot figure out where this number comes from. The code I am using is:

#!/bin/bash

truncate -s 0 twoHour.log

# this variable you could customize, important is convert to seconds. 
# e.g 5days=$((5*24*3600))
x=$((2*60*60))   # two hours, expressed in seconds

# this line get the timestamp in seconds of last line of your logfile
#for the awk part, it outputs the date in seconds
last=$(tail -n1 tcpsample.log|awk -F'\.' '{ "date +%s"|getline d; print d;}' )
echo "pass"
echo $last

#this awk will give you lines you needs:
awk -F'\.' -v last=$last -v x=$x '{ "date +%s -d \""$1"\""|getline d; if (last-d<=x)print $0 }' twoHour.log

The sample log snippet is:

10:51:50.896232 IP 104.16.42.63.https > term-IdeaPad-Flex.35130: Flags [S.], seq 3706520983, ack 1064868235, win 65535, options [mss 1338,nop,nop,sackOK,nop,wscale 10], length 0
10:51:50.896364 IP term-IdeaPad-Flex.35130 > 104.16.42.63.https: Flags [.], ack 1, win 502, length 0
10:51:50.897189 IP term-IdeaPad-Flex.35130 > 104.16.42.63.https: Flags [P.], seq 1:518, ack 1, win 502, length 517
10:51:50.903428 IP 104.16.42.63.https > term-IdeaPad-Flex.35130: Flags [R.], seq 1, ack 518, win 502, length 0
10:51:51.094584 IP term-IdeaPad-Flex.53527 > 192.168.48.7.domain: 5779+ [1au] PTR? 114.201.96.52.in-addr.arpa. (55)
10:51:51.305306 IP 192.168.48.7.domain > term-IdeaPad-Flex.53527: 5779 NXDomain 0/1/1 (141)
10:51:51.305669 IP term-IdeaPad-Flex.53527 > 192.168.48.7.domain: 5779+ PTR? 114.201.96.52.in-addr.arpa. (44)
10:51:51.309114 IP 192.168.48.7.domain > term-IdeaPad-Flex.53527: 5779 NXDomain 0/1/0 (130)
10:51:51.309982 IP term-IdeaPad-Flex.56047 > 192.168.48.7.domain: 20119+ [1au] PTR? 18.242.247.162.in-addr.arpa. (56)
10:51:51.510273 IP 192.168.48.7.domain > term-IdeaPad-Flex.56047: 20119 1/0/1 PTR bam-6.nr-data.net. (87)
10:51:51.919820 IP 162.159.134.234.https > term-IdeaPad-Flex.39586: Flags [P.], seq 1822:1877, ack 1, win 91, length 55
10:51:51.919863 IP term-IdeaPad-Flex.39586 > 162.159.134.234.https: Flags [.], ack 1877, win 6245, length 0
10:51:52.738739 IP 52.96.79.82.https > term-IdeaPad-Flex.36786: Flags [P.], seq 2962633948:2962633985, ack 1033019835, win 16385, length 37
10:51:52.738783 IP term-IdeaPad-Flex.36786 > 52.96.79.82.https: Flags [.], ack 37, win 564, length 0
10:51:53.142449 IP term-IdeaPad-Flex.58558 > 192.168.48.7.domain: 50837+ [1au] PTR? 82.79.96.52.in-addr.arpa. (53)
10:51:53.147047 IP 162.159.134.234.https > term-IdeaPad-Flex.39586: Flags [P.], seq 1877:1921, ack 1, win 91, length 44
10:51:53.147091 IP term-IdeaPad-Flex.39586 > 162.159.134.234.https: Flags [.], ack 1921, win 6245, length 0
10:51:53.353085 IP 192.168.48.7.domain > term-IdeaPad-Flex.58558: 50837 NXDomain 0/1/1 (139)
10:51:53.353282 IP term-IdeaPad-Flex.58558 > 192.168.48.7.domain: 50837+ PTR? 82.79.96.52.in-addr.arpa. (42)
10:51:53.360629 IP 192.168.48.7.domain > term-IdeaPad-Flex.58558: 50837 NXDomain 0/1/0 (128)
10:51:53.482516 IP term-IdeaPad-Flex.42192 > aeab55d76dd13c9bb.awsglobalaccelerator.com.https: Flags [P.], seq 1:97, ack 194, win 501, length 96
10:51:53.557493 IP aeab55d76dd13c9bb.awsglobalaccelerator.com.https > term-IdeaPad-Flex.42192: Flags [P.], seq 194:236, ack 97, win 517, length 42
10:51:53.557541 IP term-IdeaPad-Flex.42192 > aeab55d76dd13c9bb.awsglobalaccelerator.com.https: Flags [.], ack 236, win 501, length 0
10:51:53.557559 IP aeab55d76dd13c9bb.awsglobalaccelerator.com.https > term-IdeaPad-Flex.42192: Flags [P.], seq 236:300, ack 97, win 517, length 64
10:51:53.557569 IP term-IdeaPad-Flex.42192 > aeab55d76dd13c9bb.awsglobalaccelerator.com.https: Flags [.], ack 300, win 501, length 0
10:51:53.610272 IP term-IdeaPad-Flex.42192 > aeab55d76dd13c9bb.awsglobalaccelerator.com.https: Flags [P.], seq 97:178, ack 300, win 501, length 81
10:51:53.659868 IP aeab55d76dd13c9bb.awsglobalaccelerator.com.https > term-IdeaPad-Flex.42192: Flags [P.], seq 300:374, ack 178, win 517, length 74
10:51:53.659916 IP term-IdeaPad-Flex.42192 > aeab55d76dd13c9bb.awsglobalaccelerator.com.https: Flags [.], ack 374, win 501, length 0
10:51:53.707200 IP term-IdeaPad-Flex.52876 > 151.101.193.69.https: Flags [P.], seq 2882631170:2882631322, ack 3568058341, win 3785, length 152
10:51:53.707257 IP term-IdeaPad-Flex.52876 > 151.101.193.69.https: Flags [P.], seq 152:507, ack 1, win 3785, length 355
10:51:53.762415 IP 151.101.193.69.https > term-IdeaPad-Flex.52876: Flags [.], ack 152, win 321, length 0
10:51:53.762450 IP 151.101.193.69.https > term-IdeaPad-Flex.52876: Flags [.], ack 507, win 324, length 0

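If I could figure out how it is creating that number, I could try to get the script working, but so far my best guess is that it is somehow converting the name of the device into a number, even though it should only be working on the 10:51:53.762450 part. The overall goal is to go back two hours (expressed in seconds) from the end of the log and save those lines to a file.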

bash awk scripting
1 Answer

If you have tac and GNU date, you can use awk string comparison, with some simple logic to handle crossing midnight:

tac logfile |
awk \
    -v now=$(date +%T                     ) \
    -v end=$(date +%T -d 'now 2 hours ago') \
'
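    # Input is newest-first: stop at the first line that falls outside the window.
    # When the window crosses midnight (end > now), that means now < time < end.
    end<now ? $0<end : ($0>now && $0<end) { exit }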
    { print }
' |
tac
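
The comparison above works on strings, not numbers. It relies on the timestamps being zero-padded and fixed-width (HH:MM:SS), so that lexicographic order matches chronological order within a single day. A minimal sketch to check that behaviour; the helper name `is_earlier` is hypothetical and not part of the answer:

```shell
#!/bin/bash
# is_earlier A B: prints "yes" if string A sorts before string B in awk,
# "no" otherwise. Hypothetical helper, for illustration only.
is_earlier() {
    # Appending "" forces awk to compare the values as strings.
    awk -v a="$1" -v b="$2" 'BEGIN { print ((a "" < b "") ? "yes" : "no") }'
}

# A whole log line compared against a bare HH:MM:SS value: the result is
# decided by the leading timestamp characters.
is_earlier "08:59:59.000001 IP host > other" "09:00:00"    # yes
is_earlier "10:51:53.762450 IP host > other" "09:00:00"    # no
```

Timestamps without zero padding (for example 9:00:00) would not sort correctly this way, because "9" compares greater than "1".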

This cannot look back 24 hours or more.


For example:

$ filterlog(){
    now=$(date +%T -d "now $1")
    end=$(date +%T -d "now $1 2 hours ago")
    tac "$2" | awk 'e<n?$0<e:($0>n&&$0<e){exit}1' n=$now e=$end | tac
}
$ cat in1
11:00:00. b
11:30:00. b
12:00:00. b
12:30:00. b
13:00:00. b
13:30:00. b
$ filterlog 13:45 in1
12:00:00. b
12:30:00. b
13:00:00. b
13:30:00. b
$ cat in2
23:00:00. a
23:30:00. a
00:00:00. b
00:30:00. b
01:00:00. b
01:30:00. b
$ filterlog 01:45 in2
00:00:00. b
00:30:00. b
01:00:00. b
01:30:00. b
$
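
As for the number in the question: the first awk command there calls `date +%s` without `-d`, so it prints the current time as seconds since the epoch, not anything read from the log. With GNU date, 1651417024 decodes to 2022-05-01 14:57:04 UTC, presumably the moment the script was run. The warning comes from `-F'\.'`, and `-F'[.]'` avoids it. The second awk command also appears to read twoHour.log, which the script has just truncated, so it has no input to filter. Below is a minimal sketch of the original idea (a window measured back from the last line of the log) with those points addressed. It assumes zero-padded HH:MM:SS timestamps at the start of every line and a log that spans less than 24 hours; the names `last_window` and `tod` are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: print the lines of FILE whose time of day lies within
# SECONDS of the timestamp on the last line of FILE.
# Assumes each line starts with HH:MM:SS and the file covers under 24 hours.
last_window() {
    local file=$1 window=$2
    awk -v x="$window" '
        # Convert a leading HH:MM:SS to seconds since midnight.
        function tod(s,   p) {
            split(substr(s, 1, 8), p, ":")
            return p[1] * 3600 + p[2] * 60 + p[3]
        }
        # First pass over the file: remember the timestamp of the last line.
        NR == FNR { last = tod($0); next }
        # Second pass: keep lines no older than x seconds.
        {
            age = last - tod($0)
            if (age < 0) age += 86400    # line was logged before midnight
            if (age <= x) print
        }
    ' "$file" "$file"
}

# Usage (file names taken from the question):
#   last_window tcpsample.log $((2*60*60)) > twoHour.log

# Decoding the number from the question (GNU date assumed):
date -u -d @1651417024 '+%F %T'    # 2022-05-01 14:57:04
```

Doing the arithmetic inside awk avoids starting one `date` process per log line, which the original approach would need.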