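Short version of the question: is there a way to make awk's RS split records at every line whose n-th field is empty? (If the line is completely empty, it is just a matter of setting RS="\n\n ...".)
Long version: this is what my log file looks like (note the interleaved sections for **amd64** and **arm64**):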
...
2023-12-29T16:05:20.3032116Z
2023-12-29T16:05:20.3040485Z #10 [linux/arm64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.4084773Z #10 DONE 0.8s
2023-12-29T16:05:20.4085104Z
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5505699Z
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
2023-12-29T16:05:20.6982466Z #12 ...
2023-12-29T16:05:20.6983744Z
2023-12-29T16:05:21.2474882Z #16 [linux/arm64 runner 2/7] RUN addgroup -S -g 1001 nodejs
2023-12-29T16:05:21.3971789Z #16 ...
2023-12-29T16:05:21.3972318Z
...
As you can see, each section ends with a line that contains nothing but the timestamp.
The goal is to print the amd64 and arm64 sections (lines) separately, e.g. (for amd64):
2023-12-29T16:05:20.4085104Z <-- may be present or not in output, no need for an 'overkill' solution to keep/remove it
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5505699Z <-- may be present or not in output, no need for an 'overkill' solution to keep/remove it
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
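The ideal solution would use awk, unless a solution in sed & co. is really less overkill and more "script-like" than awk.
The following solution works (partially), but only if the empty lines in the log contain no fields at all (e.g. no timestamp field):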
awk -vRS='\n\n' -vORS='\n\n' '/amd64 builder/ 1' logfile
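However, as a bonus question: why does this solution print the searched keyword (amd64 in my example) twice in the first section of the output, while the other (subsequent) sections contain the keyword only once (as expected), and how can that be corrected?
Thanks.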
$ awk -v tgt='amd64' 'NF<2{f=""; next} !f{f=($3 ~ ("/"tgt"$"))} f' file
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
2023-12-29T16:05:20.6982466Z #12 ...
$ awk -v tgt='arm64' 'NF<2{f=""; next} !f{f=($3 ~ ("/"tgt"$"))} f' file
2023-12-29T16:05:20.3040485Z #10 [linux/arm64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.4084773Z #10 DONE 0.8s
2023-12-29T16:05:21.2474882Z #16 [linux/arm64 runner 2/7] RUN addgroup -S -g 1001 nodejs
2023-12-29T16:05:21.3971789Z #16 ...
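The one-liner above comes without commentary, so here is an expanded sketch of the same logic. The flag `f` is renamed to `keep`, and the wrapper name `select_arch` and the `sample` variable are hypothetical, introduced only for this illustration. The idea is that a timestamp-only line (`NF<2`) closes the current section, and the first line after it decides whether the section is kept by testing whether its third field ends in `/amd64` (or whatever `tgt` holds).

```shell
#!/bin/sh
# Excerpt of the log from the question, kept in a variable for the demo.
sample='2023-12-29T16:05:20.3032116Z
2023-12-29T16:05:20.3040485Z #10 [linux/arm64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.4084773Z #10 DONE 0.8s
2023-12-29T16:05:20.4085104Z
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5505699Z
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
2023-12-29T16:05:20.6982466Z #12 ...
2023-12-29T16:05:20.6983744Z
2023-12-29T16:05:21.2474882Z #16 [linux/arm64 runner 2/7] RUN addgroup -S -g 1001 nodejs
2023-12-29T16:05:21.3971789Z #16 ...
2023-12-29T16:05:21.3972318Z'

# select_arch is a hypothetical wrapper; it reads the log on standard input.
select_arch() {
    awk -v tgt="$1" '
        NF < 2 { keep = 0; next }               # timestamp-only line: the section is over, skip it
        !keep  { keep = ($3 ~ ("/" tgt "$")) }  # not keeping yet: does field 3 end in "/<tgt>"?
        keep                                    # print the line while the flag is set
    '
}

printf '%s\n' "$sample" | select_arch amd64
```

Because the test looks only at field 3, a mention of the other architecture elsewhere on a line (for example inside a URL) does not switch a section on.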
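I don't know why you thought setting RS to \n\n would work for you. It fails because it has nothing to do with your problem: the separator lines in your log are not empty, they each hold a timestamp, so the input never contains two consecutive newlines for RS to match.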
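A quick way to see this is to count the truly empty lines against the lines that hold a single field. The snippet below is a minimal sketch on a four-line excerpt of the question's log; the variable names `sample` and `counts` are made up for the illustration.

```shell
#!/bin/sh
# Four lines taken from the log in the question: separator, two content lines, separator.
sample='2023-12-29T16:05:20.4085104Z
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5505699Z'

# NF == 0 marks a truly empty line, NF == 1 a line with only the timestamp.
counts=$(printf '%s\n' "$sample" |
    awk 'NF == 0 { empty++ }
         NF == 1 { single++ }
         END     { print "empty:", empty + 0, "single-field:", single + 0 }')

echo "$counts"    # prints: empty: 0 single-field: 2
```

With no empty lines in the input, a separator built from consecutive newlines has nothing to split on, which is why the field count (`NF`) is the more useful test here.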
One awk idea:
awk -v arch="arm64" '                        # assign awk variable "arch" the name of the chip architecture
function print_block() {
    if (block != "" && block ~ arch)         # if "block" is not empty and contains "arch" then ...
        print block                          # ... print the current contents of "block"
    block = ""                               # re-init "block"
}
NF<2 { print_block() }                       # 2nd field missing: see if we need to print the current "block"
NF>1 { block = block (block ? ORS : "") $0 } # 2nd field exists: append line to "block", preceded by ORS unless "block" is empty
END  { print_block() }                       # print the last "block", if any
' logfile
With -v arch="arm64" this generates:
2023-12-29T16:05:20.3040485Z #10 [linux/arm64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.4084773Z #10 DONE 0.8s
2023-12-29T16:05:21.2474882Z #16 [linux/arm64 runner 2/7] RUN addgroup -S -g 1001 nodejs
2023-12-29T16:05:21.3971789Z #16 ...
With -v arch="amd64" this generates:
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
2023-12-29T16:05:20.6982466Z #12 ...
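For completeness, the record-separator idea from the question can be made to work if the timestamp-only lines are first turned into truly empty lines. The sketch below does that with one awk pass and then reads the result in POSIX paragraph mode (`RS` set to the empty string), which separates records at blank lines. The helper name `sections_for` and the `sample` variable are hypothetical, and the match on `"/" arch " "` assumes the architecture always appears as in `[linux/amd64 builder ...]`.

```shell
#!/bin/sh
# Excerpt of the log from the question, kept in a variable for the demo.
sample='2023-12-29T16:05:20.3032116Z
2023-12-29T16:05:20.3040485Z #10 [linux/arm64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.4084773Z #10 DONE 0.8s
2023-12-29T16:05:20.4085104Z
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5505699Z
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
2023-12-29T16:05:20.6982466Z #12 ...
2023-12-29T16:05:20.6983744Z
2023-12-29T16:05:21.2474882Z #16 [linux/arm64 runner 2/7] RUN addgroup -S -g 1001 nodejs
2023-12-29T16:05:21.3971789Z #16 ...
2023-12-29T16:05:21.3972318Z'

# sections_for is a hypothetical helper; it reads the log on standard input.
sections_for() {
    # Pass 1: replace every line with fewer than two fields by an empty line.
    awk 'NF < 2 { print ""; next } { print }' |
    # Pass 2: paragraph mode (RS empty) makes each blank-line-delimited section one record;
    # keep the records that contain "/<arch> " and separate them with a blank line.
    awk -v RS= -v ORS='\n\n' -v arch="$1" 'index($0, "/" arch " ")'
}

printf '%s\n' "$sample" | sections_for amd64
```

Unlike the two approaches above, this variant keeps an empty line between the printed sections, and it costs a second awk process, so it is mainly of interest if the rest of the script already works record by record.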