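Awk: setting RS to include the newline and the first (and only) field of the next line // "splitting" a log file on a custom RS and printing the sections that match a pattern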

Question (votes: 0, answers: 2)

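Short version of the question:

How can RS be set in awk so that records are split at every line whose n-th field is empty? (If the line were completely empty, setting RS="\n\n ..." would be enough.)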

Long version: this is what my log file looks like (note the interleaved sections for amd64 and arm64):

...
2023-12-29T16:05:20.3032116Z 
2023-12-29T16:05:20.3040485Z #10 [linux/arm64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.4084773Z #10 DONE 0.8s
2023-12-29T16:05:20.4085104Z 
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5505699Z 
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
2023-12-29T16:05:20.6982466Z #12 ...
2023-12-29T16:05:20.6983744Z
2023-12-29T16:05:21.2474882Z #16 [linux/arm64 runner 2/7] RUN addgroup -S -g 1001 nodejs
2023-12-29T16:05:21.3971789Z #16 ...
2023-12-29T16:05:21.3972318Z 
...

As can be seen, each section ends with a line that contains nothing except the timestamp.

The goal is to print the sections (lines) for amd64 and for arm64 separately, e.g. (for amd64):

2023-12-29T16:05:20.4085104Z       <-- may be present or not in output, no need for an 'overkill' solution to keep/remove it
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5505699Z       <-- may be present or not in output, no need for an 'overkill' solution to keep/remove it
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz

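An ideal solution would:

  • not be required to use awk, unless a solution in sed & co. would really be overkill and more "script-like"
  • be relatively easy to remember and intuitive to reproduce for recurring, similar use cases
  • not be too specific, i.e. also work with other first (or n-th) fields (not necessarily a timestamp-like formatted field)
  • not use any extra tools besides the main one (e.g. awk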

The following solution works only partially, and only if the "empty" lines of the log contain no fields at all (e.g. no timestamp field):

awk -vRS='\n\n' -vORS='\n\n' '/amd64 builder/ 1'  logfile
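However, as a bonus question: why does this solution print the searched keyword (amd64 in my example) twice in the first section of the output, while the other (subsequent) sections contain the keyword only once, as expected? And how can that be corrected?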

Thanks

bash perl awk text-processing unix-text-processing
2 Answers
1 vote
$ awk -v tgt='amd64' 'NF<2{f=""; next} !f{f=($3 ~ ("/"tgt"$"))} f' file
2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
2023-12-29T16:05:20.6982466Z #12 ...

$ awk -v tgt='arm64' 'NF<2{f=""; next} !f{f=($3 ~ ("/"tgt"$"))} f' file
2023-12-29T16:05:20.3040485Z #10 [linux/arm64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.4084773Z #10 DONE 0.8s
2023-12-29T16:05:21.2474882Z #16 [linux/arm64 runner 2/7] RUN addgroup -S -g 1001 nodejs
2023-12-29T16:05:21.3971789Z #16 ...
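
The one-liner above is dense, so here is the same program spread over several lines with comments. It is only a sketch of how it can be read: the shell function name print_sections is hypothetical and not part of the answer, and the awk logic itself is unchanged.

```shell
# Hypothetical wrapper; the name print_sections is illustrative only.
# $1 = architecture to select (e.g. amd64), $2 = log file
print_sections() {
    awk -v tgt="$1" '
        # Separator line (timestamp only): clear the flag and skip the line.
        NF < 2 { f = ""; next }

        # While the flag is unset, test field 3 of the current line.
        # On a section header $3 looks like "[linux/amd64", hence the
        # "/" prefix and the "$" anchor in the dynamic regex.
        !f { f = ($3 ~ ("/" tgt "$")) }

        # A bare pattern with no action prints the line when f is true.
        f
    ' "$2"
}
```

Calling print_sections amd64 file should behave like the first command shown above. Because the flag is cleared on every line with fewer than two fields, the timestamp-only lines are never printed.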

I don't know why you thought setting RS to \n\n would be useful here. It fails because doing so has nothing to do with your problem.
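
The mismatch can be checked directly. In the log from the question the separator lines still carry a timestamp, so the input never contains two consecutive newlines. A small sketch that counts truly empty lines versus timestamp-only lines follows; the function name count_separators is hypothetical, and default whitespace field splitting is assumed.

```shell
# Hypothetical helper; count_separators is an illustrative name.
# Prints "<empty lines> <one-field lines>" for the file given as $1.
count_separators() {
    awk '
        NF == 0 { empty++ }      # truly empty line: what RS="\n\n" would need
        NF == 1 { ts_only++ }    # one field only, e.g. just the timestamp
        END     { print empty + 0, ts_only + 0 }
    ' "$1"
}
```

If the first number is 0, there is no empty line for RS="\n\n" to split on, and the sections have to be detected by field count (NF) instead.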


0 votes

One awk idea:

awk -v arch="arm64" '                             # assign awk variable "arch" the name of the chip architecture

function print_block() {

    if (block != "" && block ~ arch)              # if awk variable "block" is not empty and also contains "arch" then ...
       print block                                # print current contents of "block"

    block = ""                                    # re-init "block"
}

NF<2 { print_block() }                            # if missing the 2nd field then see if we need to print current contents of "block"
NF>1 { block = block (block ? ORS : "") $0 }      # if 2nd field exists then append to "block"; if block is not empty we append with ORS else if block is empty we append with ""
END  { print_block() }                            # print last "block"?
' logfile

Using -v arch="arm64" produces:

2023-12-29T16:05:20.3040485Z #10 [linux/arm64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.4084773Z #10 DONE 0.8s
2023-12-29T16:05:21.2474882Z #16 [linux/arm64 runner 2/7] RUN addgroup -S -g 1001 nodejs
2023-12-29T16:05:21.3971789Z #16 ...

Using -v arch="amd64" produces:

2023-12-29T16:05:20.4085552Z #11 [linux/amd64 builder 1/8] WORKDIR /app
2023-12-29T16:05:20.5499792Z #11 DONE 0.1s
2023-12-29T16:05:20.5509862Z #12 [linux/amd64 builder 2/8] RUN apk add --no-cache libc6-compat
2023-12-29T16:05:20.5512029Z #12 0.138 fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
2023-12-29T16:05:20.6982466Z #12 ...
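
The question also asked for something not tied to the timestamp layout. One possible way to loosen the script above, offered only as a sketch, is to pass the field count that marks a separator line as a variable. The names print_blocks, emit_block, pat and min are hypothetical. The sketch assumes default whitespace field splitting, under which "the n-th field is empty" simply means NF < n.

```shell
# Hypothetical wrapper around the block-buffering idea above.
# $1 = regex to look for
# $2 = n (a line with fewer than n fields ends a block)
# $3 = file
print_blocks() {
    awk -v pat="$1" -v min="$2" '
        function emit_block() {
            if (block != "" && block ~ pat)   # print only blocks matching pat
                print block
            block = ""                        # start a fresh block
        }
        NF < min { emit_block(); next }       # separator: n-th field is missing
                 { block = block (block != "" ? ORS : "") $0 }
        END      { emit_block() }             # flush a trailing block, if any
    ' "$3"
}
```

For the log in the question, print_blocks 'linux/arm64' 2 logfile should select the same blocks as -v arch="arm64" above. A different n covers logs where more leading fields are always present.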