Hello, I am trying to compare file1 and file2.
This is what I tried:
awk 'NR==FNR {a[$1,$3]=$0; next}
{if(($1,$3) in a)
{print a[$1,$3],$0; delete a[$1,$2]}
else print $0}
END {for(k in a) print a[k]}' file1 file2
file1
SITE-A SERV-A AA 1.00 PPA IP 98a7df9asd7f FEX 98a7df9asd7f_a
SITE-A SERV-A AA 1.00 PPA IP 98a7df9asd7g FEX 98a7df9asd7f_b
SITE-A SERV-A AA 1.00 PPA IP 98a7df9asd7h FEX 98a7df9asd7f_c
SITE-B SERV-A BB 1.00 DF IP a7sdf9899hhh FEX a7sdf9899hhh_a
SITE-B SERV-A BB 1.00 DF IP a7sdf9899hhf FEX a7sdf9899hhh_b
SITE-B SERV-A BB 1.00 AF IP a7sdf9899hhm FEX a7sdf9899hhh_c
file2
SITE-A 17 SERV-A 0 39 idx a7sdf9899778 0 0 out_fan pri
SITE-A 17 SERV-A 1 1 test a7sdf9899779 1 0 out_fan pri
SITE-A 17 SERV-A 2 32 dummy_host a7sdf9899770 2 0 out_fan pri
SITE-C 22 SERV-A 2 519 dummy_host a7sdf9899772 2 2 out_fan pri
SITE-C 22 SERV-A 3 520 prod a7sdf9899775 3 out_fan pri
SITE-C 22 SERV-A 4 521 dev a7sdf9899774 4 out_fan pri
Desired output:
SITE-A SERV-A idx a7sdf9899778 0
SITE-A SERV-A test a7sdf9899779 1
SITE-A SERV-A dummy_host a7sdf9899770 2
SITE-A SERV-A 98a7df9asd7f_a 98a7df9asd7f 3
SITE-A SERV-A 98a7df9asd7f_b 98a7df9asd7g 4
SITE-A SERV-A 98a7df9asd7f_c 98a7df9asd7h 5
SITE-B SERV-A a7sdf9899hhh_a a7sdf9899hhh 0
SITE-B SERV-A a7sdf9899hhh_b a7sdf9899hhf 1
SITE-B SERV-A a7sdf9899hhh_c a7sdf9899hhm 2
SITE-C SERV-A dummy_host a7sdf9899772 2
SITE-C SERV-A prod a7sdf9899775 3
SITE-C SERV-A dev a7sdf9899774 4
$ cat tst.awk
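NR==FNR {
    # First file (file2): save its output lines per key and count them.
    key = $1 FS $3
    a[key] = a[key] key OFS $6 OFS $7 OFS $8 ORS
    cnt[key]++      # or cnt[key] = $8 + 1
    next
}
{
    # Second file (file1): on each new key, first flush the saved file2 lines.
    key = $1 FS $2
    if ( key != prev ) {
        printf "%s", a[key]
        delete a[key]
        prev = key
    }
    print key, $6, $7, $8, cnt[key]++
}
END {
    # Print the file2 lines whose key never appeared in file1.
    for ( key in a ) {
        printf "%s", a[key]
    }
}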
$ awk -f tst.awk file2 file1
SITE-A SERV-A idx a7sdf9899778 0
SITE-A SERV-A test a7sdf9899779 1
SITE-A SERV-A dummy_host a7sdf9899770 2
SITE-A SERV-A IP 98a7df9asd7f FEX 3
SITE-A SERV-A IP 98a7df9asd7g FEX 4
SITE-A SERV-A IP 98a7df9asd7h FEX 5
SITE-B SERV-A IP a7sdf9899hhh FEX 0
SITE-B SERV-A IP a7sdf9899hhf FEX 1
SITE-B SERV-A IP a7sdf9899hhm FEX 2
SITE-C SERV-A dummy_host a7sdf9899772 2
SITE-C SERV-A prod a7sdf9899775 3
SITE-C SERV-A dev a7sdf9899774 4
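It is not clear whether you want the file1 line count in the 5th output field to start from the number of file2 lines for the given key, or from the last $8 value in file2 for that key, so I included both options, one of them as a comment.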
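To illustrate the difference between the two options, here is a minimal sketch on made-up input where the $8 values do not match the row count. The rows, the tmp directory and the by_count / by_value variable names are hypothetical, not taken from the question. With the question's own SITE-A rows both options happen to start at 3.

```shell
# Hypothetical mini-inputs (not the files from the question): two file2-style
# rows whose $8 values (5 and 6) differ from their row count (2), and one
# file1-style row with the same key.
tmp=$(mktemp -d)
printf '%s\n' 'X 1 S 0 0 n1 id1 5 0 out_fan pri' \
              'X 1 S 0 0 n2 id2 6 0 out_fan pri' > "$tmp/f2"
printf '%s\n' 'X S AA 1.00 PPA IP id3 FEX id3_a' > "$tmp/f1"

# Option 1: continue from the number of f2 rows seen for the key.
by_count=$(awk 'NR==FNR { cnt[$1 FS $3]++; next }
                { key = $1 FS $2; print key, cnt[key]++ }' "$tmp/f2" "$tmp/f1")

# Option 2: continue from the last $8 value seen for the key, plus one.
by_value=$(awk 'NR==FNR { cnt[$1 FS $3] = $8 + 1; next }
                { key = $1 FS $2; print key, cnt[key]++ }' "$tmp/f2" "$tmp/f1")

echo "$by_count"   # X S 2
echo "$by_value"   # X S 7
rm -rf "$tmp"
```

Which one is right depends on whether the last column is meant to be a running row number or a continuation of the index already stored in file2.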
The for ( key in a ) loop will print the remaining file2 lines in "random" order (see https://www.gnu.org/software/gawk/manual/gawk.html#Controlling-Array-Traversal). If that is a problem, keep a separate array of keys indexed by an incrementing counter as you read file2 (e.g. if ( !(key in a) ) keys[++numKeys] = key at the start of the first block), then use that array in the END section to retrieve the keys in that order (for ( keyNr=1; keyNr<=numKeys; keyNr++ ) { key = keys[keyNr] ... }), i.e.:
$ cat tst.awk
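NR==FNR {
    # First file (file2): remember each key in first-seen order.
    key = $1 FS $3
    if ( !(key in a) ) {
        keys[++numKeys] = key
    }
    a[key] = a[key] key OFS $6 OFS $7 OFS $8 ORS
    cnt[key]++
    next
}
{
    # Second file (file1): on each new key, first flush the saved file2 lines.
    key = $1 FS $2
    if ( key != prev ) {
        printf "%s", a[key]
        delete a[key]
        prev = key
    }
    print key, $6, $7, $8, cnt[key]++
}
END {
    # Print leftover file2 lines in the order their keys were first read;
    # keys already flushed and deleted above print as empty strings.
    for ( keyNr=1; keyNr<=numKeys; keyNr++ ) {
        key = keys[keyNr]
        printf "%s", a[key]
    }
}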
$ awk -f tst.awk file2 file1
SITE-A SERV-A idx a7sdf9899778 0
SITE-A SERV-A test a7sdf9899779 1
SITE-A SERV-A dummy_host a7sdf9899770 2
SITE-A SERV-A IP 98a7df9asd7f FEX 3
SITE-A SERV-A IP 98a7df9asd7g FEX 4
SITE-A SERV-A IP 98a7df9asd7h FEX 5
SITE-B SERV-A IP a7sdf9899hhh FEX 0
SITE-B SERV-A IP a7sdf9899hhf FEX 1
SITE-B SERV-A IP a7sdf9899hhm FEX 2
SITE-C SERV-A dummy_host a7sdf9899772 2
SITE-C SERV-A prod a7sdf9899775 3
SITE-C SERV-A dev a7sdf9899774 4
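The runs above print $6, $7 and $8 for the file1 rows (IP ... FEX), while the desired output in the question appears to show the 9th field followed by the 7th for those rows. A minimal sketch, assuming that reading of the desired output is correct: the only change from the order-preserving script is the print line for file1 rows. The tmp directory and the out variable are just for illustration.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
SITE-A SERV-A AA 1.00 PPA IP 98a7df9asd7f FEX 98a7df9asd7f_a
SITE-A SERV-A AA 1.00 PPA IP 98a7df9asd7g FEX 98a7df9asd7f_b
SITE-A SERV-A AA 1.00 PPA IP 98a7df9asd7h FEX 98a7df9asd7f_c
SITE-B SERV-A BB 1.00 DF IP a7sdf9899hhh FEX a7sdf9899hhh_a
SITE-B SERV-A BB 1.00 DF IP a7sdf9899hhf FEX a7sdf9899hhh_b
SITE-B SERV-A BB 1.00 AF IP a7sdf9899hhm FEX a7sdf9899hhh_c
EOF
cat > "$tmp/file2" <<'EOF'
SITE-A 17 SERV-A 0 39 idx a7sdf9899778 0 0 out_fan pri
SITE-A 17 SERV-A 1 1 test a7sdf9899779 1 0 out_fan pri
SITE-A 17 SERV-A 2 32 dummy_host a7sdf9899770 2 0 out_fan pri
SITE-C 22 SERV-A 2 519 dummy_host a7sdf9899772 2 2 out_fan pri
SITE-C 22 SERV-A 3 520 prod a7sdf9899775 3 out_fan pri
SITE-C 22 SERV-A 4 521 dev a7sdf9899774 4 out_fan pri
EOF

out=$(awk '
NR==FNR {
    key = $1 FS $3
    if ( !(key in a) ) keys[++numKeys] = key
    a[key] = a[key] key OFS $6 OFS $7 OFS $8 ORS
    cnt[key]++
    next
}
{
    key = $1 FS $2
    if ( key != prev ) {
        printf "%s", a[key]
        delete a[key]
        prev = key
    }
    # Assumed change: print $9 and $7 here instead of $6, $7, $8.
    print key, $9, $7, cnt[key]++
}
END {
    for ( keyNr=1; keyNr<=numKeys; keyNr++ ) printf "%s", a[keys[keyNr]]
}' "$tmp/file2" "$tmp/file1")

echo "$out"
rm -rf "$tmp"
```

With the sample files this should reproduce the desired output listed in the question line for line.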