Using arrays in awk to match lines

Question

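I am trying to use awk to match two files (file1 and file2). For every line of file2 whose column matches file1, I want the command to print the second column of file1.

I looked at several solutions here and found something that (partly) works, but I do not understand how it works.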

awk 'NR==FNR {a[$1]=$2; next} $1 in a{print a[$1]}' file1 file2 >> output

Here is a sample of the input:

#File1
0_1   apple
0_2   mango
0_3   banana
...
3_1   durian
3_4   dragonfruit
3_20  pear

#File2
0_1   3_1
0_1   3_1
0_2   3_4
0_3   3_20

When I match the first column of File2 against File1, the awk command above returns the result I want.

#Output
apple
apple
mango
banana

So, naturally, I adjusted the one-liner to do the same for the second column of File2.

awk 'NR==FNR {a[$1]=$2; next} $2 in a{print a[$1]}' file1 file2 >> output

However, I get exactly the same result as above, whereas I was expecting:

#Expected output
durian
durian
dragonfruit
pear

Even worse, I do get the desired output when I run this:

awk 'NR==FNR {a[$1]=$2; next} $1 in a{print a[$2]}' file1 file2 >> output

Can someone explain the logic behind this (how values are assigned to the array), or what is going wrong elsewhere?

Answer 1

Could you please go through the following explanation of the code you used? It may help you understand the array concept.

awk '                      ## Start of the awk program.
NR==FNR{                   ## NR==FNR is TRUE only while the first Input_file (file1) is being read.
  a[$1]=$2                 ## Store $2 (2nd field) of the current line in array a under index $1.
  next                     ## Skip all further statements and move on to the next line.
}                          ## End of the NR==FNR block.
$2 in a{                   ## Reached only while the 2nd Input_file (file2) is being read: is $2 an index of a?
  print a[$1]              ## Print the value of array a whose index is $1 of the current line.
}                          ## End of the $2 in a block.
' file1 file2 >> output    ## Input_file names; output is appended to the file named output.
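To make the NR==FNR test in the program above more concrete, here is a minimal sketch that prints both counters for every input line. It assumes the sample rows from the question (the rows elided by "..." are left out); the scratch directory and the variable name counters are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical scratch files holding the rows shown in the question
# (the rows hidden behind "..." are not reproduced).
tmp=$(mktemp -d)
printf '%s\n' '0_1 apple' '0_2 mango' '0_3 banana' \
              '3_1 durian' '3_4 dragonfruit' '3_20 pear' > "$tmp/file1"
printf '%s\n' '0_1 3_1' '0_1 3_1' '0_2 3_4' '0_3 3_20' > "$tmp/file2"

# NR counts lines across all files; FNR restarts at 1 for each file.
# The two are therefore equal only while the first file is being read.
counters=$(cd "$tmp" && awk '{
  print FILENAME, "NR=" NR, "FNR=" FNR, (NR==FNR ? "first-file" : "second-file")
}' file1 file2)

printf '%s\n' "$counters"
rm -rf "$tmp"
```

With these six and four rows, the first line printed is "file1 NR=1 FNR=1 first-file", and the first line of file2 is "file2 NR=7 FNR=1 second-file": NR keeps counting while FNR has restarted, so the NR==FNR block no longer runs.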

Additional notes on the array concept:

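What does a[$1]=$2 do?: It creates an array named a whose index (the key by which an item is identified) is $1, the first field of the current line, and whose value is $2, the second field of the current line.
Example of a[$1]=$2: Take the line 0_1 apple from the first Input_file. It is stored as a["0_1"]="apple", that is, index 0_1 with value apple.
What does the $2 in a condition do?: It checks whether $2 of the current line exists as an index of array a (only the indexes are tested, never the stored values). If it does, the action runs and prints a[$1]. The condition and the action are independent: changing only the test from $1 to $2 still prints the value stored under $1, which is why the output did not change. To print the value that belongs to the second column, the action has to look up that column as well, i.e. print a[$2].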
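The difference between the index that is tested and the index that is printed can be checked on the sample data. The sketch below assumes the rows shown in the question (without the rows elided by "..."); the scratch directory and the variable names by_first and by_second are invented for the example.

```shell
#!/bin/sh
# Hypothetical scratch files holding the rows shown in the question.
tmp=$(mktemp -d)
printf '%s\n' '0_1 apple' '0_2 mango' '0_3 banana' \
              '3_1 durian' '3_4 dragonfruit' '3_20 pear' > "$tmp/file1"
printf '%s\n' '0_1 3_1' '0_1 3_1' '0_2 3_4' '0_3 3_20' > "$tmp/file2"

# Variant from the question: the test uses $2, but the lookup still uses $1.
by_first=$(awk 'NR==FNR {a[$1]=$2; next} $2 in a {print a[$1]}' \
           "$tmp/file1" "$tmp/file2")

# Test and lookup both use $2, the second column of file2.
by_second=$(awk 'NR==FNR {a[$1]=$2; next} $2 in a {print a[$2]}' \
            "$tmp/file1" "$tmp/file2")

echo 'print a[$1]:'; printf '%s\n' "$by_first"
echo 'print a[$2]:'; printf '%s\n' "$by_second"
rm -rf "$tmp"
```

On this data the first variant prints apple, apple, mango, banana, and the second prints durian, durian, dragonfruit, pear. The command with $1 in a and print a[$2] gives the same second list only because every first-column key in this file2 also happens to exist in file1.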
