How to display lines in common (reverse diff)?

Question

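I have a series of text files, and I'd like to know which lines they have in common rather than which lines differ. Command-line Unix or Windows is fine.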

File foo:

linux-vdso.so.1 =>  (0x00007fffccffe000)
libvlc.so.2 => /usr/lib/libvlc.so.2 (0x00007f0dc4b0b000)
libvlccore.so.0 => /usr/lib/libvlccore.so.0 (0x00007f0dc483f000)
libc.so.6 => /lib/libc.so.6 (0x00007f0dc44cd000)

File bar:

libkdeui.so.5 => /usr/lib/libkdeui.so.5 (0x00007f716ae22000)
libkio.so.5 => /usr/lib/libkio.so.5 (0x00007f716a96d000)
linux-vdso.so.1 =>  (0x00007fffccffe000)

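So, given the two files above, the output of the desired utility would look something like

file1:line_number, file2:line_number == matching text

(just a suggestion; I really don't care what the syntax is):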

foo:1, bar:3 == linux-vdso.so.1 =>  (0x00007fffccffe000)

Answer 1

On *nix, you can use comm. The answer to the question is:

comm -1 -2 file1.sorted file2.sorted 
# where file1 and file2 are sorted and piped into *.sorted

Here is the full usage of comm:

comm [-1] [-2] [-3] file1 file2
-1 Suppress the output column of lines unique to file1.
-2 Suppress the output column of lines unique to file2.
-3 Suppress the output column of lines duplicated in file1 and file2.

Also note that it is important to sort the files before using comm, as mentioned in the man page.
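As a minimal sketch of the sort-then-comm workflow, the following recreates the question's foo and bar files in a scratch directory and prints their common line. The scratch directory, the `*.sorted` names, and the `common` variable are illustrative assumptions, and `-12` is simply the combined spelling of `-1 -2`.

```shell
# Recreate the two sample files from the question in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/foo" <<'EOF'
linux-vdso.so.1 =>  (0x00007fffccffe000)
libvlc.so.2 => /usr/lib/libvlc.so.2 (0x00007f0dc4b0b000)
libvlccore.so.0 => /usr/lib/libvlccore.so.0 (0x00007f0dc483f000)
libc.so.6 => /lib/libc.so.6 (0x00007f0dc44cd000)
EOF
cat > "$tmpdir/bar" <<'EOF'
libkdeui.so.5 => /usr/lib/libkdeui.so.5 (0x00007f716ae22000)
libkio.so.5 => /usr/lib/libkio.so.5 (0x00007f716a96d000)
linux-vdso.so.1 =>  (0x00007fffccffe000)
EOF

# Sort into copies so the originals stay untouched.
# LC_ALL=C makes sort and comm agree on one collation order.
LC_ALL=C sort "$tmpdir/foo" > "$tmpdir/foo.sorted"
LC_ALL=C sort "$tmpdir/bar" > "$tmpdir/bar.sorted"

# Suppress columns 1 and 2, leaving only the lines present in both files.
common=$(LC_ALL=C comm -12 "$tmpdir/foo.sorted" "$tmpdir/bar.sorted")
printf '%s\n' "$common"

rm -rf "$tmpdir"
```

Because the input is sorted, the output carries no information about where each line sat in the original files.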

Answer 2

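I found this answer on a question listed as a duplicate. I find grep more administrator-friendly than comm, so if you just want the set of matching lines (useful for comparing CSV files, for instance), simply use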

grep -F -x -f file1 file2

Or the simplified fgrep version (recent GNU grep releases warn that fgrep is obsolescent and recommend grep -F):

fgrep -xf file1 file2

Plus, you can use file2* to glob and look for lines that file1 has in common with multiple files, rather than just two.

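Some other handy variations include the -n flag to show the line number of each matched line, -c to only count the number of lines that match, and -v to display only the lines in file2 that differ (or use diff).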

Using comm is faster, but that speed comes at the expense of having to sort your files first. It isn't very useful as a "reverse diff".
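To make the grep flags concrete, here is a small sketch on two made-up files (the names a.txt and b.txt, their contents, and the variable names are assumptions for illustration). It also shows why -x matters: without it, a pattern only has to appear somewhere inside a line.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$tmpdir/a.txt"
printf '%s\n' delta beta alphabet alpha > "$tmpdir/b.txt"

# -F: patterns are fixed strings, -x: the whole line must match,
# -f: read the patterns (one per line) from a.txt.
# -n prefixes each match with its line number in b.txt.
matches=$(grep -Fxn -f "$tmpdir/a.txt" "$tmpdir/b.txt")

# -c prints only the number of matching lines.
count=$(grep -Fxc -f "$tmpdir/a.txt" "$tmpdir/b.txt")

# Without -x, "alphabet" also counts because it contains "alpha".
loose=$(grep -Fc -f "$tmpdir/a.txt" "$tmpdir/b.txt")

printf '%s\n' "$matches"
printf 'exact: %s, substring: %s\n' "$count" "$loose"

rm -rf "$tmpdir"
```

The reported line numbers refer to the second file only; grep does not say where the line occurred in the pattern file.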

Answer 3

This has been asked before:

Unix command to find lines common in two files

You could also try Perl (credit here):

perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' file1 file2

Answer 4

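I just learned about the comm command from the answers, but I wanted to add something extra: if the files are not sorted and you don't want to touch the original files, you can feed comm the output of the sort command instead. This leaves the original files intact. It works in Bash, but I can't speak for other shells.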

comm -1 -2 <(sort file1) <(sort file2)

This can be extended to compare command output instead of files:

comm -1 -2 <(ls /dir1 | sort) <(ls /dir2 | sort)

Answer 5

The simplest way is:

# $0 is the whole line, so lines are compared in full rather than by first field
awk 'NR==FNR{a[$0]++;next} a[$0]' file1 file2

The files do not need to be sorted.
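The same two-pass awk idea can be stretched to produce the report format suggested in the question. This is only a sketch: the `first` array and `report` variable are made-up names, it records the first occurrence of each line in file 1, and it relies on file 1 being non-empty (otherwise `NR==FNR` would also be true for file 2).

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/foo" <<'EOF'
linux-vdso.so.1 =>  (0x00007fffccffe000)
libvlc.so.2 => /usr/lib/libvlc.so.2 (0x00007f0dc4b0b000)
libvlccore.so.0 => /usr/lib/libvlccore.so.0 (0x00007f0dc483f000)
libc.so.6 => /lib/libc.so.6 (0x00007f0dc44cd000)
EOF
cat > "$tmpdir/bar" <<'EOF'
libkdeui.so.5 => /usr/lib/libkdeui.so.5 (0x00007f716ae22000)
libkio.so.5 => /usr/lib/libkio.so.5 (0x00007f716a96d000)
linux-vdso.so.1 =>  (0x00007fffccffe000)
EOF

# Pass 1 (first file): remember the line number of each line's first occurrence.
# Pass 2 (second file): report every line that was seen in the first file.
# Running inside the scratch directory keeps the file names in the report short.
report=$(cd "$tmpdir" && awk '
    NR == FNR { if (!($0 in first)) first[$0] = FNR; next }
    $0 in first {
        printf "%s:%d, %s:%d == %s\n", ARGV[1], first[$0], FILENAME, FNR, $0
    }' foo bar)
printf '%s\n' "$report"

rm -rf "$tmpdir"
```

Unlike the comm approach, this keeps the original line numbers because neither file is sorted.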

Answer 6

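I think the diff utility itself, using its unified (-U) option, can be used to achieve the effect. Because the first column of diff's output marks whether a line was added or removed, we can look for the lines that haven't changed.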

diff -U1000 file_1 file_2 | grep '^ '

The number 1000 is chosen arbitrarily: big enough to be larger than any single hunk of diff output.

Here's the full, foolproof set of commands:

f1="file_1"
f2="file_2"

# Reading from stdin makes wc print only the count (no file name, no cut needed)
lc1=$(wc -l < "$f1")
lc2=$(wc -l < "$f2")
lcmax=$(( lc1 > lc2 ? lc1 : lc2 ))

diff -U$lcmax "$f1" "$f2" | grep '^ ' | less

# Alternatively, use this grep to ignore the lines starting
# with +, -, and @ signs.
#   grep -vE '^[+@-]'

If you want to include lines that were merely moved, you can sort the input before diffing, like so:

f1="file_1"
f2="file_2"

# Reading from stdin makes wc print only the count (no file name, no cut needed)
lc1=$(wc -l < "$f1")
lc2=$(wc -l < "$f2")
lcmax=$(( lc1 > lc2 ? lc1 : lc2 ))

diff -U$lcmax <(sort "$f1") <(sort "$f2") | grep '^ ' | less
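Two details of the diff approach are easy to miss: unchanged lines arrive with a leading space, and diff prints nothing at all when the files are identical. The sketch below wraps the idea in a hypothetical helper, `common_by_diff` (not part of any tool), that handles both cases; the sample files are made up.

```shell
common_by_diff() {
    # Identical files: diff prints nothing, yet every line is common.
    if cmp -s "$1" "$2"; then
        cat "$1"
        return
    fi
    lc1=$(wc -l < "$1")
    lc2=$(wc -l < "$2")
    lcmax=$(( lc1 > lc2 ? lc1 : lc2 ))
    # diff exits with status 1 when the files differ, which is expected here.
    # sed keeps only context lines and strips their leading space.
    { diff -U "$lcmax" "$1" "$2" || [ "$?" -eq 1 ]; } | sed -n 's/^ //p'
}

tmpdir=$(mktemp -d)
printf '%s\n' one two three > "$tmpdir/file_1"
printf '%s\n' zero two three four > "$tmpdir/file_2"

changed=$(common_by_diff "$tmpdir/file_1" "$tmpdir/file_2")
same=$(common_by_diff "$tmpdir/file_1" "$tmpdir/file_1")
printf '%s\n' "$changed"

rm -rf "$tmpdir"
```

As with the original commands, this only finds lines that diff aligns in the same relative order; moved lines still require sorting the inputs first.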

Answer 7

FYI, I made a little tool for Windows that does the same thing as "grep -F -x -f file1 file2" (since I haven't found anything equivalent to this command on Windows).

Here it is:

http://www.nerdzcore.com/?page=commonlines

Usage is "CommonLines inputFile1 inputFile2 outputFile"

Source code is also available (GPL).

Answer 8

On Windows, you can use a PowerShell script with Compare-Object:

compare-object -IncludeEqual -ExcludeDifferent -PassThru (get-content A.txt) (get-content B.txt)> MATCHING.txt | Out-Null #Find Matching Lines
