Can grep show only the words that match a search pattern?

Question

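Is there a way to make grep output only the "words" from files that match the search expression?

If I want to find all the instances of, say, "th" in a number of files, I can do:

grep "th" *

But the output will be something like this (emphasis on the matches was mine):

some-text-file : the cat sat on the mat
some-other-text-file : the quick brown fox
yet-another-text-file : i hope this explains it thoroughly

What I want it to output, using the same search, is: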

the
the
the
this
thoroughly

Is this possible using grep? Or using another combination of tools?

Answer 1

Try grep -o:

grep -oh "\w*th\w*" *

Edit: updated to match Phil's comment.

From the docs:

-h, --no-filename
    Suppress the prefixing of file names on output. This is the default
    when there is only  one  file  (or only standard input) to search.
-o, --only-matching
    Print  only  the matched (non-empty) parts of a matching line,
    with each such part on a separate output line.
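
To see the two flags working together, here is a minimal sketch that recreates the three files from the question in a scratch directory. The directory and the way the files are created are assumptions made for illustration; only the grep command comes from the answer.

```shell
# Build the three sample files from the question in a scratch directory.
# The scratch directory is illustrative only.
workdir=$(mktemp -d)
printf 'the cat sat on the mat\n' > "$workdir/some-text-file"
printf 'the quick brown fox\n' > "$workdir/some-other-text-file"
printf 'i hope this explains it thoroughly\n' > "$workdir/yet-another-text-file"

# -o prints each match on its own line; -h drops the "filename:" prefix.
# The shell expands * in sorted order, so output follows the file names.
matches=$(cd "$workdir" && grep -oh "\w*th\w*" *)
printf '%s\n' "$matches"

rm -rf "$workdir"
```

With a grep that supports \w, this should print the five words listed in the question, one per line.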

Answer 2

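Cross-distribution safe answer (including Windows MinGW?)

grep -h "[[:alpha:]]*th[[:alpha:]]*" 'filename' | tr ' ' '\n' | grep -h "[[:alpha:]]*th[[:alpha:]]*"

Use the above if you have an older version of grep (such as 2.4.2) that does not include the -o option. Otherwise use the version below, which is easier to maintain.

Linux cross-distribution safe answer

grep -oh "[[:alpha:]]*th[[:alpha:]]*" 'filename'

To summarize: -oh outputs the parts of the file content (and not the file name) that match the regular expression, just as you would expect a regular expression to work in vim and similar tools. Which word or regular expression you search for is then up to you, as long as you stay with POSIX rather than Perl syntax (see below).

More from the grep manual: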

-o      Print each match, but only the match, not the entire line.
-h      Never print filename headers (i.e. filenames) with output lines.
-w      The expression is searched for as a word (as if surrounded by
         `[[:<:]]' and `[[:>:]]';

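Why the original answer does not work for everyone

Support for \w varies from platform to platform, because it is an extended "Perl" syntax. grep installations that are limited to POSIX character classes therefore need [[:alpha:]] rather than its Perl counterpart \w (the two are not exactly equivalent: \w also matches digits and the underscore). See the Wikipedia page on regular expressions for more.

Ultimately, the POSIX answer above will be more reliable regardless of which platform grep runs on (POSIX being the original syntax).

As for grep without the -o option: the first grep outputs the relevant lines, tr turns the spaces into newlines, and the final grep keeps only the words that match.

(PS: I know most platforms have been patched to support \w by now, but there are always some that lag behind.)

Credit for the "-o" workaround goes to @AdamRosenfield's answer.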
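
The three stages of the fallback pipeline can be followed on a single line of input. This is only a sketch: the sample line is taken from the question, and the variable names are made up for illustration.

```shell
# A line with several words, only some of which contain "th".
line='i hope this explains it thoroughly'

# Step 1: keep only lines that contain a match.
# Step 2: put every space-separated word on its own line.
# Step 3: keep only the words that contain the match.
words=$(printf '%s\n' "$line" \
  | grep "[[:alpha:]]*th[[:alpha:]]*" \
  | tr ' ' '\n' \
  | grep "[[:alpha:]]*th[[:alpha:]]*")
printf '%s\n' "$words"
```

Unlike -o, this approach prints whole space-separated tokens, so punctuation attached to a word (for example "this,") stays in the output.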

Answer 3

Simpler than you think. Try this:

egrep -wo 'th.[a-z]*' filename.txt   # case sensitive

egrep -iwo 'th.[a-z]*' filename.txt  # case insensitive

where:

egrep: grep with extended regular expressions.
w    : Match only whole words instead of substrings.
o    : Display only the matched pattern instead of the whole line.
i    : Ignore case.

Answer 4

You could translate spaces to newlines and then grep, e.g.:

cat * | tr ' ' '\n' | grep th

Answer 5

Just awk, no need for a combination of tools.

# awk '{for(i=1;i<=NF;i++){if($i~/^th/){print $i}}}' file
the
the
the
this
thoroughly
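
Note that the condition /^th/ only accepts fields that begin with "th". The sketch below, on a made-up sample line, compares it with a condition that accepts "th" anywhere in the field, which is closer to what the grep answers report.

```shell
# Illustrative input; not from the question.
text='the other path leads this way'

# Fields that start with "th" (the condition used in the answer).
starts=$(printf '%s\n' "$text" | awk '{for(i=1;i<=NF;i++) if ($i ~ /^th/) print $i}')

# Fields that contain "th" anywhere in the word.
contains=$(printf '%s\n' "$text" | awk '{for(i=1;i<=NF;i++) if ($i ~ /th/) print $i}')

printf 'starts with th:\n%s\n' "$starts"
printf 'contains th:\n%s\n' "$contains"
```

Which of the two is wanted depends on whether words such as "other" should count as matches.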

Answer 6

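grep with -o (only matching) and -P (Perl regex). Note that the pattern below ends with a space, so each match includes the trailing space and a word at the end of a line is not reported: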

grep -o -P 'th.*? ' filename

Answer 7

I was unsatisfied with awk's hard-to-remember syntax, but I liked the idea of using one utility to do this.

It looks like ack (or ack-grep if you use Ubuntu) can do this easily:

# ack-grep -ho "\bth.*?\b" *

the
the
the
this
thoroughly

If you omit the -h flag, you get:

# ack-grep -o "\bth.*?\b" *

some-other-text-file
1:the

some-text-file
1:the
the

yet-another-text-file
1:this
thoroughly

As a bonus, you can use the --output flag to do more complex searches with just about the easiest syntax I've found:

# echo "bug: 1, id: 5, time: 12/27/2010" > test-file
# ack-grep -ho "bug: (\d*), id: (\d*), time: (.*)" --output '$1, $2, $3' test-file

1, 5, 12/27/2010

Answer 8

cat *-text-file | grep -Eio "th[a-z]+"

Answer 9

You could also try pcregrep.

There is also a -w option in grep, but in some cases it doesn't work as expected.

From Wikipedia:

cat fruitlist.txt
apple
apples
pineapple
apple-
apple-fruit
fruit-apple

grep -w apple fruitlist.txt
apple
apple-
apple-fruit
fruit-apple
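
The hyphen is not a word character, which is why "apple-" and "fruit-apple" still pass the -w test. As a rough sketch, the following compares -w with -x, the grep option that requires the whole line to match. The scratch directory is an assumption for illustration; the list itself is the one shown above.

```shell
# Recreate the fruit list from the example in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' apple apples pineapple apple- apple-fruit fruit-apple > "$workdir/fruitlist.txt"

# -w: "apple" must be delimited by non-word characters, and "-" counts as one.
word_matches=$(grep -w apple "$workdir/fruitlist.txt")

# -x: the whole line must match the pattern.
line_matches=$(grep -x apple "$workdir/fruitlist.txt")

printf 'with -w:\n%s\n' "$word_matches"
printf 'with -x:\n%s\n' "$line_matches"

rm -rf "$workdir"
```

If only the bare word on its own line is wanted, -x is the stricter choice; -w is aimed at words inside longer lines.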

Answer 10

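To search for all the words that start with "icon-", the following command works perfectly. I am using Ack here, which is similar to grep but has better options and nicer formatting.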

ack -oh --type=html "\w*icon-\w*" | sort | uniq

Answer 11

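I had a similar problem: I was looking for a grep/pattern regex that gives the "matched pattern found" as output.

In the end I used egrep (the same regex with grep -e or -G did not give me the same result as egrep) with the -o option.

So I think it could be something like this (I am not a regex master):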

egrep -o "the*|this{1}|thoroughly{1}" filename

Answer 12

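You could pipe the grep output into Perl like this (-h keeps file names, which may themselves contain "th", out of the piped lines):

grep -h "th" * | perl -n -e'while(/(\w*th\w*)/g) {print "$1\n"}'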

Answer 13

grep --color -o -E "Begin.{0,}?End" file.txt

This matches as little as possible, up to End.

Tested in the macOS terminal.

Answer 14

$ grep -w

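Excerpt from the grep man page:

-w: Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character.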
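
Combined with -o, the word test also decides which matches are printed. The sketch below uses a made-up sample line and a POSIX character class; it is meant only to illustrate the rule quoted above.

```shell
# Illustrative input; not from the question.
text='the other this'

# Without -w, -o also reports "ther", the tail of "other".
loose=$(printf '%s\n' "$text" | grep -o 'th[[:alpha:]]*')

# With -w, a match must begin and end at a word boundary, so "ther" is dropped.
strict=$(printf '%s\n' "$text" | grep -ow 'th[[:alpha:]]*')

printf 'without -w:\n%s\n' "$loose"
printf 'with -w:\n%s\n' "$strict"
```

This makes -ow a reasonable fit when only words that start with the pattern are wanted.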

Answer 15

ripgrep

Here is an example using ripgrep:

rg -o "(\w+)?th(\w+)?"

It will match all words containing th.
