How to truncate long matching lines returned by grep or ack

Question

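I want to run ack or grep on HTML files that often have very long lines. I don't want to see very long lines that wrap repeatedly, but I do want to see the portion of a long line that surrounds the string matching the regular expression. How can I get this using any combination of Unix tools?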

Answer 1

You can use the grep options `-oE`, possibly in combination with changing your pattern to `".{0,10}<original pattern>.{0,10}"`, in order to see some context around the match:

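       -o, --only-matching
              Show only the part of a matching line that matches PATTERN.

       -E, --extended-regexp
              Interpret PATTERN as an extended regular expression (i.e., force grep to behave as egrep).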

For example (from @Renaud's comment):

grep -oE ".{0,10}mysearchstring.{0,10}" myfile.txt
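To make the effect of the widened pattern concrete, here is a minimal sketch on a throwaway file. The file contents and the search word `needle` are made up for illustration; only the `grep -oE` call comes from the answer above.

```bash
# Build a sample file containing one long line (hypothetical data).
sample=$(mktemp)
printf '%s\n' "aaaaaaaaaa0123456789needle9876543210bbbbbbbbbb" > "$sample"

# Print the match plus at most 10 characters of context on each side.
result=$(grep -oE '.{0,10}needle.{0,10}' "$sample")
printf '%s\n' "$result"   # prints: 0123456789needle9876543210

rm -f "$sample"
```

Only the match and up to 10 characters on each side are printed; the leading run of `a` and the trailing run of `b` are dropped.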

Alternatively, you could try `-c`:

       -c, --count
              Suppress normal output; instead print a count of matching lines
              for each input file. With the -v, --invert-match option (see
              below), count non-matching lines.
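As a quick illustration of what `-c` reports, the sketch below uses three made-up lines. Note that it counts matching lines, not individual occurrences, so a line that contains the word twice still counts once.

```bash
# Three hypothetical lines; the third contains the word twice.
sample=$(mktemp)
printf '%s\n' "an error here" "all fine" "error twice: error" > "$sample"

matching=$(grep -c 'error' "$sample")       # number of lines that match
non_matching=$(grep -vc 'error' "$sample")  # number of lines that do not match
printf 'matching=%s non_matching=%s\n' "$matching" "$non_matching"

rm -f "$sample"
```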

Answer 2

Pipe your results through `cut`. I'm also considering adding a `--cut` switch so you could say `--cut=80` and only get 80 columns.
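One thing to keep in mind with `cut` is that it truncates by column position, not around the match, so a match that sits far into the line can be cut away entirely. A small sketch with made-up data:

```bash
# Hypothetical long line; the search word starts at column 21.
line="aaaaaaaaaa0123456789needle9876543210bbbbbbbbbb"

# grep selects the line, cut keeps only the first 20 characters.
truncated=$(printf '%s\n' "$line" | grep 'needle' | cut -c 1-20)
printf '%s\n' "$truncated"   # prints: aaaaaaaaaa0123456789
```

The line is still reported as a match, but the matched text itself is no longer visible in the output.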

Answer 3

You can use less as a pager for ack and chop long lines:

ack --pager="less -S"

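This keeps the long lines, but shows each on a single line instead of wrapping it. To see more of a line, scroll left and right in less with the arrow keys.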

I have the following alias set up to do this:

alias ick='ack -i --pager="less -R -S"'

Answer 4

grep -oE ".{0,10}error.{0,10}" mylogfile.txt

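In the unusual situation where you cannot use `-E`, use lowercase `-e` instead. Note that without `-E`, grep interprets the pattern as a basic regular expression, in which the interval braces have to be escaped as `\{0,10\}`.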
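To compare the two spellings side by side, here is a small sketch on a made-up line. It assumes a grep that supports `-o` (GNU and BSD grep both do).

```bash
# Hypothetical long line containing the word "error".
line="aaaaaaaaaa0123456789error9876543210bbbbbbbbbb"

# Extended regular expression: braces are written unescaped.
ere=$(printf '%s\n' "$line" | grep -oE '.{0,10}error.{0,10}')

# Basic regular expression: the interval braces are escaped.
bre=$(printf '%s\n' "$line" | grep -o -e '.\{0,10\}error.\{0,10\}')

printf '%s\n%s\n' "$ere" "$bre"   # both print: 0123456789error9876543210
```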

Answer 5

Get characters 1 through 100.

cut -c 1-100

You may want to base the range on the current terminal, e.g.

cut -c 1-$(tput cols)
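In scripts or cron jobs `tput cols` may fail because there is no terminal. A possible sketch that falls back to a fixed width in that case; the fallback of 80 columns is my own assumption, not part of the answer:

```bash
# Use the terminal width if tput can report it, otherwise assume 80 columns.
cols=$(tput cols 2>/dev/null || echo 80)

# Generate a hypothetical 300-character line and truncate it to that width.
truncated=$(awk 'BEGIN { for (i = 0; i < 300; i++) printf "x"; print "" }' | cut -c "1-$cols")
printf '%s\n' "${#truncated}"   # length of the truncated line, at most $cols
```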

Answer 6

Taken from: http://www.topbug.net/blog/2016/08/18/truncate-long-matching-lines-of-grep-a-solution-that-preserves-color/

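The suggested approach `".{0,10}<original pattern>.{0,10}"` works well, except that the highlighting color is often messed up. I've created a script with similar output in which the color is also preserved: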

#!/bin/bash

# Usage:
#   grepl PATTERN [FILE]

# how many characters around the searching keyword should be shown?
context_length=10

# What is the length of the control character for the color before and after the
# matching string?
# This is mostly determined by the environmental variable GREP_COLORS.
control_length_before=$(($(echo a | grep --color=always a | cut -d a -f '1' | wc -c)-1))
control_length_after=$(($(echo a | grep --color=always a | cut -d a -f '2' | wc -c)-1))

grep -E --color=always "$1" $2 |
grep --color=none -oE \
    ".{0,$(($control_length_before + $context_length))}$1.{0,$(($control_length_after + $context_length))}"

Assuming the script is saved as `grepl`, then `grepl pattern file_with_long_lines` should display the matching lines, but with only 10 characters around the matching string.

Answer 7

I put the following into my `.bashrc`:

grepl() {
    $(which grep) --color=always "$@" | less -RS
}

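You can then use `grepl` on the command line with any arguments that are available for `grep`. Use the arrow keys to see the tail of longer lines. Use `q` to quit.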

Explanation:

`grepl() {`: defines a new function that will be available in every (new) bash console.

`$(which grep)`: gets the full path of `grep`. (Ubuntu defines an alias for `grep` that is equivalent to `grep --color=auto`. We don't want that alias, but the original `grep`.)

`--color=always`: colorizes the output. (`--color=auto` from the alias won't work, because `grep` detects that the output is going into a pipe and then doesn't colorize it.)

`$@`: puts all arguments given to the `grepl` function here.

`less`: displays the lines using `less`.

`-R`: show colors.

`-S`: don't break long lines.
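A possible variation on the same idea, as a sketch rather than a replacement: `command grep` skips shell functions (and an alias is not expanded there because `grep` is no longer the first word), and the pager is only used when the output goes to a terminal, so the function still behaves like plain grep inside pipelines. The terminal check is my own addition.

```bash
grepl() {
    if [ -t 1 ]; then
        # Interactive use: force color and let less chop long lines.
        command grep --color=always "$@" | less -RS
    else
        # Output is piped or captured: behave like plain grep.
        command grep "$@"
    fi
}

# Captured output, so the non-interactive branch is taken.
out=$(printf 'foo\nbar\n' | grepl foo)
printf '%s\n' "$out"   # prints: foo
```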

Answer 8

The Silver Searcher (ag) supports this natively via the `--width NUM` option. It replaces the rest of longer lines with `[...]`.

Example (truncate after 120 characters):

 $ ag --width 120 '@patternfly'
 ...
 1:{"version":3,"file":"react-icons.js","sources":["../../node_modules/@patternfly/ [...]

A similar feature is planned for ack3, but it is not implemented yet.
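Until a tool has such an option, a similar effect can be approximated with a small awk filter. This is only a sketch that imitates the output shown above; it is not how ag implements `--width`, and the function name `truncate_width` is made up. Whether `length` counts bytes or characters depends on the awk implementation and locale.

```bash
# Truncate every input line to the given width and mark truncated lines.
truncate_width() {
    awk -v width="$1" '{
        if (length($0) > width)
            print substr($0, 1, width) " [...]"
        else
            print
    }'
}

result=$(printf '%s\n' "abcdefghij" "abc" | truncate_width 5)
printf '%s\n' "$result"
# prints:
# abcde [...]
# abc
```

It can be appended to any search, for example `grep -n pattern file | truncate_width 120`.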

Answer 9

This is what I do:

function grep () {
  tput rmam;
  command grep "$@";
  tput smam;
}

In my .bash_profile, I override grep so that it automatically runs `tput rmam` before and `tput smam` after, which disables wrapping and then re-enables it.
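One side effect of a wrapper like this is that its exit status is that of the final `tput` call rather than of grep, which matters in constructs such as `if grep -q ...`. A hedged sketch of a variant that passes grep's status through and leaves the terminal alone when the output is not a terminal; the name `grep_nowrap` is made up so that it does not shadow grep:

```bash
grep_nowrap() {
    if [ -t 1 ]; then
        tput rmam              # disable line wrapping
        command grep "$@"
        rc=$?                  # remember grep's exit status
        tput smam              # re-enable line wrapping
        return "$rc"
    fi
    # Not a terminal: nothing to wrap, run grep directly.
    command grep "$@"
}

# grep exits with 0 when something matched and 1 when nothing matched.
if printf 'foo\n' | grep_nowrap -q foo; then found_foo=yes; else found_foo=no; fi
if printf 'foo\n' | grep_nowrap -q bar; then found_bar=yes; else found_bar=no; fi
printf '%s %s\n' "$found_foo" "$found_bar"   # prints: yes no
```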

Answer 10

The Silver Searcher (ag) can also take the regex trick, if you prefer:

ag --column -o ".{0,20}error.{0,20}"

Answer 11

bgrep, if lines don't necessarily fit in memory

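grep only works if the lines fit in memory, but bgrep also works on huge lines that don't fit in memory. I keep coming back to this random repo from time to time:

https://github.com/tmbinc/bgrep

Install:

curl -L 'https://github.com/tmbinc/bgrep/raw/master/bgrep.c' | gcc -O2 -x c -o $HOME/.local/bin/bgrep -

Usage:

bgrep `printf %s saf | od -t x1 -An -v | tr -d '\n '` myfile.bin

I have tested it on files that don't fit in memory, and it works well.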
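The backtick part of the usage line converts the search string into the hex form that bgrep is given on the command line. Run on its own, it looks like this (a small sketch; the output shown assumes an ASCII-compatible encoding):

```bash
# Convert the string "saf" into a contiguous hex string.
# od prints the bytes as hex, tr removes the spaces and the newline.
hex=$(printf %s saf | od -t x1 -An -v | tr -d '\n ')
printf '%s\n' "$hex"   # prints: 736166
```

Here `73`, `61` and `66` are the byte values of `s`, `a` and `f`.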

I have given more details at:

https://unix.stackexchange.com/questions/223078/best-way-to-grep-a-big-binary-file/758528#758528
