Bash: search for consecutive repetitions of a pattern and replace them with a string containing the repeat count

Question

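I used pandoc to convert a .docx file to .tex. The original file is a fill-in-the-blank worksheet in which the blanks are created by repeating the underscore character.

In the .tex file, pandoc has converted each one literally to \_. However, there are small gaps between the underscores, and overall the blanks are too long.

I would like to find strings such as \_\_\_ (three repetitions of \_) and replace them with a TeX command such as \rule[-0.1ex]{3em}{0.5pt}. In general, if N is the number of repetitions, the replacement would be \rule[-0.1ex]{N em}{0.5pt}.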

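Since the blanks all have different sizes, I need to match every possible length. I have read about groups in sed but do not know how to use them here. I am not proficient with regular expressions at all, and I am somewhat overwhelmed by the cryptic regex patterns I have found so far...

Adding the requested information: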

this is some text
\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

there is some more text

even more text here
\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_.

\hfill\par

text text text
\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_.

\hfill\par

\textbf{Teilmenge}

Some text here: \_\_\_\_\_\_\_\_\_\_, and more text as well    \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_.

I do not have a working command, but I imagine something like this:

echo "$TEST" | sed 's/([\\\_]+)/\rule[-0.1ex]{length(\1) em}{0.5pt}/'

Answer 1

sed is not the right tool for this. I suggest an awk solution like this:

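awk '{
# Walk over every whitespace-separated field that contains \_.
# gsub strips each \_ from the field and returns how many it removed;
# whatever is left of the field (e.g. trailing punctuation) is appended.
for (i=1; i<=NF; ++i)
   if ($i ~ /\\_/)
      $i = "\\rule[-0.1ex]{" gsub(/\\_/, "", $i) "em}{0.5pt}" $i
} 1' file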

Output for the sample input above:

this is some text
\rule[-0.1ex]{42em}{0.5pt}

there is some more text

even more text here
\rule[-0.1ex]{24em}{0.5pt}.

\hfill\par

text text text
\rule[-0.1ex]{37em}{0.5pt}.

\hfill\par

\textbf{Teilmenge}

Some text here: \rule[-0.1ex]{10em}{0.5pt}, and more text as well \rule[-0.1ex]{31em}{0.5pt}.
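
One thing visible in the output above: because the command assigns to fields, awk rebuilds each modified line with single spaces, so the four spaces before the last blank collapse to one. If the original spacing matters, a variant along the following lines might work. It is only a sketch and not part of the original answer: it uses match() with RSTART and RLENGTH instead of gsub, the function name replace_runs is hypothetical, and it assumes every blank is an unbroken run of \_.

```shell
# replace_runs: hypothetical helper, reads the named files or stdin
replace_runs() {
    awk '{
        out = ""
        rest = $0
        # match() finds the leftmost, longest run of \_ in the remaining text
        while (match(rest, /(\\_)+/)) {
            n = RLENGTH / 2    # each \_ is two characters long
            out = out substr(rest, 1, RSTART - 1) "\\rule[-0.1ex]{" n "em}{0.5pt}"
            rest = substr(rest, RSTART + RLENGTH)
        }
        # print the converted part plus whatever followed the last run
        print out rest
    }' "$@"
}

printf '%s\n' 'Some text here: \_\_\_, and more    \_\_.' | replace_runs
# prints: Some text here: \rule[-0.1ex]{3em}{0.5pt}, and more    \rule[-0.1ex]{2em}{0.5pt}.
```

Since the line is never split into fields, text outside the runs is copied through unchanged. It could be applied to a file with something like replace_runs input.tex > output.tex, where both file names are placeholders.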

Note that the gsub function returns the number of substitutions it made, and we use that number to build the em length required in the output.
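
To see that return value in isolation, here is a minimal sketch. The function name count_blanks and the sample line are made up for illustration; only gsub itself comes from the answer.

```shell
# count_blanks: hypothetical helper that prints the substitution count,
# then the line as gsub left it
count_blanks() {
    awk '{
        n = gsub(/\\_/, "")   # with no third argument, gsub edits $0 in place
        print n               # how many \_ were removed
        print                 # the line with every \_ stripped
    }'
}

printf '%s\n' 'blank: \_\_\_.' | count_blanks
# prints:
# 3
# blank: .
```

The same call therefore does two jobs at once: it deletes the escaped underscores and reports how many there were, which is why the answer can place it directly inside the string concatenation.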
