How can I wrap text in a multi-line table without losing the formatting?

Question (votes: 1, answers: 5)

I have data in the following form (mostly logs, but also user comments):

[2019_03_10][21:12:55] # Very useful text of hight iportance to demonstrate my question.
[2019_03_10][22:32:55] #  Another Text.
[2019_03_10][23:02:22] #  blablabal Bla bla, just another long text with linebreak. And this one is just a little longer then those before.

Currently I use cat test.txt | column -s '#' -t, and this is what I get:

[2019_03_10][21:12:55]   Very useful text of hight iportance 
to demonstrate my question. 
[2019_03_10][22:32:55]    Another Text.
[2019_03_10][23:02:22]    blablabal Bla bla, just another lon
g text with linebreak. And this one is just a little longer t
hen those before.

This is what I would like:

[2019_03_10][21:12:55]   Very useful text of hight iportance 
                          to demonstrate my question. 
[2019_03_10][22:32:55]    Another Text.
[2019_03_10][23:02:22]    blablabal Bla bla, just another lon
                          g text with linebreak. And this one 
                          is just a little longer then those 
                          before.

There must be a simple way to do this, mustn't there?

bash shell awk text-processing
5 answers
0
votes

A Perl script using the standard Text::Wrap module:

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
use Text::Wrap;

while (<>) {
  chomp;
  my $line = wrap("", "\t\t\t", $_);
  $line =~ s/ # /\t/;
  say $line;
}

Usage:

$ perl wrap.pl test.txt
[2019_03_10][21:12:55]  Very useful text of hight iportance to demonstrate
                        my question.

One-liner version:

$ perl -MText::Wrap -lpe '$_ = wrap("", "\t\t\t", $_); s/ # /\t/;' test.txt

0
votes

You can try the following script:

#!/bin/bash
WIDTH=50                                        # width of column2
# TMPFILE=$(mktemp)                             # tempfile in /tmp
TMPFILE=$(mktemp -p /dev/shm)                   # tempfile in shared memory
while IFS= read -r line; do                     # read all lines from log
    column1=${line%%#*}                         # extract column1 (before first '#')
    blank=${column1//?/ }                       # blanks, same length as column1
    column2=${line#*#}                          # column2, comments (after first '#')
    printf '%s\n' "$column2" | fmt -w "$WIDTH" > "$TMPFILE"  # format column2
    while read -r line2; do                     # read newly formatted lines
        echo "$column1$line2"                   # write column1 and column2
        column1=$blank                          # blank column1
    done < "$TMPFILE"                           # read from tempfile
done < "$1"                                     # first arg from commandline
rm "$TMPFILE"                                   # delete tempfile

Usage: scriptname logfile

If shared memory is not available, you can replace line 4 with line 3.
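
If you would rather avoid the temp file altogether, the same idea can be sketched with a pipe. This is only a sketch: the function name wrap_log and its optional width argument are my own, and it assumes an fmt that accepts -w (GNU and BSD fmt both do). It splits at the first #, prints lines without a # unchanged, and sticks to POSIX shell features.

```shell
# Sketch only: wrap_log and its width argument are illustrative names.
# Assumes an fmt that accepts -w WIDTH.
wrap_log() {
    width=${1:-50}                                   # width of the comment column
    while IFS= read -r line; do
        case $line in
            *'#'*) ;;                                # has a separator: wrap it below
            *) printf '%s\n' "$line"; continue ;;    # no '#': print unchanged
        esac
        column1=${line%%#*}                          # text before the first '#'
        column2=${line#*#}                           # text after the first '#'
        column2=${column2#"${column2%%[! ]*}"}       # trim leading spaces
        blank=$(printf '%*s' "${#column1}" '')       # spaces as wide as column1
        printf '%s\n' "$column2" | fmt -w "$width" |
        while IFS= read -r piece; do
            printf '%s%s\n' "$column1" "$piece"
            column1=$blank                           # continuation lines get blanks
        done
    done
}
```

Usage: wrap_log 50 < test.txt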


0
votes

Using a Perl one-liner. You can change the 20 to 30 or 40 to wrap at the desired length.

$ perl -ne ' ($x,$y)=/(.+?)#(.+)/ ; print "$x"; $s=""; 
   while($y=~/(.{20,}?\s|.*$)/g) { printf("%s%s\n",$s,$1);$s="\t\t\t" } ' mac.txt
[2019_03_10][21:12:55]  Very useful text of
                        hight iportance to demonstrate
                        my question.

[2019_03_10][22:32:55]  Another Text.

[2019_03_10][23:02:22]  blablabal Bla bla, just
                        another long text with
                        linebreak. And this one
                        is just a little longer
                        then those before.


$

That leaves an extra newline after each record. If you want to remove it:

$ perl -lne ' ($x,$y)=/(.+?)#(.+)/ ; printf("%s",$x); $s=""; 
   while($y=~/(.{20,}?\s|.*$)/g) { $p=$1; print $s,$p if $p!~/^\s*$/s; $s="\t\t\t" } ' mac.txt
[2019_03_10][21:12:55]  Very useful text of
                        hight iportance to demonstrate
                        my question.
[2019_03_10][22:32:55]  Another Text.
[2019_03_10][23:02:22]  blablabal Bla bla, just
                        another long text with
                        linebreak. And this one
                        is just a little longer
                        then those before.

$

0
votes

This seems to be a job for sed:

This first script answers your question by splitting lines at character 60 (without regard to spaces).

sed ':a;s/\([^\n]\{60\}\)\([^\n]\+\)/\1\n                       \2/;ta'

It will output something like:

[2019_03_10][21:12:55] # Very useful text of hight importanc
                       e to demonstrate my question.
[2019_03_10][22:32:55] Another Text.
[2019_03_10][23:02:22] blablabal Bla bla, just another long 
                       text with linebreak. And this one is 
                       just a lot longer than those before, 
                       by adding unsignificant and useless b
                       la bla.

The same, but splitting at word boundaries:

sed ':a;/.\{61\}/s/\([^\n]\{1,60\}\) \([^\n]\+\)/\1\n                       \2/;/\n/!bb;P;D;:b;ta'

or

sed '
    :a;
    /.\{61\}/s/\([^\n]\{1,60\}\) \([^\n]\+\)/\1\n                       \2/;
    /\n/!bb;
    P;
    D;
    :b;
    ta
'

which will render:

[2019_03_10][21:12:55] # Very useful text of hight
                       importance to demonstrate my
                       question.
[2019_03_10][22:32:55] Another Text.
[2019_03_10][23:02:22] blablabal Bla bla, just another long
                       text with linebreak. And this one is
                       just a lot longer then those before,
                       by adding unsignificant and useless
                       bla bla.

0
votes

I would use the UNIX command fold so you don't have to reinvent the wheel:

$ cat tst.awk
{
    beg = end = $0
    sub(/ *#.*/,"",beg)
    sub(/[^#]+# */,"",end)

    # NOTE: this breaks if the text contains a single quote or a % character
    cmd = "printf \047" end "\n\047 | fold -sw38"
    while ( (cmd | getline line) > 0 ) {
        print beg, line
        gsub(/./," ",beg)
    }
    close(cmd)
}

$ awk -f tst.awk file
[2019_03_10][21:12:55] Very useful text of hight iportance
                       to demonstrate my question.
[2019_03_10][22:32:55] Another Text.
[2019_03_10][23:02:22] blablabal Bla bla, just another long
                       text with linebreak. And this one is
                       just a little longer then those
                       before.
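
If starting a shell and fold for every input line is a concern (the command string also breaks when the text contains a single quote), the wrapping can be done inside awk itself. A minimal sketch, assuming greedy word wrapping is acceptable: wrap_awk and the width variable are names I made up, and the line breaks can differ slightly from what fold -s produces.

```shell
# Sketch only: wrap_awk and the width variable are illustrative names.
wrap_awk() {
    awk -v width="${1:-38}" '
    {
        pos = index($0, "#")
        if (pos == 0) { print; next }          # no separator: print unchanged
        beg = substr($0, 1, pos - 1)           # timestamp column
        end = substr($0, pos + 1)              # comment column
        sub(/ +$/, "", beg)                    # drop trailing spaces of the prefix
        n = split(end, word, " ")              # split the comment into words
        line = ""
        for (i = 1; i <= n; i++) {
            if (line != "" && length(line) + 1 + length(word[i]) > width) {
                print beg, line                # emit a full line
                gsub(/./, " ", beg)            # blank the prefix for continuation lines
                line = word[i]
            } else {
                line = (line == "" ? word[i] : line " " word[i])
            }
        }
        print beg, line                        # emit the last (or only) line
    }'
}
```

Usage: wrap_awk 38 < file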