Split at the first comma in sentences longer than 15 words

Question

I have the following code, which breaks a line after every 10 words.

#!/bin/bash

# Print each line of input.txt, breaking after every 10th word.
while read -r line
do
    counter=1
    for word in $line
    do
        echo -n "$word "
        if ((counter % 10 == 0))
        then
            echo ""
        fi
        counter=$((counter + 1))
    done
done < input.txt

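The problem is that the split point is the 10th word. Instead, I want the split point to be the first comma (only for sentences longer than 10 words).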

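For example:

line1: phrase from a test line, which I want to split, and I dont know how.

to

line1: phrase from a test line,
line2: which I want to split, and I dont know how.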

If no comma is found, simply return the line unchanged.

Thanks!

Answer 1

Here is a simple solution that checks the number of words in the string. If the string has more than 10 words, it is split at the first comma:

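output = []
s = 'phrase from a test line, which I want to split, and I dont know how'
# Split at a comma while more than 10 words remain; stop if no comma is left,
# otherwise the tuple unpacking below would raise ValueError.
while len(s.split()) > 10 and ',' in s:
    first_sent, s = s.split(',', 1)
    output.append(first_sent)
output.append(s)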
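
The loop above keeps splitting for as long as the remainder has more than 10 words. If only the first comma should ever be the split point, as the question asks, a minimal sketch could wrap the check in a function. The name `split_at_first_comma` and the `min_words` parameter are hypothetical, chosen for illustration, and keeping the comma on the first part is an assumption based on the question's example:

```python
def split_at_first_comma(line, min_words=10):
    """Split line at its first comma if it has more than min_words words.

    Hypothetical helper: returns a list with one or two strings.
    """
    if len(line.split()) > min_words and "," in line:
        head, tail = line.split(",", 1)
        # Keep the comma on the first part, as in the question's example.
        return [head + ",", tail.lstrip()]
    # Short line, or no comma found: return the line unchanged.
    return [line]


s = "phrase from a test line, which I want to split, and I dont know how"
for part in split_at_first_comma(s):
    print(part)
```

To process a file, the same function could be called on each line read from `input.txt` after stripping the trailing newline.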

Answer 2

awk 'NF > 10 { sub(/, */, ",\n") } 1' input.txt

Or, more clearly:

#!/bin/bash

awk 'NF > 10 {

    # enter this block iff words > 10

    # replace the first occurrence of a comma, and any spaces
    # after it, with a comma followed by a newline
    sub(/, */, ",\n", $0)

}
1   # print every line, changed or not
' input.txt
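
For comparison with Answer 1, the same first-match replacement can be sketched in Python. awk's `sub` replaces only the first match, which corresponds to `count=1` in `re.sub`. The function name `break_after_first_comma` and the `min_words` parameter are hypothetical, and reading `input.txt` is left out:

```python
import re


def break_after_first_comma(line, min_words=10):
    # Same test as the awk condition NF > 10: count whitespace-separated words.
    if len(line.split()) > min_words:
        # count=1 replaces only the first match of ", *", like awk's sub().
        return re.sub(r", *", ",\n", line, count=1)
    # 10 words or fewer: leave the line as it is.
    return line


text = "phrase from a test line, which I want to split, and I dont know how"
print(break_after_first_comma(text))
```

If the line has no comma, `re.sub` finds nothing to replace and returns the line unchanged, which matches the requirement in the question.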

Answer 3

A better approach is to use awk and test for 15 or more words; if the line has that many, replace the first ", " with '\n', for example:

awk 'NF >= 15 {sub (", ", "\n")}1' file

Example Use/Output

With your input in file, you would have:

$ awk 'NF >= 15 {sub (", ", "\n")}1' file
phrase from a test line
which I want to split, and I don't know how.

(If you have many lines, awk will be orders of magnitude faster than a shell loop.)
