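On a Linux server, I have some pipe-delimited files that also contain pipe (|) characters inside some of the string columns. A delimiter pipe is always enclosed in double quotes ("|"), while a pipe inside the text can have any character, whitespace, etc. around it. I want to replace the non-delimiter pipes inside the strings with a dash (-).
Here is an example: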
col1|col2|col3
"1"|"This is my column 2 |Although there is pipe here, it is not a delimiter pipe."|"And this is my 2nd column '|" with a pipe followed by a double quote"|"|and finally this 3rd column starts with a | that is not'|a delimiter
I tried a few sed and awk commands, but could not correctly exclude the delimiter ("|") from the replacement.
This is the output I am trying to get:
col1|col2|col3
"1"|"This is my column 2 -Although there is pipe here, it is not a delimiter pipe."|"And this is my 2nd column '-" with a pipe followed by a double quote"|"-and finally this 3rd column starts with a - that is not'-a delimiter
One approach I could think of is to replace "|" with a special marker (## in this case), replace every remaining | with -, and then turn ## back into "|":
cat file.txt | sed -e "s/\"|\"/##/g" | tr "|" - | sed -e "s/##/\"|\"/g"
Output (the header's unquoted delimiters get replaced as well):
col1-col2-col3
"1"|"This is my column 2 -Although there is pipe here, it is not a delimiter pipe."|"And this is my 2nd column '-" with a pipe followed by a double quote"|"-and finally this 3rd column starts with a - that is not'-a delimiter