Python: how to parse a table-shaped string (drawn with - and +) to get the phrase in each field? [closed]

The input string looks like this:

    input_string = """
    +------------------------------------+-----------------------------------+
    |title                               | title                             |
    +------------------------------------+-----------------------------------+
    | abcdabcdabc                        | abcdabcd, abcdabcdabcd, abcdabcda |
    +------------------------------------+-----------------------------------+
    | abcdabcdabcdabcdabcdabcdabcdabcda  | abcd abcddrama and abcdabcdabcdabD|
    |                                    | to abcd abcd abc                  |
    +------------------------------------+-----------------------------------+
    """

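The expected output is one tuple per data row, holding the phrase from each column (a cell that wraps onto a second line should be joined into a single phrase):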

    [
        (u'abcdabcdabc', u'abcdabcd, abcdabcdabcd, abcdabcda'),
        (u'abcdabcdabcdabcdabcdabcdabcdabcda', u"abcd abcddrama and abcdabcdabcdabD to abcd abcd abc")
    ]

python
1 Answer
    input_string = """
    +------------------------------------+-----------------------------------+
    |title                               | title                             |
    +------------------------------------+-----------------------------------+
    | abcdabcdabc                        | abcdabcd, abcdabcdabcd, abcdabcda |
    +------------------------------------+-----------------------------------+
    | abcdabcdabcdabcdabcdabcdabcdabcda  | abcd abcddrama and abcdabcdabcdabD|
    |                                    | to abcd abcd abc                  |
    +------------------------------------+-----------------------------------+
    """

    def reform(s):
        # After splitting on "|", each table line yields three pieces:
        # the text before its first "|" (empty or a newline), column 1, column 2.
        s1 = ""
        s2 = ""
        for i in range(len(s)):
            if i % 3 == 1:
                s1 += s[i].strip() + " "
            if i % 3 == 2:
                s2 += s[i].strip() + " "
        return s1.rstrip(), s2.rstrip()

    # Splitting on "+\n" cuts the string at the end of every border line.
    # Pieces 2 and 3 are the two data rows; piece 1 is the header.
    sections = input_string.split("+\n")[2:4]
    # Note: this also removes any "-" or "+" that appears inside cell text.
    section_split = [x.replace("-", "").replace("+", "").split("|") for x in sections]
    print([reform(x) for x in section_split])
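The snippet above is tied to this exact table: two columns and two data rows, selected with `[2:4]`. A more general line-by-line approach might look like the sketch below. It is not part of the original answer, and it rests on assumptions the question does not state: every border line starts with `+`, every data line starts and ends with `|`, and cell text never contains a literal `|`. The names `parse_ascii_table` and `skip_header` are made up for illustration.

```python
def parse_ascii_table(text, skip_header=True):
    """Parse a table drawn with +, - and | into a list of tuples.

    Assumptions (hypothetical, not from the question): border lines start
    with '+', data lines start and end with '|', and cells contain no '|'.
    """
    rows = []
    current = None  # per-column lists of text fragments for the open row
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("+"):
            # A border line closes the row collected so far, if any.
            if current is not None:
                rows.append(tuple(" ".join(parts) for parts in current))
                current = None
        elif line.startswith("|"):
            # Drop the outer pipes, then split the rest into cells.
            cells = [cell.strip() for cell in line[1:-1].split("|")]
            if current is None:
                current = [[] for _ in cells]
            for parts, cell in zip(current, cells):
                if cell:  # ignore empty cells on continuation lines
                    parts.append(cell)
    return rows[1:] if skip_header else rows


sample = """
+------------------------------------+-----------------------------------+
|title                               | title                             |
+------------------------------------+-----------------------------------+
| abcdabcdabc                        | abcdabcd, abcdabcdabcd, abcdabcda |
+------------------------------------+-----------------------------------+
| abcdabcdabcdabcdabcdabcdabcdabcda  | abcd abcddrama and abcdabcdabcdabD|
|                                    | to abcd abcd abc                  |
+------------------------------------+-----------------------------------+
"""

rows = parse_ascii_table(sample)
print(rows)
```

Because rows are delimited by the border lines rather than by fixed indices, this version copes with any number of rows and columns, and a wrapped cell is joined with a single space. Unlike the `replace("-", "")` approach, it leaves hyphens and plus signs inside cell text untouched.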