Regex with multiple lookaheads and lookbehinds in Python

Question

My input has the following format (txt1):

txt1 = "[('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals'), ('a')]"

I want to extract it in the following format:

[['1','Hello is 1)people 2)animals'],['People are 1) hello 2) animals'],['a']]

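So basically, I want the information contained in each pair of parentheses, but I have not managed to get it. I also used lookahead and lookbehind to avoid splitting on the numbered markers "1)" and "2)", which is what happened when I used a simple re.split('[\(\)\[\]] statement.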

I have been trying findall to check what I get.

r = re.findall(r'\((?=\').*(?<=\')\)(?=\,)', txt1)

I keep getting:

["('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals')"]

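It seems to be ignoring the parentheses in the middle. What can I do to get the desired result?

Thanks.

Note:

With the split function, which is what I intend to use to get the desired output, I get this: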

r = re.split(r'\((?=\').*(?<=\')\)(?=\,)', txt1)

['[', ", ('a')]"]

Answer 1

Why use regex at all?

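import ast
# literal_eval parses txt1 as a Python literal; ('x') is just the string 'x', so wrap bare strings in a list
[list(x) if isinstance(x, tuple) else [x] for x in ast.literal_eval(txt1)]
# => [['1', 'Hello is 1)people 2)animals'], ['People are 1) hello 2) animals'], ['a']]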

If you insist on regex, this should work unless the strings contain escaped quotes:

import re
[re.findall(r"'[^']*'", x) for x in re.findall(r"\(('[^']*'(?:,\s*'[^']*')*)\)", txt1)]
# => [["'1'", "'Hello is 1)people 2)animals'"], ["'People are 1) hello 2) animals'"], ["'a'"]]
# Note: each item keeps its surrounding single quotes.
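As for why the pattern in the question swallows the middle parentheses: `.*` is greedy, so it runs to the last `')` that is followed by a comma. The following is only a sketch of one possible fix, not a general parser. It assumes the strings contain no escaped quotes, switches to the lazy `.*?`, and lets the final lookahead also accept `]` so the last tuple is found. The names `greedy`, `lazy` and `result` are made up for illustration.

```python
import re

txt1 = "[('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals'), ('a')]"

# Greedy `.*` backtracks from the end of the string, so it stops at the LAST
# "')" followed by a comma and merges the first two tuples into one match.
greedy = re.findall(r"\((?=').*(?<=')\)(?=,)", txt1)

# Lazy `.*?` stops at the FIRST "')" that satisfies the lookarounds.
# Allowing "]" in the lookahead lets the final tuple ('a') match as well.
lazy = re.findall(r"\((?=').*?(?<=')\)(?=,|\])", txt1)

# The capture group returns the text between the quotes, without the quotes.
result = [re.findall(r"'([^']*)'", chunk) for chunk in lazy]
print(result)
```

The `1)` and `2)` markers are skipped because the lookbehind `(?<=')` requires a quote directly before the closing parenthesis.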

Answer 2

Another solution that does not use regex:

txt1 = "[('1','Hello is 1)people 2)animals'), ('People are 1) hello 2) animals'), ('a')]"
replace_pairs = {
    "('": "'",
    "'), ": '#',
    '[': '',
    ']': '',
    "'": '',
}
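# Replacements run in insertion order (guaranteed for dicts since Python 3.7),
# so the bare "'" rule must come last.
for k, v in replace_pairs.items():
    txt1 = txt1.replace(k, v)

txt1 = txt1[:-1].split('#')  # drop the trailing parenthesis, then split on the marker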
print([i.split(',') for i in txt1])

Output:

[['1', 'Hello is 1)people 2)animals'], ['People are 1) hello 2) animals'], ['a']]

Note: this may not work if the input is more complicated than what is shown here.
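To make that caveat concrete, here is a small sketch using a hypothetical input, `txt2`, in which one string contains a comma. The helper names `parse_by_replace` and `parse_by_literal_eval` are made up for this illustration. The first repeats the replace-and-split steps above, and the second uses `ast.literal_eval` for comparison.

```python
import ast

# Hypothetical input: same shape as txt1, but one string contains a comma.
txt2 = "[('1','Hello, world'), ('a')]"

def parse_by_replace(text):
    """Replace-and-split approach: same steps, same order as above."""
    for old, new in (("('", "'"), ("'), ", "#"), ("[", ""), ("]", ""), ("'", "")):
        text = text.replace(old, new)
    # Drop the trailing ")" and split on the "#" marker, then on commas.
    return [part.split(",") for part in text[:-1].split("#")]

def parse_by_literal_eval(text):
    """Parse the text as a Python literal; wrap bare strings in a list."""
    return [list(x) if isinstance(x, tuple) else [x] for x in ast.literal_eval(text)]

by_replace = parse_by_replace(txt2)
by_literal = parse_by_literal_eval(txt2)
print(by_replace)   # the comma inside the string causes an extra split
print(by_literal)   # the string stays intact
```

The replace-based version splits on every comma, including the one inside the quoted text, whereas `ast.literal_eval` keeps the quoted string whole.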
