I am trying to match numbered sentences in a text:
text = """
Some explanation: 1. This is how it should work 2. This should work exactly like that
3. Just like that 4. Or with dots in the end.
"""
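Note that the input text may span multiple lines or be written inline.
I tried to build the following regular expression, but it does not work the way I expected: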
(?:\d+\.\s+.*(?:\n|$))+
It should extract the sentences into separate groups:
This is how it should work
This should work exactly like that
Just like that
Or with dots in the end.
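You can use a regex capturing group:
import re
re.findall(r'\d\. ([^\d]*) ', text)
Result for the text above:
['This is how it should work', 'This should work exactly like', 'Just like that', 'Or with dots in the']
Note that this pattern requires a literal space after the captured text, so the last word is dropped whenever a sentence is followed by a newline or by the end of the text. It also stops at the first digit, so it only works for sentences that contain no digits.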
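If the truncated last words matter, one possible variation is to match each sentence lazily and stop at a lookahead for the next number marker or the end of the text. This is only a sketch and not part of the answer above: the name ITEM and the exact pattern are my own assumptions, and a sentence that itself contains something like "3. " would still be split at that point.

```python
import re

text = """
Some explanation: 1. This is how it should work 2. This should work exactly like that
3. Just like that 4. Or with dots in the end.
"""

# Hypothetical pattern (an assumption, not taken from the original answer):
#   \d+\.\s+           item number, a dot, and the whitespace after it
#   (.*?)              the sentence itself, matched lazily
#   \s*(?=\d+\.\s|\Z)  skip trailing whitespace, then require either the
#                      next "N. " marker or the end of the text
# re.DOTALL lets "." cross newlines, so inline and multi-line input behave the same.
ITEM = re.compile(r"\d+\.\s+(.*?)\s*(?=\d+\.\s|\Z)", re.DOTALL)

sentences = ITEM.findall(text)
print(sentences)
# ['This is how it should work', 'This should work exactly like that',
#  'Just like that', 'Or with dots in the end.']
```

Because the lookahead does not consume the next marker, each following item can start its own match, which is what keeps the final word and the trailing dot intact.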