Splitting a string based on a semi-consistent pattern

Question    Votes: -1    Answers: 2

I have a text file containing a transcript. I need a way to split it so that I end up with a list of strings, one for each thing a person said. So this:

mystr = '''Bob: Hello there, how are you? 

           Alice: I am fine how are you?'''

becomes this:

mylist= ['Bob: Hello there, how are you?','Alice: I am fine how are you?']

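I'm new to regular expressions, but I recognise that they are probably the way to go. The catch is that I want to iterate over many transcripts in which the names differ (e.g. John, Paul, George, Ringo, etc.). What is consistent is that there is a single word (the speaker), followed by a colon, followed by whitespace.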

python regex
2 Answers
0
votes
import re

# mystr is the transcript string from the question
re.findall(r"\S[^:]+.*", mystr)
#-> ['Bob: Hello there, how are you? ', 'Alice: I am fine how are you?']

https://docs.python.org/3/library/re.html
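
The question describes each turn as a single word, then a colon, then whitespace. A minimal sketch that encodes exactly that description is shown here, assuming every turn sits on its own line. The names SPEAKER_TURN and split_turns are made up for illustration and are not part of the re module.

```python
import re

# Assumption: every turn is on one line and looks like "<word>:<space or tab><text>".
# ^ and $ match at line boundaries because of re.MULTILINE.
SPEAKER_TURN = re.compile(r"^[ \t]*(\w+:[ \t].*?)[ \t]*$", re.MULTILINE)


def split_turns(text):
    """Return one string per labelled line, without surrounding whitespace."""
    # With a single capture group, findall returns the captured text only.
    return SPEAKER_TURN.findall(text)


mystr = '''Bob: Hello there, how are you? 

           Alice: I am fine how are you?'''

print(split_turns(mystr))
# ['Bob: Hello there, how are you?', 'Alice: I am fine how are you?']
```

Because the pattern is tied to the start of a line, a line with no speaker label is skipped rather than merged into a neighbouring match.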


0
votes
import re
mystr = '''Bob: Hello there, how are you? 

           Alice: I am fine how are you?'''
# Take each match and strip the surrounding whitespace
[m.group(0).strip() for m in re.finditer(r"\w[^:]+.*", mystr)]

# ['Bob: Hello there, how are you?', 'Alice: I am fine how are you?']

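If the colon may be missing from some lines, the following regex should work better than the previous one, since it only matches where a word is immediately followed by a colon.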

mystr = '''Bob Hello there, how are you? 

           Alice: I am fine how are you?'''
# \w+ is equivalent to the original \w{1,}
[m.group(0).strip() for m in re.finditer(r"\w+:+.*", mystr)]
# ['Alice: I am fine how are you?']
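
The pattern above drops any line that has no colon. If unlabelled lines should instead be kept as a continuation of the previous speaker's turn, one possible approach is to locate where each labelled line starts and slice the text between those positions. This is only a sketch under that assumption; group_turns is a hypothetical helper name, and text before the first label is discarded.

```python
import re

# A line that begins (after optional indentation) with "<word>:<whitespace>".
LABEL = re.compile(r"^[ \t]*\w+:\s", re.MULTILINE)


def group_turns(text):
    """Group lines into turns; unlabelled lines join the preceding turn."""
    starts = [m.start() for m in LABEL.finditer(text)]
    ends = starts[1:] + [len(text)]
    # Slice from one label to the next and collapse runs of whitespace.
    return [" ".join(text[a:b].split()) for a, b in zip(starts, ends)]


transcript = "Bob: Hello there,\n  how are you?\n\nAlice: I am fine.\nHow are you?"
print(group_turns(transcript))
# ['Bob: Hello there, how are you?', 'Alice: I am fine. How are you?']
```

One caveat: a continuation line that itself starts with a word and a colon (for example "Note: ...") would be treated as a new speaker.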