Matching a complex string with regex in Python

Question (votes: 1, answers: 2)

I have the following test string:

test_str = "`It isn't directed at all,' said the White Rabbit;"

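I currently use re.sub to filter out punctuation so that I can then do my own processing.

My current call is re.sub(r"[^A-Za-z0-9'\s]", '', test_str)

The output of the above (once split into words) is: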

['It', "isn't", 'directed', 'at', "all'", 'said', 'the', 'White', 'Rabbit']

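The error is visible at all', which should be stored as just all.

How can I keep words containing an apostrophe (such as isn't) but ignore an ' that appears after punctuation? In this case, all,'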
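The behaviour described in the question can be reproduced with a short sketch. It assumes the word list was produced by calling str.split() on the result of re.sub, a step the question does not show; the variable names cleaned and words are illustrative only.

```python
import re

test_str = "`It isn't directed at all,' said the White Rabbit;"

# Drop every character that is not a letter, digit, apostrophe, or whitespace.
cleaned = re.sub(r"[^A-Za-z0-9'\s]", "", test_str)

# Assumed step (not shown in the question): split on whitespace.
words = cleaned.split()
print(words)
```

The comma in all,' is removed but the apostrophe after it is kept, which is why the word comes out as all'.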

python regex string list filter
2 Answers
0 votes

Try the following:

import re
test_str = "`It isn't directed at all,' said the White Rabbit;"
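# Keep only letters, digits, apostrophes, and whitespace.
a = re.sub(r"[^A-Za-z0-9'\s]", '', test_str)
# Drop an apostrophe that is directly followed by a space.
a = re.sub(r"'[ ]", ' ', a)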
print(a)
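Note that the second substitution only catches an apostrophe followed by a space, so an apostrophe at the very end of the string or directly before a word would survive. One possible alternative, sketched here and not part of the answer above, is to extract the words directly with re.findall, allowing an apostrophe only between runs of letters or digits. The name WORD_RE is hypothetical.

```python
import re

test_str = "`It isn't directed at all,' said the White Rabbit;"

# A word is a run of letters/digits, optionally followed by
# apostrophe + more letters/digits (e.g. isn't, rock'n'roll).
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")

words = WORD_RE.findall(test_str)
print(words)
```

This yields the word list in one step, so no separate split is needed. The trade-off is that a genuine leading or trailing apostrophe (as in 'tis or dogs') is dropped as well.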

0 votes

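Try this regex, which removes a punctuation character unless it has a word character on both sides:

punct = r"""[!"#$%&'()*+,\-./:;<=>?@\[\]^_`{|}~]"""
print(re.sub(rf"(?<!\w){punct}|{punct}(?!\w)", '', test_str))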

Output:

It isn't directed at all said the White Rabbit
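The hand-typed character class above is close to Python's string.punctuation constant. As a sketch (not the answerer's code), the class can be built from that constant with re.escape instead of being typed out. This variant removes a punctuation character unless it has a word character on both sides; the names PUNCT, EDGE_PUNCT_RE, and strip_edge_punctuation are hypothetical.

```python
import re
import string

# Character class built from the ASCII punctuation set; re.escape keeps
# characters such as ] and \ literal inside the brackets.
PUNCT = "[" + re.escape(string.punctuation) + "]"

# Match punctuation that is NOT flanked by word characters on both sides.
EDGE_PUNCT_RE = re.compile(rf"(?<!\w){PUNCT}|{PUNCT}(?!\w)")

def strip_edge_punctuation(text):
    """Remove punctuation except where it sits inside a word."""
    return EDGE_PUNCT_RE.sub("", text)

test_str = "`It isn't directed at all,' said the White Rabbit;"
print(strip_edge_punctuation(test_str).split())
```

Because the rule looks at both neighbours, in-word punctuation such as the apostrophe in isn't, the hyphen in e-mail, or the dot in 5.50 is kept, while quotes and sentence punctuation at word edges are stripped.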