Python regex: splitting a string on specific text

Question

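I am reading a text file and need to split each line on a specific word.

I want to split at NUMBER(x, x), keeping that part as its own item.

Here is my code: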

import re

str1 = "PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"
str2 = "CURRENT_BILL NUMBER(*, 2),"



# desired output
#['PERSON_LEAVE_BAL_NUMBER', 'NUMBER(5, 2)']
#['CURRENT_BILL', 'NUMBER(*, 2)']


if re.search(r'\bNUMBER\b', str1):
    str1_items = re.split("NUMBER[^a-zA-Z ]+",str1) 
    print (str1_items)

if re.search(r'\bNUMBER\b', str2):
    str2_items = re.split("NUMBER[^a-zA-Z ]+",str2) 
    print (str2_items)

I search for the word boundary and then try to select the NUMBER part, but I can't get it to select and parse correctly. Any suggestions?

python python-3.x regex string split
2 Answers
0
votes

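Splitting a string returns the pieces of text on either side of the separator; the separator itself is dropped.

You can put the separator back: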

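# Split on " NUMBER" (with the leading space) so the NUMBER inside the column name is left alone.
split1 = "PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),".split(" NUMBER")
# split() drops the separator, so prepend "NUMBER" to the second piece.
print(split1[0], "NUMBER" + split1[1])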
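
A related sketch, using `str.partition` instead of `split` (a swapped-in technique, not what the answer itself uses): `partition` splits only once and returns the separator as the middle element, so nothing is lost. The variable names here are chosen for illustration.

```python
line = "PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"

# partition() returns (text before, separator, text after) and splits at most once.
# Using " NUMBER" with the leading space skips the NUMBER inside the column name.
name, sep, rest = line.partition(" NUMBER")

# Re-attach "NUMBER" to the remainder; the trailing comma is still present.
items = [name, "NUMBER" + rest]
print(items)
```

If the separator is not found, `partition` returns the whole string followed by two empty strings, so `sep` can be checked before using the result.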

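Or split on a zero-width assertion (a lookahead), which requires that the text you split on be followed by the token you need, without consuming that token: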

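import re
# The lookahead (?=...) matches without consuming, so "NUMBER(" stays in the second item.
split2 = re.split(r" (?=NUMBER[^a-zA-Z ])", "CURRENT_BILL NUMBER(*, 2),")
print(split2)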
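
As a small sketch of the lookahead approach applied to both inputs from the question: the helper name `split_column` is made up for illustration, and stripping the trailing comma with `rstrip` is an assumption based on the desired output shown in the question.

```python
import re

def split_column(line):
    # Hypothetical helper: split on the space that precedes "NUMBER(".
    # The lookahead (?=...) is zero-width, so "NUMBER(" stays in the result.
    # Stripping the trailing comma is an assumption about the desired output.
    return re.split(r" (?=NUMBER\()", line.rstrip(","))

print(split_column("PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"))
print(split_column("CURRENT_BILL NUMBER(*, 2),"))
```

Requiring the literal `(` after `NUMBER` keeps the column name `PERSON_LEAVE_BAL_NUMBER` intact, because the `NUMBER` inside it is not preceded by a space and followed by a parenthesis.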

0
votes

Make your pattern more explicit:

import re

testString = "PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"
pattern = r"([\S]*) NUMBER\(([\d]*)[\W\,]+([\d]*)\)\,"

items = re.match(pattern, testString)
name, firstnumber, secondnumber = items.groups()
print(name, firstnumber, secondnumber)

Output:

PERSON_LEAVE_BAL_NUMBER 5 2
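
If the goal is the two-item list from the question rather than the individual numbers, an explicit pattern can capture the whole type as one group. The following is only a sketch: `COLUMN_RE` and `parse_column` are hypothetical names, and it assumes the precision is either digits or `*` and that the trailing comma is optional.

```python
import re

# Hypothetical pattern: a column name, whitespace, then the whole NUMBER(...) type
# as a single group. Precision may be digits or "*"; the trailing comma is optional.
COLUMN_RE = re.compile(r"(\S+)\s+(NUMBER\((?:\d+|\*),\s*\d+\)),?")

def parse_column(line):
    match = COLUMN_RE.fullmatch(line.strip())
    if match is None:
        return None  # the line is not a NUMBER(p, s) column in the assumed shape
    return list(match.groups())

print(parse_column("PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"))
print(parse_column("CURRENT_BILL NUMBER(*, 2),"))
```

Using `fullmatch` means lines that do not fit the assumed shape return `None` instead of a partial match, which makes unexpected input easier to notice.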