I am reading a text file and need to split each line on a specific word.
I want to split at NUMBER(x, x).
Here is my code:
import re
str1 = "PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"
str2 = "CURRENT_BILL NUMBER(*, 2),"
# desired output
#['PERSON_LEAVE_BAL_NUMBER', 'NUMBER(5, 2)']
#['CURRENT_BILL', 'NUMBER(*, 2)']
if re.search(r'\bNUMBER\b', str1):
    str1_items = re.split("NUMBER[^a-zA-Z ]+", str1)
    print(str1_items)
if re.search(r'\bNUMBER\b', str2):
    str2_items = re.split("NUMBER[^a-zA-Z ]+", str2)
    print(str2_items)
I search for the word boundary and then try to select the numeric part, but I can't get it to select and parse correctly. Any suggestions?
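Splitting a string returns the pieces of text on either side of the delimiter; the delimiter itself is dropped.
You can put the delimiter back yourself: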
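# Split on " NUMBER" (with the leading space) so the "NUMBER" inside the column name is left alone.
split1 = "PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),".split(" NUMBER")
# Re-attach the delimiter to the second piece.
print(split1[0], "NUMBER" + split1[1])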
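If you would rather stay with re.split, a capturing group around the delimiter makes re.split keep the matched text in the result. The following is only a minimal sketch; the pattern and the variable names (line, plain, kept, cleaned) are my own and assume the type always looks like NUMBER(...) with no nested parentheses.

```python
import re

line = "PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"

# Without a capturing group, the matched delimiter is discarded.
plain = re.split(r"NUMBER\([^)]*\)", line)
print(plain)    # ['PERSON_LEAVE_BAL_NUMBER ', ',']

# With a capturing group, re.split also returns the matched delimiter.
kept = re.split(r"(NUMBER\([^)]*\))", line)
print(kept)     # ['PERSON_LEAVE_BAL_NUMBER ', 'NUMBER(5, 2)', ',']

# Drop pieces that are only whitespace or commas, and trim the rest.
cleaned = [part.strip() for part in kept if part.strip(" ,")]
print(cleaned)  # ['PERSON_LEAVE_BAL_NUMBER', 'NUMBER(5, 2)']
```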
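Or split on a zero-width assertion (a lookahead), which requires that the string you split on is followed by the token you need, without consuming that token: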
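import re
# Split on the space only when it is directly followed by NUMBER and a non-letter, non-space character.
split2 = re.split(r" (?=NUMBER[^a-zA-Z ])", "CURRENT_BILL NUMBER(*, 2),")
print(split2)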
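Note that the lookahead split leaves the trailing comma on the second piece, which the desired output does not have. Here is a rough sketch of how both sample lines could be handled; split_column is a hypothetical helper name, and it assumes every line has the form "NAME NUMBER(...),".

```python
import re

def split_column(line):
    """Split 'NAME NUMBER(p, s),' into ['NAME', 'NUMBER(p, s)']."""
    # Split once on the space that is immediately followed by "NUMBER(".
    parts = re.split(r" (?=NUMBER\()", line.strip(), maxsplit=1)
    # Remove the trailing comma left on the last piece, if any.
    parts[-1] = parts[-1].rstrip(",")
    return parts

print(split_column("PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"))
# ['PERSON_LEAVE_BAL_NUMBER', 'NUMBER(5, 2)']
print(split_column("CURRENT_BILL NUMBER(*, 2),"))
# ['CURRENT_BILL', 'NUMBER(*, 2)']
```

Requiring the opening parenthesis in the lookahead keeps the "NUMBER" at the end of PERSON_LEAVE_BAL_NUMBER from being treated as a split point.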
Make your pattern more explicit:
import re
testString = "PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"
pattern = r"([\S]*) NUMBER\(([\d]*)[\W\,]+([\d]*)\)\,"
items = re.match(pattern, testString)
name, firstnumber, secondnumber = items.groups()
print(name, firstnumber, secondnumber)
Output:
PERSON_LEAVE_BAL_NUMBER 5 2
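One caveat: for the second sample line, "CURRENT_BILL NUMBER(*, 2),", the pattern above still matches, but as far as I can tell the first number comes back as an empty string because `*` is swallowed by the `[\W\,]+` part. A possible variant that accepts either `*` or digits is sketched below; COLUMN_RE, parse_number_column, and the group names are hypothetical, and the pattern assumes a two-argument NUMBER(precision, scale).

```python
import re

# name, then NUMBER(precision, scale) where precision is "*" or digits.
COLUMN_RE = re.compile(
    r"(?P<name>\S+)\s+NUMBER\(\s*(?P<precision>\*|\d+)\s*,\s*(?P<scale>\d+)\s*\),?"
)

def parse_number_column(line):
    """Return (name, precision, scale) as strings, or None if the line does not match."""
    match = COLUMN_RE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("precision"), match.group("scale")

print(parse_number_column("PERSON_LEAVE_BAL_NUMBER NUMBER(5, 2),"))
# ('PERSON_LEAVE_BAL_NUMBER', '5', '2')
print(parse_number_column("CURRENT_BILL NUMBER(*, 2),"))
# ('CURRENT_BILL', '*', '2')
```

Returning None for lines that do not match avoids the AttributeError you would get from calling .groups() on a failed match.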