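I am trying to use a regular expression to match decimal numbers in a string that have no digit before the decimal point. For example, I have the strings: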
The dimensions of the object-1 is: 1.4 meters wide, .5 meters long, 5.6 meters high
The dimensions of the object-2 is: .8 meters wide, .11 meters long, 0.6 meters high
I only want to capture the decimals that have no integer part and prepend a leading zero to them. So the final output I want is:
The dimensions of the object-1 is: 1.4 meters wide, 0.5 meters long, 5.6 meters high
The dimensions of the object-2 is: 0.8 meters wide, 0.11 meters long, 0.6 meters high
This is what I have tried so far:
(\d+)?\.(\d+)
This expression captures all of the decimals:
1.4, .5, 5.6, .8, .11, 0.6
But I only need to capture the decimals that have no integer part:
.5, .8, .11
Use a negative lookbehind:
(?<!\d)(\.\d+)
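A minimal sketch of how that pattern could be applied in Python. The sample text and variable names are illustrative only:

```python
import re

text = "1.4 meters wide, .5 meters long, 0.6 meters high"

# (?<!\d) asserts that no digit sits immediately before the dot,
# so only decimals without an integer part are matched
pattern = re.compile(r"(?<!\d)(\.\d+)")

bare = pattern.findall(text)       # the decimals that start with a dot
fixed = pattern.sub(r"0\1", text)  # prepend "0" to each of them

print(bare)
print(fixed)
```

Numbers such as 1.4 and 0.6 are skipped because a digit precedes their dot.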
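What you are asking for seems like an odd way to go about it. Why not capture all of the decimals and format them?
re.sub accepts a function as its repl argument. The function should take a match as its argument and return a str, which means you can do anything you like with the match. Instead of hunting for floats in one very specific format and adjusting them, you can simply find every float and format it as a float. Every number gets the same treatment, so they all end up consistently formatted.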
import re
from functools import partial
data = """
The dimensions of the object-1 is: 1.4 meters wide, .5 meters long, 5.6 meters high
The dimensions of the object-2 is: .8 meters wide, .11 meters long, 0.6 meters high
"""
reg = re.compile(r'((\d+)?\.\d+)')
#here we are using partial to prime sub with all of the known data,
#so we end up with a simple 1 argument function call for replacement
floatadjust = partial(reg.sub, lambda m: f'{float(m.group(1))}')
data = floatadjust(data)
print(data)
The dimensions of the object-1 is: 1.4 meters wide, 0.5 meters long, 5.6 meters high
The dimensions of the object-2 is: 0.8 meters wide, 0.11 meters long, 0.6 meters high
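One thing to keep in mind with the float() round trip is that it normalizes every number, so a value written as 1.50 would come back as 1.5. If the original digits need to be preserved, the replacement function can fill in only the missing integer part. A rough sketch under that assumption; add_leading_zero is a hypothetical helper name, not part of the re module:

```python
import re

def add_leading_zero(m: re.Match) -> str:
    # group(1) is the optional integer part; it is None when the
    # number starts with "."; group(2) holds the fractional digits
    whole, frac = m.group(1), m.group(2)
    return f"{whole or '0'}.{frac}"

reg = re.compile(r"(\d+)?\.(\d+)")
result = reg.sub(add_leading_zero, "1.50 meters wide, .5 meters long")
print(result)
```

Here 1.50 keeps its trailing zero, while .5 still gains a leading zero.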
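You can do a regex substitution with a negative lookbehind for a digit.
Regex -
(?<!\d)(\.\d+)
(?<!\d) - checks that there is no digit immediately before the match
(\.\d+) - captures the dot and one or more consecutive digits
Replacement -
0\1
0 - prepends a zero to the capture
\1 - backreference to the capture group, i.e. the float that starts with a dot
import re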
string = "The dimensions of the object-1 is: 1.4 meters wide, .5 meters long, 5.6 meters high \nThe dimensions of the object-2 is: .8 meters wide, .11 meters long, 0.6 meters high"
pattern = r"(?<!\d)(\.\d+)"
# Prepend "0" to every decimal that has no digit before the dot
new_string = re.sub(pattern, r"0\1", string)
print(new_string)