Regular expression is missing matches

Question (votes: 0, answers: 2)

I have the following Python code that defines my regular expression:

import re                            # Import the regular expressions package
import matplotlib.pyplot as plt      # Import the matplotlib package
  
# Read the log file
with open('myfile.log', 'rt',encoding="latin-1") as file:
    log_text = file.read()

# Define regular expressions for pattern matching
pattern = r'Function eval (\d+) at point \d+ has f = ([-\d.]+) at x = \[ ([-\d.]+) ([-\d.]+) ([-\d.]+)\]'

# Extract data using regular expressions
matches = re.findall(pattern, log_text)

# Extract eval_numbers and f_values
eval_numbers = [int(match[0]) for match in matches]
f_values = [-float(match[1]) for match in matches]

My log file looks like this:

Function eval 1 at point 1 has f = -0.258940971315798 at x = [ 3.  14.4 -6.5]
Initialising (coordinate directions)
Function eval 2 at point 2 has f = -0.2259040076585 at x = [ 3.5 14.4 -6.5]
Function eval 3 at point 3 has f = -0.258912721121113 at x = [ 3.  15.9 -6.5]
Function eval 4 at point 4 has f = -0.259008279995406 at x = [ 3.  14.4 -5.8]
Function eval 5 at point 5 has f = -0.305466875532906 at x = [ 2.5 14.4 -6.5]
Function eval 6 at point 6 has f = -0.258922045649132 at x = [ 3.  12.9 -6.5]
Function eval 7 at point 7 has f = -0.258959151282265 at x = [ 3.  14.4 -7.2]
Beginning main loop
Function eval 8 at point 8 has f = -0.34817783367277 at x = [ 2.00000006 14.39986872 -6.49967721]
Function eval 9 at point 9 has f = -0.562108249792704 at x = [ 5.72367526e-08  1.43999223e+01 -6.49980500e+00]
New rho = 0.01 after 9 function evaluations
Function eval 10 at point 10 has f = -0.325941420427775 at x = [ 0.07158739 14.41254181 -6.5305591 ]
Function eval 11 at point 11 has f = -0.562106040838141 at x = [ 0.         14.35760572 -6.39667873]
Function eval 12 at point 12 has f = -0.308741672869589 at x = [ 6.97916734e-03  1.45423240e+01 -6.48009766e+00]
Function eval 13 at point 13 has f = -0.562119294825946 at x = [ 0.         14.25262353 -6.51302969]
Function eval 14 at point 14 has f = -0.322997122706926 at x = [ 0.04000959 14.17489323 -6.53416455]
Function eval 15 at point 15 has f = -0.562174526503614 at x = [ 0.         14.34595835 -6.56782799]
Function eval 16 at point 16 has f = -0.562187783721323 at x = [ 0.         14.32118157 -6.49878953]
Function eval 17 at point 17 has f = -0.562178945506711 at x = [ 0.         14.20668383 -6.45356794]
New rho = 0.001 after 17 function evaluations
Function eval 18 at point 18 has f = -0.562150223228322 at x = [ 0.         14.3725033  -6.52431188]
Function eval 19 at point 19 has f = -0.30923561054078 at x = [ 8.00416661e-03  1.43249728e+01 -6.49891060e+00]
Function eval 20 at point 20 has f = -0.562145804676754 at x = [ 0.         14.33349049 -6.48225911]
Function eval 21 at point 21 has f = -0.564667972452139 at x = [ 3.44362129e-03  1.43268836e+01 -6.49446789e+00]
Function eval 22 at point 22 has f = -0.564476463041547 at x = [ 3.17296828e-03  1.43150903e+01 -6.49015891e+00]
New rho = 0.0001 after 22 function evaluations
Function eval 23 at point 23 has f = -0.563516144031397 at x = [ 1.78656684e-03  1.43313574e+01 -6.49288379e+00]
Function eval 24 at point 24 has f = -0.564680143588853 at x = [ 3.47138233e-03  1.43297133e+01 -6.49561561e+00]
Function eval 25 at point 25 has f = -0.564590127687801 at x = [ 3.30886293e-03  1.43308568e+01 -6.49396456e+00]
Function eval 26 at point 26 has f = -0.564689063616869 at x = [ 3.51710529e-03  1.43310012e+01 -6.49596869e+00]
Function eval 27 at point 27 has f = -0.564386329819354 at x = [ 3.06544700e-03  1.43309346e+01 -6.49626736e+00]
Function eval 28 at point 28 has f = -0.564987691201283 at x = [ 3.97493644e-03  1.43315974e+01 -6.49592662e+00]
Function eval 29 at point 29 has f = -0.565043509875675 at x = [ 4.07612511e-03  1.43313188e+01 -6.49659970e+00]
Function eval 30 at point 30 has f = -0.565023413882358 at x = [ 4.02815339e-03  1.43322001e+01 -6.49716216e+00]
New rho = 1e-05 after 30 function evaluations

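I want to extract the f value at every function evaluation, but I only seem to get back a seemingly random subset of them, such as 4, 8, 15, and so on.

Any ideas on how to define my regular expression better so that it captures every function evaluation?

I have tried forcing the encoding and opening the file as text or as binary, but that made no difference.

The log file is produced by the PyBobyQa optimization log output.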

regex
2 answers

0 votes

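The input contains runs of several consecutive spaces, but your regex matches exactly one space at each position. You probably want to replace some of the single spaces with ' +' (one or more spaces) or ' *' (zero or more spaces), like this: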

pattern = r'Function eval (\d+) at point \d+ has f = ([-\d.]+) at x = \[ *([-\d.]+) +([-\d.]+) +([-\d.]+) *\]'
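As a quick check, the pattern above can be run against a few lines copied from the log in the question. This is only a minimal sketch: the `sample` string and the `matched` variable are illustrative names, not part of the original code. Note that the character class `[-\d.]` contains neither `e` nor `+`, so the line whose x values are in scientific notation is still skipped.

```python
import re

# Pattern from this answer: flexible spacing, but no support for exponents
pattern = r'Function eval (\d+) at point \d+ has f = ([-\d.]+) at x = \[ *([-\d.]+) +([-\d.]+) +([-\d.]+) *\]'

# A few lines copied from the log in the question (illustrative subset)
sample = """\
Function eval 1 at point 1 has f = -0.258940971315798 at x = [ 3.  14.4 -6.5]
Function eval 4 at point 4 has f = -0.259008279995406 at x = [ 3.  14.4 -5.8]
Function eval 9 at point 9 has f = -0.562108249792704 at x = [ 5.72367526e-08  1.43999223e+01 -6.49980500e+00]
Function eval 10 at point 10 has f = -0.325941420427775 at x = [ 0.07158739 14.41254181 -6.5305591 ]
"""

# Collect the eval numbers of the lines that matched
matched = [int(m[0]) for m in re.findall(pattern, sample)]
print(matched)  # eval 9 is absent because of the e-08 / e+01 exponents
```

The double spaces (eval 1 and 4) and the space before the closing bracket (eval 10) are handled; only the exponent line is missed.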

0 votes

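The part of your pattern that matches x has a few problems. First, you allow only a single space between values, while some lines have two or more. Second, you do not allow exponents in the values (as in eval 9). Third, some lines (eval 10, for example) have a space before the closing bracket. Changing that part of the regex to the following addresses all three:

x = \[ ([-+\d.e]+) +([-+\d.e]+) +([-+\d.e]+) *\]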

For the first part of your text, re.findall now returns:
[
 ('1', '-0.258940971315798', '3.', '14.4', '-6.5'),
 ('2', '-0.2259040076585', '3.5', '14.4', '-6.5'),
 ('3', '-0.258912721121113', '3.', '15.9', '-6.5'),
 ('4', '-0.259008279995406', '3.', '14.4', '-5.8'),
 ('5', '-0.305466875532906', '2.5', '14.4', '-6.5'),
 ('6', '-0.258922045649132', '3.', '12.9', '-6.5'),
 ('7', '-0.258959151282265', '3.', '14.4', '-7.2'),
 ('8', '-0.34817783367277', '2.00000006', '14.39986872', '-6.49967721'),
 ('9', '-0.562108249792704', '5.72367526e-08', '1.43999223e+01', '-6.49980500e+00')
]
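Combining the points from both answers, one possible approach is to match each number with an explicit float sub-pattern and to use `\s+` / `\s*` for the spacing, while also collecting any "Function eval" lines that fail to match so that gaps are visible instead of silent. This is a sketch under the assumption that the log always has the format shown in the question; `FLOAT`, `PATTERN`, `parse_log` and `sample` are hypothetical names introduced here for illustration.

```python
import re

# One float, optionally in scientific notation: 3.  -6.5  5.72367526e-08  1.43999223e+01
FLOAT = r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?'

PATTERN = re.compile(
    r'Function eval (\d+) at point \d+ has f = (' + FLOAT + r') at x = '
    r'\[\s*(' + FLOAT + r')\s+(' + FLOAT + r')\s+(' + FLOAT + r')\s*\]'
)

def parse_log(log_text):
    """Return (matches, missed): parsed tuples, and 'Function eval' lines that did not match."""
    matches, missed = [], []
    for line in log_text.splitlines():
        if not line.startswith('Function eval'):
            continue  # skip lines such as "New rho = ..." or "Beginning main loop"
        m = PATTERN.search(line)
        if m:
            matches.append(m.groups())
        else:
            missed.append(line)
    return matches, missed

# A few lines copied from the log in the question (illustrative subset)
sample = """\
Function eval 1 at point 1 has f = -0.258940971315798 at x = [ 3.  14.4 -6.5]
Initialising (coordinate directions)
Function eval 9 at point 9 has f = -0.562108249792704 at x = [ 5.72367526e-08  1.43999223e+01 -6.49980500e+00]
New rho = 0.01 after 9 function evaluations
Function eval 10 at point 10 has f = -0.325941420427775 at x = [ 0.07158739 14.41254181 -6.5305591 ]
Function eval 11 at point 11 has f = -0.562106040838141 at x = [ 0.         14.35760572 -6.39667873]
"""

matches, missed = parse_log(sample)
eval_numbers = [int(m[0]) for m in matches]
f_values = [-float(m[1]) for m in matches]  # sign flipped, as in the question's code
print(eval_numbers)
print(missed)  # any unmatched "Function eval" lines show up here
```

Processing the text line by line keeps `\s` from matching across a newline, and the `missed` list makes it easy to spot a format the pattern does not yet cover.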