Faster SciPy optimization

Question

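I need to find the minimum of a cost function with several thousand variables. The cost function is just a least-squares calculation and can be computed quickly and easily with NumPy vectorization. Even so, the optimization still takes a long time. My guess is that the slow runtime comes from SciPy's minimizer rather than from my cost function. How can I change the parameters of SciPy's minimizer to speed it up?

Example code: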

import numpy as np
from scipy.optimize import minimize

# random data
x = np.random.randn(100, 75)

# initial weights guess
startingWeights = np.ones(shape=(100, 75))

# random y vector 
y = np.random.randn(100)


def costFunction(weights):
   # reshapes flattened weights into 2d matrix
   weights = np.reshape(weights, (100, 75))

   # weighted row-wise sum
   weighted = np.sum(x * weights, axis=1)

   # squared residuals
   residualsSquared = (y - weighted) ** 2

   return np.sum(residualsSquared)


result = minimize(costFunction, startingWeights.flatten())

Answer 1

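As already pointed out in the comments, it is strongly recommended to supply an exact gradient of the objective for a large problem with N = 100*75 = 7500 variables. If no gradient is supplied, it is approximated by finite differences with the approx_derivative function. However, finite differences are error-prone and computationally expensive, because each evaluation of the gradient requires 2*N evaluations of the objective function (without caching).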
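
The evaluation count is easy to check on a smaller instance. The sketch below is an illustration, not part of the original measurements: it wraps a least-squares objective of the same form in a call counter (counted_cost and calls are names made up here) and runs SciPy's public approx_fprime. That function uses forward differences, so it needs about N objective evaluations per gradient; a central-difference scheme needs about twice as many, which matches the 2*N figure quoted above.

```python
import numpy as np
from scipy.optimize import approx_fprime

rng = np.random.default_rng(0)
n_rows, n_cols = 20, 15  # smaller than the 100 x 75 problem, same structure
x = rng.standard_normal((n_rows, n_cols))
y = rng.standard_normal(n_rows)
w0 = np.ones(n_rows * n_cols)

calls = {"n": 0}  # hypothetical call counter, for illustration only


def counted_cost(weights):
    # same least-squares objective as in the question, plus a call counter
    calls["n"] += 1
    w = weights.reshape(n_rows, n_cols)
    return np.sum((y - np.sum(x * w, axis=1)) ** 2)


# forward-difference gradient with an explicit step size
grad_fd = approx_fprime(w0, counted_cost, 1.49e-8)
print(f"{w0.size} variables -> {calls['n']} objective evaluations for one gradient")
```

The count grows linearly with the number of variables, so every gradient of the full 7500-variable problem costs thousands of objective evaluations.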

This is easy to illustrate by timing the objective and the approximated gradient:

In [7]: %timeit costFunction(startingWeights.flatten())
23.5 µs ± 2.03 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [8]: from scipy.optimize._numdiff import approx_derivative

In [9]: %timeit approx_derivative(costFunction, startingWeights.flatten())
633 ms ± 33.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

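So on my machine every gradient evaluation takes more than half a second! A more efficient way to evaluate the gradient is algorithmic differentiation. With the JAX library this is quite simple: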

import jax.numpy as jnp
from jax import jit, value_and_grad

def costFunction(weights):
   # reshapes flattened weights into 2d matrix
   weights = jnp.reshape(weights, (100, 75))

   # weighted row-wise sum
   weighted = jnp.sum(x * weights, axis=1)

   # squared residuals
   residualsSquared = (y - weighted) ** 2

   return jnp.sum(residualsSquared)

# create the derivatives
obj_and_grad = jit(value_and_grad(costFunction))

Here, value_and_grad creates a function that evaluates both the objective and its gradient and returns both, i.e. obj_value, grad_values = obj_and_grad(x0). So let's time this function:

In [12]: %timeit obj_and_grad(startingWeights.flatten())
132 µs ± 6.62 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

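So we now evaluate the objective and the gradient together nearly 5000 times faster than the finite-difference gradient alone (132 µs versus 633 ms). Finally, we can tell minimize that our objective function returns both the objective and the gradient by setting jac=True. So

minimize(obj_and_grad, x0=startingWeights.flatten(), jac=True)

should speed up the optimization significantly.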
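
For this particular objective the gradient can also be written by hand, which avoids the extra dependency. With r_i = y_i - sum_j x_ij * w_ij, the partial derivative of the cost with respect to w_ij is -2 * r_i * x_ij. The following is a minimal NumPy sketch based on that derivation. The name cost_and_grad is chosen here, and L-BFGS-B is only one reasonable method for a problem of this size, not something the answer prescribes.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_rows, n_cols = 100, 75
x = rng.standard_normal((n_rows, n_cols))
y = rng.standard_normal(n_rows)
w0 = np.ones(n_rows * n_cols)


def cost_and_grad(weights):
    # returns (objective, gradient), the form that jac=True expects
    w = weights.reshape(n_rows, n_cols)
    residuals = y - np.sum(x * w, axis=1)
    cost = np.sum(residuals ** 2)
    # d(cost)/d(w_ij) = -2 * r_i * x_ij
    grad = -2.0 * residuals[:, None] * x
    return cost, grad.ravel()


initial_cost, initial_grad = cost_and_grad(w0)
result = minimize(cost_and_grad, w0, jac=True, method="L-BFGS-B")
print(f"cost: {initial_cost:.3f} -> {result.fun:.3e} in {result.nit} iterations")
```

Because the gradient costs about as much as one objective evaluation, each iteration is cheap regardless of the number of variables.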

PS: You can also try the state-of-the-art Ipopt solver, which is available through the cyipopt package. It also provides a SciPy-like interface similar to scipy.optimize.minimize.

Answer 2

"The cost function is just a least-squares calculation"

Don't use minimize; use least_squares if that is exactly what you need. Example:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import least_squares

x = np.array([0.23, 0.66, 0.93, 1.25, 1.75, 2.03, 2.24, 2.57, 2.87, 2.98])
y = np.array([0.25, -0.27, -1.12, -0.45, 0.28, 0.13, -0.27, 0.26, 0.58, 1.03])
plt.plot(x, y, 'bv')
plt.show()

def model(theta, t):
    # a*exp(t) + b*ln(t) + c*sin(t) + d*cos(t)
    return (theta[0] * np.exp(t) + theta[1] * np.log(t)
            + theta[2] * np.sin(t) + theta[3] * np.cos(t))

def residual(theta, t, y):
    # model prediction minus observed data
    return (model(theta, t) - y).flatten()

theta = np.array([0.5, 0.5, 0.5, 0.5])

# an analytic Jacobian could be passed via jac=...; here it is estimated numerically
res = least_squares(residual, theta, args=(x, y), verbose=1)
print(res)

# start slightly above 0 because log(0) is undefined
x_test = np.linspace(0.1, 3)
y_test = model(res.x, x_test)
plt.plot(x, y, 'bo', x_test, y_test, 'k-')
plt.show()
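
The jac argument of least_squares accepts a callable that receives the same arguments as the residual function. Since the model above is linear in theta, its Jacobian is simply the matrix of basis functions. The sketch below is an illustrative variant, not the answerer's code: the helper names basis and jac are assumptions, and the comparison with np.linalg.lstsq is added only as a cross-check.

```python
import numpy as np
from scipy.optimize import least_squares

x = np.array([0.23, 0.66, 0.93, 1.25, 1.75, 2.03, 2.24, 2.57, 2.87, 2.98])
y = np.array([0.25, -0.27, -1.12, -0.45, 0.28, 0.13, -0.27, 0.26, 0.58, 1.03])


def basis(t):
    # columns: exp(t), ln(t), sin(t), cos(t)
    return np.column_stack([np.exp(t), np.log(t), np.sin(t), np.cos(t)])


def residual(theta, t, y):
    # model prediction minus observed data
    return basis(t) @ theta - y


def jac(theta, t, y):
    # the residual is linear in theta, so the Jacobian is the constant basis matrix
    return basis(t)


theta0 = np.full(4, 0.5)
res = least_squares(residual, theta0, jac=jac, args=(x, y))

# the problem is linear, so a direct solve serves as a cross-check
theta_direct, *_ = np.linalg.lstsq(basis(x), y, rcond=None)
print(res.x)
print(theta_direct)
```

For a model that is nonlinear in its parameters the Jacobian would depend on theta, but the calling convention stays the same.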

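Note that there is no place in this approach for a huge x0=startingWeights (or for the minimize method), because x0 is only the initial guess for the coefficients (theta) that are fitted to your data. The optimization is then carried out locally over your data points.

Again, I doubt that your points really contain that many (75 * 100) linearly independent dimensions. Is the formalization of your problem correct?

If it is, see LS-Optimization, which describes some options (including Jacobian methods, preconditioning, and conjugate gradient for sparse matrices [which is what multi-dimensional data becomes after 2-norm LS if your data is spatial], used to apply rotations for a better choice of step direction during minimization, etc.).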
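
One way to make the doubt about the problem formulation concrete, offered as an observation rather than as what this answer proposes: in the question's cost function each row i has its own 75 weights and contributes a single residual y_i - x_i . w_i. The rows therefore decouple, and each one is an underdetermined linear equation. Assuming no row of x is entirely zero, the weights closest to the starting guess that bring every residual to zero can be written down directly, with no iterative optimizer. The variable names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 75))
y = rng.standard_normal(100)
w_start = np.ones((100, 75))

# residual of every row at the starting weights
r_start = y - np.sum(x * w_start, axis=1)

# project each row of the starting weights onto the hyperplane x_i . w_i = y_i
# (assumes every row of x has a nonzero norm)
row_norm_sq = np.sum(x * x, axis=1)
w_opt = w_start + x * (r_start / row_norm_sq)[:, None]

cost = np.sum((y - np.sum(x * w_opt, axis=1)) ** 2)
print(f"cost at the closed-form weights: {cost:.3e}")
```

If a zero-cost solution like this is not what is wanted, that suggests the model needs fewer free weights or an additional regularization term.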
