Newton's method without Python's built-in solvers: computing the gradient and Hessian


I am trying to write a basic Newton's method without using a pre-built solver.

Here is the function:

import numpy as np
import sympy as sym
from sympy import Matrix

## definition of variables
x_1, x_2 = sym.symbols("x_1 x_2")

a_T=np.array([[0.3],[0.6],[0.2]])
b_T=np.array([5,26,3])
c_T=np.array([40,1,10])

u= x_1-0.8
v= x_2-(a_T[0]+a_T[1]*u**2*(1-u)**(1/2)-a_T[2]*u)
alpha= -b_T[0]+(b_T[1]*u**2)*(1+u)**(1/2)+(b_T[2])*u
beta= c_T[0]*v**2*(1-c_T[1]*v)/(1+c_T[2]*u**2)

## function
f = alpha**(-beta)

I computed the gradient and Hessian and defined the other parameters:

## gradient
gradient_cal = sym.Matrix(1,2,sym.derive_by_array(f, (x_1, x_2)))
## hessian
hessian_cal = sym.Matrix(2, 2, sym.derive_by_array(gradient_cal, (x_1, x_2)))
# initial guess
x_A= Matrix([[1],[0.5]])
xk = x_A
#tolerance
epsilon= 1e-10
#maximum iterations
max_iter=100

And the function itself:

def newton(gradient_cal,hessian_cal,xk,epsilon,max_iter):
    for k in range(0,max_iter):
        fxk = gradient_cal.evalf(subs={x_1:xk[0], x_2:xk[1]})  
        if fxk.norm() < epsilon:
            print('Found solution after',k,'iterations.')
            return xk
        Dfxk = hessian_cal.evalf(subs={x_1: xk[0], x_2: xk[1]})
        if Dfxk == 0:
            print('Zero derivative. No solution found.')
            return None
        A=hessian_cal.evalf(subs={x_1: xk[0], x_2: xk[1]})
        B=gradient_cal.evalf(subs={x_1: xk[0], x_2: xk[1]})
        pk= (A.inv().dot(B))
        xk = np.subtract(xk, pk)
    print('Exceeded maximum iterations. No solution found.')
    return None

approx = newton(gradient_cal,hessian_cal,x_A,epsilon,max_iter)
print(approx)

This raises the following error:

TypeError: Shape should contain integers only.

I checked and found that the Hessian contains `I` (imaginary unit) values, so I am not sure the gradient and the Hessian are being computed correctly.

Does anyone have a better way to compute the gradient and Hessian of such a complicated function?

python-2.7 numpy sympy newtons-method
1 Answer

The Jacobian already comes built into SymPy:

>>> from sympy import Matrix
>>> from sympy.abc import x, y
>>> f = x/y + x*y**2
>>> Matrix([f]).jacobian((x,y))
Matrix([[y**2 + 1/y, 2*x*y - x/y**2]])
>>> _.jacobian((x,y))  # Hessian
Matrix([
[           0,   2*y - 1/y**2],
[2*y - 1/y**2, 2*x + 2*x/y**3]])

So you could try:

import sympy as sym
from sympy import Matrix

x_1, x_2 = sym.symbols("x_1 x_2")
xx  = x_1, x_2
a_T=[0.3,0.6,0.2]
b_T=[5,26,3]
c_T=[40,1,10]

u= x_1-0.8
v= x_2-(a_T[0]+a_T[1]*u**2*(1-u)**(1/2)-a_T[2]*u)
alpha= -b_T[0]+(b_T[1]*u**2)*(1+u)**(1/2)+(b_T[2])*u
beta= c_T[0]*v**2*(1-c_T[1]*v)/(1+c_T[2]*u**2)

## function
f = alpha**(-beta)

jac = Matrix([f]).jacobian(xx)
hes = jac.jacobian(xx)
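From there, the Newton loop itself only needs the symbolic `jac`/`hes` pair. Below is a minimal sketch of such a loop; it assumes a simple convex quadratic as the test function (not the asker's `f`, whose fractional powers of a negative base are what produce the complex `I` values), and it solves the linear system `H*pk = g` with `LUsolve` instead of inverting the Hessian:

```python
import sympy as sym
from sympy import Matrix

x_1, x_2 = sym.symbols("x_1 x_2")
xx = (x_1, x_2)

# Simple convex test function with minimizer (1, 2); a stand-in assumption,
# since the original f goes complex for negative alpha.
f = (x_1 - 1)**2 + (x_2 - 2)**2

jac = Matrix([f]).jacobian(xx)   # 1x2 row gradient
hes = jac.jacobian(xx)           # 2x2 Hessian

xk = Matrix([0.0, 0.0])          # initial guess, as a 2x1 column
for k in range(100):
    g = jac.T.evalf(subs={x_1: xk[0], x_2: xk[1]})  # gradient at xk, 2x1
    if g.norm() < 1e-10:
        break
    H = hes.evalf(subs={x_1: xk[0], x_2: xk[1]})
    pk = H.LUsolve(g)            # solve H*pk = g rather than forming H.inv()
    xk = xk - pk

print(xk)  # converges to the minimizer (1, 2)
```

For a well-behaved quadratic, one Newton step lands exactly on the minimizer; for the original `f` the same loop applies once the complex-valued powers are dealt with (e.g. by restricting the domain).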