Neural network - SciPy minimize ValueError: tnc: invalid gradient vector

Question

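I'm new to ML and have been trying to implement a neural network in Python, but when I call SciPy's minimize function with the TNC method, I get the following error.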

ValueError: tnc: invalid gradient vector.

I looked it up and found the following in the source code:


    arr_grad = (PyArrayObject *)PyArray_FROM_OTF((PyObject *)py_grad, NPY_DOUBLE, NPY_ARRAY_IN_ARRAY);
    if (arr_grad == NULL)
    {
      PyErr_SetString(PyExc_ValueError, "tnc: invalid gradient vector.");
      goto failure;

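Edit: Here is my implementation of backpropagation and the cost function, written as methods of the Network class I created. I'm currently using a [400 25 10] architecture similar to the one used in Andrew Ng's ML course on Coursera.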

    def cost_function(self, theta, x, y):
        u = self.num_layers
        m = len(x)
        Reg = 0                                     # regularization term init and calculation
        for i in range(u - 1):
            k = np.power(theta[i], 2)
            Reg = np.sum(Reg + np.sum(k))
        Reg = lmbda / (2 * m) * Reg                 
        h = self.forwardprop(x)[-1]                 # Getting the activation of the last layer
        J = (-1 / m) * np.sum(np.multiply(y, np.log(h)) + np.multiply((1 - y), np.log(1 - h))) + Reg     # Cost Func
        return J

    def backprop(self, theta, x, y):
        m = len(x)                                              # number of training examples
        theta = np.asmatrix(theta)                              # treat the flat parameter vector as a (1, n) matrix
        theta = self.rollPara(theta)                            # Roll weights into Matrices, Original shape (1, 10285), after rolling [(25, 401), (26, 10)]
        tot_delta = list(range((self.num_layers-1)))            # accumulated error init
        delta =list(range(self.num_layers-1))                   # error from each example init
        for i in range(m):                                      # loop for calculating error
            a = self.forwardprop(x[i:i+1, :])                   # get activation of each layer for ith example
            delta[-1] = a[-1] - y[i]                            # error of output layer of ith example
            for j in range(1, self.num_layers-1):               # loop to calculate error of each layer for ith example
                theta_ = theta[-1-j+1][:, 1:]                   # weights of jth layer (from back to front)('-1' represents last element)(1. weights index 2.exclude bias units)
                act = (a[:-1])[-1-j+1][:, 1:]                   # activation of current layer (1. exclude output layer 2. activation index 3. exclude bias units)
                delta_prv = delta[-1-j+1]                             # error of previous layer
                delta[-1-j] = np.multiply(delta_prv@theta_, act)      # error of current layer
            delta = delta[::-1]                                       # reverse the order of elements since BP starts from back to front
            for j in range(self.num_layers-1):                                                       # loop to add ith example error to accumulated error
                tot_delta[j] = tot_delta[j] + np.transpose(delta[j])@a[self.num_layers-2-j]          # add jth layer error from ith example to jth layer accumulated error

        ThetaGrad = np.add((1/m)*np.asarray(tot_delta[::-1]), (lmbda/m)*np.asarray(theta))  # calculate gradient
        grad = self.unrollPara(ThetaGrad)
        return grad

    maxiter = 500
    options = {'maxiter': maxiter}
    initTheta = N.unrollPara(N.weights)         # flattening into vector
    res = op.minimize(fun=N.cost_function, x0=initTheta, jac=N.backprop, method='tnc', args=(x, Y), options=options)   # x, Y are training set that are already initialized

This is the SciPy source code.
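
The C excerpt quoted earlier raises the error when the object returned by `jac` cannot be converted to a single array of doubles. Below is a rough Python analogue of that check, offered as a sketch and not as SciPy's actual code path. The helper name `to_gradient_vector` and the zero-filled `layer_grads` are assumptions for illustration; only the layer shapes are taken from the [400 25 10] network described in the question.

```python
import numpy as np

def to_gradient_vector(grad):
    """Hypothetical stand-in for the double-array conversion in the C excerpt."""
    try:
        return np.ascontiguousarray(grad, dtype=np.float64)
    except (ValueError, TypeError):
        raise ValueError("tnc: invalid gradient vector.") from None

# Assumed per-layer gradients for a [400 25 10] network (values are placeholders).
layer_grads = [np.zeros((25, 401)), np.zeros((10, 26))]

# A list of differently shaped matrices cannot become one array of doubles.
try:
    to_gradient_vector(layer_grads)
    ragged_ok = True
except ValueError:
    ragged_ok = False

# Flattening each layer and concatenating gives a single 1-D float64 vector.
flat = np.concatenate([g.ravel() for g in layer_grads])
flat_ok = to_gradient_vector(flat).shape == (10285,)
```

If the gradient in `backprop` is still a collection of per-layer matrices when it is returned, this kind of conversion failure is one plausible source of the error.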

Thanks in advance.

Answer 1

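After reading the code carefully, I realized that the gradient vector has to be a list rather than a NumPy array. I'm not sure whether my implementation is correct, but the error has gone away.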
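
For comparison, here is a minimal sketch in which `jac` returns a flat float64 NumPy array and the TNC method runs without the error. The quadratic objective and the names `cost`, `grad`, and `target` are illustrative assumptions, not the network from the question. It suggests that what matters is a gradient that converts cleanly to a 1-D vector of doubles with the same length as `x0`, a condition that a plain list of floats also meets.

```python
import numpy as np
from scipy import optimize as op

# Hypothetical minimizer of the toy objective, for illustration only.
target = np.array([1.0, -2.0, 3.0])

def cost(theta, target):
    # Simple convex objective standing in for the network cost.
    return 0.5 * np.sum((theta - target) ** 2)

def grad(theta, target):
    # Return a flat float64 vector with the same length as theta.
    g = theta - target
    return np.asarray(g, dtype=np.float64).ravel()

res = op.minimize(fun=cost, x0=np.zeros(3), jac=grad, method='TNC',
                  args=(target,))
```

On this toy problem `res.x` should end up close to `target`. In a network with several weight matrices, the same idea would mean concatenating the raveled per-layer gradients before returning them.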
