Gradient descent is not reducing the loss in my logistic regression


I am getting started with machine learning and am trying to implement logistic regression from scratch on the Kaggle Titanic dataset. I wrote the code while following an online course, but I cannot get gradient descent to work. After computing the gradients of W and b and applying the updates w = w - alpha*wgrad and b = b - alpha*bgrad in a function named logisitic_regression, the loss does not decrease for some reason, and the parameters w and b barely change. I can't find the error in my code; can someone help? The functions are below. Let me know if you need more information.

import numpy as np


# Implement the sigmoid activation function
def sigmoid(z):
    '''
    Input:
        z: scalar or array of dimension n
    Output:
        sgmd: scalar or array of dimension n
    '''
    sgmd = 1 / (1 + np.exp(-z))
    return sgmd



# Define the prediction function
def yPredLogistic(X, w, b=0):
    '''
    Input:
        X: nxd matrix
        w: d-dimensional vector
        b: scalar (optional; treated as 0 if not passed)
    Output:
        prob: n-dimensional vector of probabilities P(y = +1 | x)
    '''
    prob = sigmoid(X @ w + b)
    return prob


# Define the negative log-likelihood as the log loss
def log_loss(X, y, w, b=0):
    '''
    Input:
        X: nxd matrix
        y: n-dimensional vector with labels (+1 or -1)
        w: d-dimensional vector
        b: scalar bias term (optional, defaults to 0)
    Output:
        nll: a scalar
    '''
    # For +1/-1 labels: nll = -sum_i log(sigmoid(y_i * (w . x_i + b)))
    nll = -np.sum(np.log(sigmoid(y * (X @ w + b))))
    return nll
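
A quick sanity check for this formula: with w = 0 and b = 0 every prediction is sigmoid(0) = 0.5, so the loss must equal n*log(2) no matter what the labels are.

# Sanity check: at w = 0, b = 0 the loss is exactly n * log(2).
Xs = np.ones((4, 2))
ys = np.array([1.0, -1.0, 1.0, -1.0])
print(log_loss(Xs, ys, np.zeros(2)), 4 * np.log(2))   # both ~2.7726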

# Define the gradient of the log loss
def gradient(X, y, w, b):
    '''
    Input:
        X: nxd matrix
        y: n-dimensional vector with labels +1 or -1
        w: d-dimensional vector
        b: scalar bias term
    Output:
        wgrad: d-dimensional vector with the gradient w.r.t. w
        bgrad: a scalar with the gradient w.r.t. b
    '''
    n, d = X.shape
    # Per-sample factor for +1/-1 labels: -y_i * sigmoid(-y_i * (w . x_i + b))
    factor = -y * sigmoid(-y * (X @ w + b))   # shape (n,)
    wgrad = factor @ X                        # shape (d,)
    bgrad = np.sum(factor)                    # scalar
    return wgrad, bgrad
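
If the gradient itself is in doubt, a central finite-difference comparison on small random data will confirm it. This is a minimal check assuming the functions above; the variables here are made up for the test:

# Gradient check on random data: analytic and numeric values should match.
rng = np.random.default_rng(0)
Xc = rng.normal(size=(20, 3))
yc = rng.choice([-1.0, 1.0], size=20)   # +1/-1 labels, as the code assumes
wc, bc, eps = rng.normal(size=3), 0.5, 1e-6
wg, bg = gradient(Xc, yc, wc, bc)
for j in range(3):
    e = np.zeros(3)
    e[j] = eps
    num = (log_loss(Xc, yc, wc + e, bc) - log_loss(Xc, yc, wc - e, bc)) / (2 * eps)
    print(j, wg[j], num)   # the two columns agree to ~6 decimal places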


# Implement the weight update of gradient descent
def logisitic_regression(X, y, max_iter, alpha):
    '''
    Input:
        X: nxd matrix
        y: n-dimensional vector with labels +1 or -1
        max_iter: maximum number of iterations
        alpha: learning rate (step size)
    Output:
        w: d-dimensional vector
        b: scalar bias term
        losses: list of the loss after each update
    '''
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    losses = []
    for step in range(max_iter):
        # Get the w and b gradients
        wgrad, bgrad = gradient(X, y, w, b)
        # Update w and b
        w = w - alpha * wgrad
        b = b - alpha * bgrad
        # Record the loss after this update
        losses.append(log_loss(X, y, w, b))
    return w, b, losses
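
Running the loop on small synthetic data with +1/-1 labels (toy data for illustration, not the Titanic set) shows the update rule itself does drive the loss down:

# Toy run: with +1/-1 labels the loss decreases as expected.
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(100, 2))
y_toy = np.sign(X_toy[:, 0] + 0.3 * rng.normal(size=100))
w_fit, b_fit, losses = logisitic_regression(X_toy, y_toy, max_iter=200, alpha=0.01)
print(losses[0], losses[-1])   # the last loss is much smaller than the first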
Tags: python, machine-learning, logistic-regression, kaggle
1 Answer

I think the problem is that your code implements logistic regression for +1/-1 output labels, while the Titanic dataset encodes the target as 0/1, not +/-1. With y = 0, the per-sample gradient term -y * sigmoid(-y * (w . x + b)) is identically zero, so the negative class never moves the parameters, and the log-loss formula you are using is likewise not the one for 0/1 labels. You either have to rederive the derivatives and the log loss for 0/1 labels, or remap the labels to +/-1 before training.
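
The smaller of the two changes is to remap the labels and keep your code as-is. A sketch, where y01 is an assumed name for the 0/1 Survived column loaded as a NumPy array:

# Assumption: y01 holds the 0/1 Survived labels; map 0 -> -1 and 1 -> +1
# so the +/-1 formulas in the question apply unchanged.
y_pm = 2 * y01 - 1
w_fit, b_fit, losses = logisitic_regression(X, y_pm, max_iter=1000, alpha=0.001)

# Equivalently, for 0/1 labels the correct pieces would be
#   p = sigmoid(X @ w + b)
#   loss = -np.sum(y01 * np.log(p) + (1 - y01) * np.log(1 - p))
#   wgrad = X.T @ (p - y01);  bgrad = np.sum(p - y01)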
