Runtime error when using fmin_bfgs to optimise theta in logistic regression

#Get libraries
import  scipy.optimize as opt
import numpy as np
import pandas
import matplotlib.pyplot as plt


def plotData():
    plt.scatter(X[y==1,0],X[y==1,1], marker='+', c='black',label="Admitted")
    plt.scatter(X[y==0,0],X[y==0,1], marker='.', c='y', label="Not Admitted")
    plt.xlabel("Exam 1 Score")
    plt.ylabel("Exam 2 Score")
    plt.legend(['Admitted', 'Not Admitted'])

def sigmoid(z):
    return 1/(1 + np.exp(-1. * z))

def costFunction(theta, X, y):
    y = np.reshape(y, (len(y), 1))
    cost = (-1./m) * (y.T@np.log(sigmoid(X@theta)) + (1 - y).T@np.log(1 - sigmoid(X@theta)))
    #grad = (1/m)*(X.T@(sigmoid(X@theta) - y))
    return(cost[0])

def costFunctionDer(theta, X, y):
    grad = (1./m)*(X.T@(sigmoid(X@theta) - y))
    return np.ndarray.flatten(grad)

def predict(theta, X):
    return (X@theta >= 0).astype('int')

#Load Data
data = pandas.read_table('ex2data1.txt', sep=',', names=['Exam 1 Score','Exam 2 Score','Admittance'] )
X = data.iloc[:,0:2].to_numpy()
y = data.iloc[:,2].to_numpy()

#Plot Data
plotData()

#Compute Cost and Gradient

#Add intercept
X = np.insert(X, 0, 1, 1)

m, n = X.shape

#Initialise thetas
theta_i = np.zeros((n,1))


cost = costFunction(theta_i, X, y)
grad = costFunctionDer(theta_i, X, y)

#Optimise Algorithm for theta
result = opt.fmin_bfgs(costFunction, theta_i, costFunctionDer,args=(X,y), full_output = True, maxiter=400, retall=True)

theta, cost_min = result[0:2]

#Make a decision boundary: take the min and max of x1 (exam 1 score),
# then solve theta[0] + x1*theta[1] + x2*theta[2] = 0 for x2 (exam 2 score)
boundary_xs = np.array([np.min(X[:,1]), np.max(X[:,1])])
boundary_ys = (-1./theta[2])*(theta[0] + theta[1]*boundary_xs)
plt.plot(boundary_xs,boundary_ys,'b-',label='Decision Boundary')
plt.legend()

#Predict
p = predict(theta, X)

print("Train Accuracy {}".format(100*np.sum(p==y)/len(y)))

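Recently I have been learning machine learning through Andrew Ng's online course. I am working on Exercise 2, which asks us to implement logistic regression. Although the course is taught in Octave/MATLAB, I am trying to learn it in Python, the industry standard.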

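To optimise the value of theta, I tried the function fmin_bfgs, and it gives me a runtime error. I have already used the minimize function, and with it the code works perfectly. Can anyone help me find the problem? I am new to ML, so I apologise for any obvious flaws.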

The error is as follows:

RuntimeWarning: divide by zero encountered in log
  cost = (-1./m) * (y.T@np.log(sigmoid(X@theta)) + (1 - y).T@np.log(1 - sigmoid(X@theta)))
RuntimeWarning: invalid value encountered in matmul
  cost = (-1./m) * (y.T@np.log(sigmoid(X@theta)) + (1 - y).T@np.log(1 - sigmoid(X@theta)))
RuntimeWarning: divide by zero encountered in log
  cost = (-1./m) * (y.T@np.log(sigmoid(X@theta)) + (1 - y).T@np.log(1 - sigmoid(X@theta)))
RuntimeWarning: invalid value encountered in matmul
  cost = (-1./m) * (y.T@np.log(sigmoid(X@theta)) + (1 - y).T@np.log(1 - sigmoid(X@theta)))
RuntimeWarning: divide by zero encountered in log
  cost = (-1./m) * (y.T@np.log(sigmoid(X@theta)) + (1 - y).T@np.log(1 - sigmoid(X@theta)))
RuntimeWarning: invalid value encountered in matmul
  cost = (-1./m) * (y.T@np.log(sigmoid(X@theta)) + (1 - y).T@np.log(1 - sigmoid(X@theta)))
python python-3.x numpy scipy-optimize
1 Answer

The first RuntimeWarning gave me the clue I needed. To fix the problem, I manually set a lower and an upper bound in the sigmoid function.
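A minimal sketch of what such bounds might look like. The answer does not say which values were used, so `EPS`, the clipping range of ±500, and the helper name `cost_function` are all assumptions made for illustration. The idea is that when `X@theta` becomes large during the line search, `sigmoid` rounds to exactly 0 or 1, `np.log` then returns `-inf` (the "divide by zero" warning), and multiplying `-inf` by a zero entry of `y` or `1 - y` yields `nan` (the "invalid value" warning). Keeping the sigmoid output strictly inside (0, 1) avoids both.

```python
import numpy as np

# Hypothetical bound: the answer does not state the value that was used.
EPS = 1e-10

def sigmoid(z):
    """Logistic function with its output kept strictly inside (0, 1)."""
    # Clip the input first so np.exp cannot overflow (range is an assumption).
    s = 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))
    # Clip the output so log(s) and log(1 - s) are always finite.
    return np.clip(s, EPS, 1.0 - EPS)

def cost_function(theta, X, y):
    """Mean cross-entropy cost for logistic regression with 1-D theta and y."""
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# Toy data chosen so that X @ theta = [1000, -1000], which saturates
# an unbounded sigmoid to exactly 1 and 0.
X = np.array([[1.0, 1000.0], [1.0, -1000.0]])
y = np.array([1.0, 0.0])
theta = np.array([0.0, 1.0])

cost = cost_function(theta, X, y)
print(np.isfinite(cost))
```

With bounds like these the cost stays finite for any theta that fmin_bfgs tries, at the price of a slightly flattened cost surface very close to 0 and 1. `scipy.special.expit` is a ready-made sigmoid that avoids overflow in the exponential, but its output can still round to exactly 0 or 1, so the output clip (or an equivalent guard) would still be needed.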
