Logistic regression on a sparse tf-idf matrix in Python

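I am trying to write logistic regression from scratch and I get the error below. After cleaning and tokenizing the data, I built a sparse tf-idf matrix from the tweet tokens with sklearn's TfidfVectorizer. Can anyone help?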

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-36-98e5051d04b6> in <module>()
      3                   fprime=gradient,args=(x, y.values.flatten()))
      4     return opt_weights[0]
----> 5 parameters = fit(X, y, theta)

3 frames
/usr/local/lib/python3.6/dist-packages/scipy/optimize/tnc.py in func_and_grad(x)
    369     else:
    370         def func_and_grad(x):
--> 371             f = fun(x, *args)
    372             g = jac(x, *args)
    373             return f, g

TypeError: cost_function() missing 1 required positional argument: 'y'

Code:

X = tfidf_train
y = train['Sentiment']
theta = np.zeros((X.shape[1], 1))

def sigmoid(x):
    # Activation function used to map any real value between 0 and 1
    return 1 / (1 + np.exp(-x))

def net_input(theta, x):
    # Computes the weighted sum of inputs
    return np.dot(x, theta)

def probability(theta, x):
    # Returns the probability after passing through sigmoid
    return sigmoid(net_input(theta, x))

def cost_function(self, theta, x, y):
    # Computes the cost function for all the training samples
    m = x.shape[0]
    total_cost = -(1 / m) * np.sum(
        y * np.log(probability(theta, x)) + (1 - y) * np.log(
            1 - probability(theta, x)))
    return total_cost

def gradient(self, theta, x, y):
    # Computes the gradient of the cost function at the point theta
    m = x.shape[0]
    return (1 / m) * np.dot(x.T, sigmoid(net_input(theta, x)) - y)

def fit(x, y, theta):
    opt_weights = fmin_tnc(func=cost_function, x0=theta,
                  fprime=gradient,args=(x, y.values.flatten()))
    return opt_weights[0]
parameters = fit(X, y, theta)

Evaluating tfidf_train.get_shape (without parentheses, so it shows the bound method instead of calling it) gives:

X is <bound method spmatrix.get_shape of <89988x49526 sparse matrix of type '<class 'numpy.float64'>' with 987177 stored elements in Compressed Sparse Row format>>

y has shape (89988,).
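
Since X is a SciPy CSR matrix, the products inside net_input and gradient deserve a second look: as far as SciPy's sparse documentation describes, np.dot is not aware of sparse matrices, and the matrix's own dot method or the @ operator is the safer choice. A small sketch with a made-up 2x3 matrix (the names X_small and theta_small are illustrative, not from the question):

```python
import numpy as np
from scipy import sparse

# Hypothetical toy data standing in for the tf-idf matrix and the weights
X_small = sparse.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                      [0.0, 3.0, 0.0]]))
theta_small = np.array([0.5, -1.0, 0.25])

# Sparse matrix times dense 1-D vector yields a dense 1-D array
z = X_small @ theta_small
z_alt = X_small.dot(theta_small)  # equivalent method form

# Use the .shape attribute (or call get_shape()) to get the dimensions
print(X_small.shape, z.shape)
```

With a 1-D weight vector the result stays 1-D, which lines up with a label array of shape (89988,) without any reshaping.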

python machine-learning logistic-regression sentiment-analysis tf-idf
1 Answer
0 votes
TypeError: cost_function() missing 1 required positional argument: 'y'
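
The message means cost_function received one argument fewer than its signature requires. The traceback shows fmin_tnc calling fun(x, *args), and with args=(x, y.values.flatten()) that is three positional arguments, while cost_function(self, theta, x, y) declares four. These are module-level functions, not methods, so nothing supplies self: the weights bind to self, x binds to theta, y binds to x, and the parameter y is left unfilled. Dropping self from both cost_function and gradient should resolve the TypeError.

Below is a minimal sketch of the corrected functions, not a definitive implementation. It makes several assumptions that are not in the question: the labels are 0/1, the weight vector is 1-D, the sparse products use @ instead of np.dot, the probabilities are clipped to avoid log(0), and the data is a small synthetic stand-in (X_demo, y_demo, and true_theta are hypothetical names).

```python
import numpy as np
from scipy import sparse
from scipy.optimize import fmin_tnc


def sigmoid(z):
    # Map any real value into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))


def probability(theta, x):
    # `x @ theta` works for dense arrays and for SciPy sparse matrices
    return sigmoid(x @ theta)


def cost_function(theta, x, y):  # no `self`: this is a plain function
    m = x.shape[0]
    # Clipping is an addition (not in the question) that guards against log(0)
    p = np.clip(probability(theta, x), 1e-12, 1 - 1e-12)
    return -(1.0 / m) * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))


def gradient(theta, x, y):  # no `self` here either
    m = x.shape[0]
    return (1.0 / m) * (x.T @ (probability(theta, x) - y))


def fit(x, y, theta):
    # fmin_tnc calls func(theta, *args), so args must be exactly (x, y)
    opt_weights = fmin_tnc(func=cost_function, x0=theta,
                           fprime=gradient, args=(x, y), disp=0)
    return opt_weights[0]


# Synthetic stand-in data (hypothetical); with the real data this would be
# tfidf_train and train['Sentiment'].values, assuming 0/1 labels
rng = np.random.default_rng(0)
X_demo = sparse.random(200, 5, density=0.5, format="csr", random_state=0)
true_theta = np.array([2.0, -1.0, 0.5, 0.0, -3.0])
y_demo = (rng.random(200) < sigmoid(X_demo @ true_theta)).astype(float)

theta0 = np.zeros(X_demo.shape[1])  # 1-D starting weights
parameters = fit(X_demo, y_demo, theta0)
print(parameters.shape)
```

If the functions are meant to live in a class instead, keeping self and passing bound methods (for example func=self.cost_function) would be the alternative fix.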