Reshape problem in LogisticRegression code

Question (votes: 0, answers: 1)

I am trying to fit a logistic regression.

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
import numpy as np

df = pd.read_csv("tested.csv")
df.dropna(inplace=True)
x = df["Age"]
y = df["Survived"]
x_re = x.values.reshape(-1, 1)
reg = LogisticRegression()
reg.fit(x_re, y)
predict = reg.predict(x)
plt.title("Propability to survive")
plt.scatter(x, y, "r")
plt.plot(x, predict, "b")
plt.grid()
plt.show()

But it raises an error, apparently because my x data does not have the right dimensions and is not an array. Here is the error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[28], line 13
     11 reg = LogisticRegression()
     12 reg.fit(x_re, y)
---> 13 predict = reg.predict(x)
     14 plt.title("Propability to survive")
     15 plt.scatter(x, y, "r")

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\sklearn\linear_model\_base.py:451, in LinearClassifierMixin.predict(self, X)
    437 """
    438 Predict class labels for samples in X.
    439 
   (...)
    448     Vector containing the class labels for each sample.
    449 """
    450 xp, _ = get_namespace(X)
--> 451 scores = self.decision_function(X)
    452 if len(scores.shape) == 1:
    453     indices = xp.astype(scores > 0, int)

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\sklearn\linear_model\_base.py:432, in LinearClassifierMixin.decision_function(self, X)
    429 check_is_fitted(self)
    430 xp, _ = get_namespace(X)
--> 432 X = self._validate_data(X, accept_sparse="csr", reset=False)
    433 scores = safe_sparse_dot(X, self.coef_.T, dense_output=True) + self.intercept_
    434 return xp.reshape(scores, (-1,)) if scores.shape[1] == 1 else scores

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\sklearn\base.py:605, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, cast_to_ndarray, **check_params)
    603         out = X, y
    604 elif not no_val_X and no_val_y:
--> 605     out = check_array(X, input_name="X", **check_params)
    606 elif no_val_X and not no_val_y:
    607     out = _check_y(y, **check_params)

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\sklearn\utils\validation.py:938, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
    936     # If input is 1D raise error
    937     if array.ndim == 1:
--> 938         raise ValueError(
    939             "Expected 2D array, got 1D array instead:\narray={}.\n"
    940             "Reshape your data either using array.reshape(-1, 1) if "
    941             "your data has a single feature or array.reshape(1, -1) "
    942             "if it contains a single sample.".format(array)
    943         )
    945 if dtype_numeric and hasattr(array.dtype, "kind") and array.dtype.kind in "USV":
    946     raise ValueError(
    947         "dtype='numeric' is not compatible with arrays of bytes/strings."
    948         "Convert your data to numeric values explicitly instead."
    949     )

ValueError: Expected 2D array, got 1D array instead:
array=[23.  47.  48.  22.  41.  30.  45.  45.  60.  24.  28.  25.  36.  13.
 31.  60.  28.5 35.  32.5 55.  67.  27.  76.  43.  18.5 36.  63.   1.
 36.  35.  53.  61.  23.  29.  42.  48.  54.  36.  64.  37.  18.  27.
  6.  47.  33.  42.  50.  22.  39.  64.  48.  45.  41.  27.  46.  26.
 24.  53.  64.  30.  55.  55.  57.  25.  26.  12.  39.  30.  58.  45.
 50.  59.  25.  31.  49.  54.  55.  23.  51.  18.  48.  30.  43.  20.
 50.  37.  39. ].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

I tried array.reshape(-1, 1), values.reshape(-1, 1), and plain reshape(-1, 1), but it always shows the same error. In a YouTube video, the presenter did this without any reshaping. Can someone help me?

python arrays pandas scikit-learn reshape
1 answer
0 votes

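The error comes from

predict = reg.predict(x)

because x has not been reshaped: the model was fitted on the 2D array x_re, but predict receives the original 1D Series.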

Use the reshaped array instead (a pandas Series has no reshape method, so either go through .values or reuse x_re):

predict = reg.predict(x_re)
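
For reference, here is a minimal sketch of the whole fit/predict flow, with the same 2D array passed to both calls. Since tested.csv is not available here, the DataFrame is filled with synthetic data. The generated values and the rule used to create "Survived" are assumptions for illustration only, not the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for tested.csv (assumption: the real file has a
# numeric "Age" column and a 0/1 "Survived" column).
rng = np.random.default_rng(0)
age = rng.uniform(1, 80, size=100)
survived = (age + rng.normal(0, 15, size=100) < 40).astype(int)
df = pd.DataFrame({"Age": age, "Survived": survived})

x = df["Age"]                    # 1D Series, shape (100,)
y = df["Survived"]               # the target is expected to be 1D
x_re = x.values.reshape(-1, 1)   # 2D array, shape (100, 1): one feature column

reg = LogisticRegression()
reg.fit(x_re, y)

# predict must receive the same 2D layout that fit received
predict = reg.predict(x_re)

# predict_proba returns one column per class; column 1 is P(Survived == 1)
proba = reg.predict_proba(x_re)[:, 1]

print(x.shape, x_re.shape, predict.shape, proba.shape)
```

Selecting the column with double brackets, df[["Age"]], should also give a 2D input without any reshape, which may be what the video did. For a plot of survival probability, proba is likely the more useful curve than the 0/1 labels from predict.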