X has 1 features, but LinearRegression is expecting 10 features as input


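I have seen similar questions here, but they all seem to have different causes. I tried reshaping and made sure the input is a 2D array, but I keep getting this error. Here is my code: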

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neighbors import KNeighborsRegressor
from io import StringIO
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn import linear_model
d = pd.read_csv("http://www.stat.wisc.edu/~jgillett/451/data/mtcars.csv")
X = d[['cyl','disp','hp','drat','wt','qsec','vs','am','gear','carb']]
y=d[['mpg']].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
model = linear_model.LinearRegression()
model.fit(X_train, y_train)
pred = model.predict(X_test).reshape(-1,1)
y_test=y_test.reshape(-1,1)
model.score(pred,y_test)

I would appreciate any help!

python pandas jupyter linear-regression feature-engineering
2 Answers

This should work. The fix is in the last line: pass X_test to model.score() instead of pred.

import pandas as pd
from sklearn import linear_model
from sklearn.model_selection import train_test_split

d = pd.read_csv("http://www.stat.wisc.edu/~jgillett/451/data/mtcars.csv")
X = d[['cyl','disp','hp','drat','wt','qsec','vs','am','gear','carb']]
y = d[['mpg']].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
model = linear_model.LinearRegression()
model.fit(X_train, y_train)
pred = model.predict(X_test).reshape(-1, 1)
y_test = y_test.reshape(-1, 1)
# score() takes the test features, not the predictions; it calls predict() internally
model.score(X_test, y_test)


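The arguments to model.score() should be X_test and y_test, not pred and y_test. score() runs predict() on its first argument, so that argument must have the same 10 feature columns the model was fitted on. pred has a single column, which is why the error reports 1 feature where 10 were expected.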
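The mismatch can be reproduced without downloading the CSV. The following is a minimal sketch, assuming synthetic random data with 10 columns as a stand-in for the mtcars features (the arrays and the 16/16 split are illustrative, not the asker's data). It compares the number of features recorded during fit with the shape of pred and captures the resulting error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the 10 mtcars feature columns (assumption, not the real data)
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=32)

model = LinearRegression().fit(X[:16], y[:16])
X_test, y_test = X[16:], y[16:]

pred = model.predict(X_test).reshape(-1, 1)
print(model.n_features_in_)  # number of columns the model saw in fit()
print(pred.shape)            # predictions have one column, not ten

# Passing predictions as X makes score() call predict() on a 1-column array
try:
    model.score(pred, y_test)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Checking `model.n_features_in_` against the second dimension of whatever is passed as X is a quick way to spot this kind of mistake.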

From the docs:

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Test samples. For some estimators this may be a precomputed
            kernel matrix or a list of generic objects instead with shape
            ``(n_samples, n_samples_fitted)``, where ``n_samples_fitted``
            is the number of samples used in the fitting for the estimator.
        y : array-like of shape (n_samples,) or (n_samples, n_outputs)
            True values for `X`.
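If the predictions are already at hand, the same number can be obtained from them directly. For regressors, score() returns the coefficient of determination (R²), so it should agree with sklearn.metrics.r2_score applied to the true values and the predictions. This is a small sketch, again assuming synthetic data in place of the mtcars CSV.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic stand-in for the 10 mtcars feature columns (assumption, not the real data)
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=32)

model = LinearRegression().fit(X[:16], y[:16])
X_test, y_test = X[16:], y[16:]

# Option 1: let the model predict internally from the test features
score_from_model = model.score(X_test, y_test)

# Option 2: compute R^2 from predictions; note the order (y_true, y_pred)
score_from_pred = r2_score(y_test, model.predict(X_test))

print(score_from_model, score_from_pred)
```

Both calls measure the same thing; the second form is useful when pred is needed anyway, for example for plotting.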