I got this error while practicing a simple linear regression model, and I think something is wrong with my dataset.
Here is independent variable X:
Here is the full error:
ValueError: Expected 2D array, got 1D array instead:
array=[ 7. 8.4 10.1 6.5 6.9 7.9 5.8 7.4 9.3 10.3 7.3 8.1].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Here is my code:
import pandas as pd
import matplotlib as pt
#import data set
dataset = pd.read_csv('Sample-data-sets-for-linear-regression1.csv')
x = dataset.iloc[:, 1].values
y = dataset.iloc[:, 2].values
#Splitting the dataset into Training set and Test Set
from sklearn.cross_validation import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size= 0.2, random_state=0)
#Linear Regression
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(x_train,y_train)
y_pred = regressor.predict(x_test)
Thanks
You need to give both the fit and predict methods 2D arrays. Your x_train, y_train and x_test are currently only 1D. What the console suggests should work:
x_train= x_train.reshape(-1, 1)
y_train= y_train.reshape(-1, 1)
x_test = x_test.reshape(-1, 1)
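To see the shape difference concretely, here is a minimal sketch that uses the values from the error message as the feature array. The target values are invented for illustration, since the question does not show them, and the sketch assumes a reasonably recent scikit-learn. As far as I know, only the feature array has to be 2D, so the target is left 1D here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Feature values copied from the error message in the question.
x_train = np.array([7.0, 8.4, 10.1, 6.5, 6.9, 7.9, 5.8, 7.4, 9.3, 10.3, 7.3, 8.1])
# Hypothetical targets, made up for this sketch: y = 2x + 1.
y_train = 2.0 * x_train + 1.0

regressor = LinearRegression()

# A 1D feature array is rejected with a ValueError like the one in the question.
try:
    regressor.fit(x_train, y_train)
    raised = False
except ValueError:
    raised = True

# Reshaping to (n_samples, 1) gives one column, i.e. a single feature.
x_train_2d = x_train.reshape(-1, 1)
regressor.fit(x_train_2d, y_train)  # the target may stay 1D

print(raised)                            # True
print(x_train.shape, x_train_2d.shape)   # (12,) (12, 1)
```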
This uses numpy's reshape. Questions about reshape have been answered before; for example, this one should explain what reshape(-1,1) means: What does -1 mean in numpy reshape?
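As a quick illustration of what -1 does, the sketch below reshapes a small array in a few ways. The array `a` is just a made-up example; the point is that numpy infers the dimension marked -1 from the total number of elements.

```python
import numpy as np

a = np.arange(6)         # shape (6,)

col = a.reshape(-1, 1)   # -1 is inferred as 6: six samples, one feature
row = a.reshape(1, -1)   # -1 is inferred as 6: one sample, six features
grid = a.reshape(-1, 3)  # -1 is inferred as 2: two rows of three

print(col.shape, row.shape, grid.shape)  # (6, 1) (1, 6) (2, 3)
```

So reshape(-1, 1) reads as "one column, and as many rows as needed", which is the layout the error message asks for when the data has a single feature.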
If you look at the scikit-learn documentation for LinearRegression:
fit(X, y, sample_weight=None)
X : numpy array or sparse matrix of shape [n_samples, n_features]
predict(X)
X : {array-like, sparse matrix}, shape = (n_samples, n_features)
As you can see, X has two dimensions, whereas your x_train and x_test clearly have only one. As suggested, add:
x_train = x_train.reshape(-1, 1)
x_test = x_test.reshape(-1, 1)
before fitting the model and predicting with it.
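An alternative that the answers here do not mention is to select the feature column so that pandas returns a 2D array in the first place, which avoids the reshape calls. The sketch below assumes a small in-memory DataFrame with made-up column names and values as a stand-in for the CSV, whose contents are not shown. It also imports train_test_split from sklearn.model_selection, because the sklearn.cross_validation module was removed in scikit-learn 0.20.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the CSV in the question (its contents are not shown).
dataset = pd.DataFrame({
    "id": list(range(10)),
    "x": [7.0, 8.4, 10.1, 6.5, 6.9, 7.9, 5.8, 7.4, 9.3, 10.3],
    "y": [15.0, 17.8, 21.2, 14.0, 14.8, 16.8, 12.6, 15.8, 19.6, 21.6],
})

x = dataset.iloc[:, [1]].values  # list selector keeps two dimensions: (10, 1)
y = dataset.iloc[:, 2].values    # scalar selector gives one dimension: (10,)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

regressor = LinearRegression()
regressor.fit(x_train, y_train)
y_pred = regressor.predict(x_test)

print(x_train.shape, x_test.shape, y_pred.shape)  # (8, 1) (2, 1) (2,)
```

A slice such as dataset.iloc[:, 1:2] should behave the same way as the list selector, since both keep the column axis.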
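Here is your answer.
Use: y_pred = regressor.predict([[x_test]]) when x_test is a single value, since the double brackets make it one sample with one feature. If x_test is a 1D array, use x_test.reshape(-1, 1) instead.
This should help you.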
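For reference, here is a small sketch of the double-bracket form for a single value next to the reshape form for several values. The training data is hypothetical (it follows y = 2x + 1) and is only there to have a fitted model to call.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data following y = 2x + 1.
x_train = np.array([1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
y_train = np.array([3.0, 5.0, 7.0, 9.0])

regressor = LinearRegression().fit(x_train, y_train)

# One sample with one feature: the nested list has shape (1, 1).
single = regressor.predict([[5.0]])

# Several samples: reshape the 1D array into a column first.
batch = regressor.predict(np.array([5.0, 6.0]).reshape(-1, 1))

print(single.shape, batch.shape)  # (1,) (2,)
```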