I am trying to run a multiple linear regression, but I am having trouble plotting the results. When I try to draw a 3D plot, I get this error: ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (4,) and requested shape (34,)
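import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older Matplotlib versions
from sklearn.model_selection import train_test_split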
X_train, X_test, y_train,y_test = train_test_split(X, Y, test_size = 0.2, random_state = 0)
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X.iloc[:, 0], X.iloc[:, 1], Y)
ax.plot(X.iloc[:, 0], X.iloc[:, 1], y_pred, color='red')
ax.set_xlabel('Annual Income (k$)')
ax.set_ylabel('Age')
ax.set_zlabel('Spending Score')
plt.show()
The plotting command should be:
ax.plot(X_test.iloc[:, 0], X_test.iloc[:, 1], y_pred, color='red')
because y_pred contains y values only for the subset X_test, not for the whole input X.
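A quick way to confirm this kind of mismatch is to compare the lengths before plotting. The snippet below is only a sketch: the data is randomly generated as a stand-in (the asker's data set is not shown), and it uses NumPy arrays rather than a DataFrame.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 20 samples, 2 features (not the asker's data set)
rng = np.random.default_rng(1)
X = rng.uniform(size=(20, 2))
Y = X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.2, size=20)

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=0)
y_pred = LinearRegression().fit(X_train, y_train).predict(X_test)

# y_pred has one value per row of X_test, not one per row of X
print(len(X), len(X_test), len(y_pred))  # prints: 20 4 4
```

Any array passed to ax.plot or ax.scatter alongside y_pred needs the same length as X_test.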
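Plotting with connected lines (ax.plot) does not make sense here either, because the input data is probably not ordered in any meaningful way, and even if the input data were sorted, the test set is a random sample of it and would not be.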
I would plot it like this:
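from sklearn.model_selection import train_test_split
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older Matplotlib versions
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd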
# generate some data as an example
np.random.seed(1)
X = pd.DataFrame(np.random.uniform(size=(20, 2)))
Y = X[0] + 2*X[1] + np.random.normal(scale=0.2, size=(20))
X_train, X_test, y_train,y_test = train_test_split(X, Y, test_size = 0.2, random_state = 0)
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X[0], X[1], Y, label='data')
for x0, x1, yt, yp in zip(X_test[0], X_test[1], y_test, y_pred):
    # vertical line from the true value to the predicted value of each test point
    ax.plot([x0, x0], [x1, x1], [yt, yp], color='red')
ax.scatter(X_test[0], X_test[1], y_pred, color='red', marker='s', label='prediction')
ax.set_xlabel('X0')
ax.set_ylabel('X1')
ax.set_zlabel('y')
ax.legend()
plt.show()
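There are other ways to visualize this. You could use np.meshgrid to generate X values on a grid, get the corresponding y values from the predictor, and draw them with plot_wireframe. The training and test data can then be drawn with vertical lines that show their vertical distance from the wireframe. Which approach makes sense depends on the data.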