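I've been trying to run a simple multiple linear regression on some dummy data using sklearn. I was initially passing numpy arrays to sklearn.linear_model.LinearRegression.fit, and kept getting this error:
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1)
I assumed I had transposed an array or something similar, so I pulled up a tutorial that used pandas dataframes and set up my code the same way: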
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
VWC = np.array((0,0.2,0.4,0.6,0.8,1))
Sensor_Voltage = np.array((515,330,275,250,245,240))
X = np.column_stack((VWC,VWC*VWC))
df = pd.DataFrame(X,columns=["VWC","VWC2"])
target = pd.DataFrame(Sensor_Voltage,columns=["Volt"])
model = LinearRegression()
model.fit(df,target["Volt"])
x = np.linspace(0,1,30)
y = model.predict(x[:,np.newaxis])
plt.plot(VWC, Sensor_Voltage)
plt.plot(x,y,dashes=(3,1))
plt.title("Simple Linear Regression")
plt.xlabel("Volumetric Water Content")
plt.ylabel("Sensor response (4.9mV)")
plt.show()
And I still get the following traceback:
Traceback (most recent call last):
  File "C:\Users\Vivian Imbriotis\AppData\Local\Programs\Python\Python37\simple_linear_regression.py", line 16, in <module>
    y = model.predict(x[:,np.newaxis])
  File "C:\Users\Vivian Imbriotis\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\linear_model\_base.py", line 225, in predict
    return self._decision_function(X)
  File "C:\Users\Vivian Imbriotis\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\linear_model\_base.py", line 209, in _decision_function
    dense_output=True) + self.intercept_
  File "C:\Users\Vivian Imbriotis\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\utils\extmath.py", line 151, in safe_sparse_dot
    ret = a @ b
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1)
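I've been banging my head against this for hours, and I just can't see what I'm doing wrong.
Scikit-learn, numpy, and pandas are all at their latest versions; this is on Python 3.7.3.
SOLVED: I was being silly and had misunderstood how np.newaxis works. The goal here is to fit the data to a quadratic, so the input to predict needs the same two columns (VWC and VWC squared) that the model was fitted on. I just needed to change: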
x = np.linspace(0,1,30)
y = model.predict(x[:,np.newaxis])
to
x = np.column_stack([np.linspace(0,1,30),np.linspace(0,1,30)**2])
y = model.predict(x)
I'm sure there's a more elegant way of writing this, but...
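One possibly tidier way to write the quadratic fit, as a sketch rather than the definitive answer: let scikit-learn's PolynomialFeatures build the squared column inside a pipeline, so that fit and predict both take a plain single-column array. The data is copied from the question; the name quadratic_model is only illustrative, and the plotting calls are left out.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

VWC = np.array((0, 0.2, 0.4, 0.6, 0.8, 1))
Sensor_Voltage = np.array((515, 330, 275, 250, 245, 240))

# The pipeline expands one column [x] into two columns [x, x**2] and then
# fits an ordinary linear regression on them.
# include_bias=False: LinearRegression already fits its own intercept.
quadratic_model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)

# Both fit and predict now receive an (n_samples, 1) array.
quadratic_model.fit(VWC[:, np.newaxis], Sensor_Voltage)

x = np.linspace(0, 1, 30)
y = quadratic_model.predict(x[:, np.newaxis])
print(y.shape)
```

Because the squared feature is generated inside the pipeline, the training and prediction inputs cannot drift apart in column count, which is what caused the original error.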
You trained your model on a dataset of shape (6, 2). If you check the shape of df...
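To make the shape mismatch concrete, here is a small NumPy-only sketch. The arrays mirror the ones in the question; coef is a hypothetical stand-in for the fitted model's two coefficients (zeros here, not the real fitted values), used only to show which matrix product succeeds.

```python
import numpy as np

VWC = np.array((0, 0.2, 0.4, 0.6, 0.8, 1))
X_train = np.column_stack((VWC, VWC * VWC))  # what the model was fitted on

x = np.linspace(0, 1, 30)
X_bad = x[:, np.newaxis]                # what the question passed to predict
X_good = np.column_stack((x, x ** 2))   # same two columns as the training data

print(X_train.shape)  # (6, 2)
print(X_bad.shape)    # (30, 1)
print(X_good.shape)   # (30, 2)

# Hypothetical stand-in for model.coef_: one coefficient per training column.
coef = np.zeros(2)

# Prediction is essentially X @ coef + intercept, so the number of columns
# in X has to equal the number of coefficients.
raised = False
try:
    X_bad @ coef
except ValueError as err:
    raised = True
    print(err)

print((X_good @ coef).shape)  # (30,)
```

np.newaxis only adds an axis of length 1; it does not create the squared column, so the (30, 1) input cannot be multiplied by two coefficients.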