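I am using nearest-neighbor regression from Scikit-learn in Python, with 20 nearest neighbors as the parameter. I trained the model and then saved it with the following code: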
knn = neighbors.KNeighborsRegressor(n_neighbors, weights='uniform')
knn.fit(trainInputs, trainOutputs)
filename = "KNN_model_%d_%d.sav" % (n_neighbors,windowSize)
pickle.dump(knn, open(filename, 'wb'))
Now I am trying to load the model and predict the output value for a new input like this:
filename = 'KNN_model_20_720.sav'
loaded_knn_model = pickle.load(open(filename, 'rb'))
nextPrediction = loaded_knn_model.predict(data_pred_input_window)
However, when I do this, I get the following error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-bc1f744a44b3> in <module>()
26 filename = 'KNN_model_20_720_Solar11months.sav'
27 loaded_knn_model = pickle.load(open(filename, 'rb'))
---> 28 nextPrediction = loaded_knn_model.predict(data_pred_input_window)
29
30 print(nextPrediction)
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\neighbors\regression.py in predict(self, X)
142 X = check_array(X, accept_sparse='csr')
143
--> 144 neigh_dist, neigh_ind = self.kneighbors(X)
145
146 weights = _get_weights(neigh_dist, self.weights)
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\neighbors\base.py in kneighbors(self, X, n_neighbors, return_distance)
341 "Expected n_neighbors <= n_samples, "
342 " but n_samples = %d, n_neighbors = %d" %
--> 343 (train_size, n_neighbors)
344 )
345 n_samples, _ = X.shape
ValueError: Expected n_neighbors <= n_samples, but n_samples = 1, n_neighbors = 20
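I don't know why this happens. I know I am only passing a single input to test the prediction, but that shouldn't raise an error, since I would assume the saved model keeps the historical data it needs to run KNN? How can I fix this?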
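One thing the traceback itself shows: the message is formatted with `(train_size, n_neighbors)`, so `n_samples = 1` describes the data the model was fitted on, not the single row passed to `predict`. That suggests checking the shape of `trainInputs` before calling `fit`. Below is a minimal pure-Python sketch of that check. The helper `check_training_rows` and the two example lists are hypothetical and not part of scikit-learn; they only illustrate how one window wrapped as a single row counts as one sample.

```python
def check_training_rows(train_inputs, n_neighbors):
    """Raise early if the training set has fewer rows than n_neighbors.

    Hypothetical helper, not part of scikit-learn. It treats the outer
    length of train_inputs as the number of samples, matching a 2-D
    layout of shape (n_samples, n_features).
    """
    n_samples = len(train_inputs)
    if n_neighbors > n_samples:
        raise ValueError(
            "Expected n_neighbors <= n_samples, "
            "but n_samples = %d, n_neighbors = %d" % (n_samples, n_neighbors)
        )
    return n_samples


# One window of 720 values wrapped as a single row: 1 sample, 720 features.
single_row = [[0.0] * 720]

# 100 windows of 720 values each: 100 samples, 720 features.
many_rows = [[0.0] * 720 for _ in range(100)]

print(check_training_rows(many_rows, 20))  # enough rows for 20 neighbors

try:
    check_training_rows(single_row, 20)
except ValueError as err:
    print(err)  # same wording as the message in the traceback
```

If the training data turns out to have a single row, the model would need to be refitted on an array with at least `n_neighbors` rows before saving it again.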
The Scikit-Learn docs recommend using joblib for model persistence.
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
from sklearn import neighbors

knn = neighbors.KNeighborsRegressor(n_neighbors, weights='uniform')
knn.fit(trainInputs, trainOutputs)
# save the fitted model to a file
joblib.dump(knn, f"KNN_model_{n_neighbors}_{windowSize}.joblib")
# load the model from a file
model = joblib.load(f"KNN_model_{n_neighbors}_{windowSize}.joblib")
Also, I noticed that your original code opens the files without a context block (a with statement). Changing that may or may not make your original code work.
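For reference, here is a minimal sketch of the same pickle round trip written with context blocks, so each file is closed as soon as the dump or load finishes. The helper names `save_model` and `load_model` are hypothetical, and a plain dict stands in for the fitted estimator so the sketch runs without scikit-learn; a fitted `knn` object would be passed in the same way.

```python
import os
import pickle
import tempfile


def save_model(model, filename):
    """Pickle `model` to `filename`; the file is closed on leaving the block."""
    with open(filename, "wb") as f:
        pickle.dump(model, f)


def load_model(filename):
    """Load and return a pickled object from `filename`."""
    with open(filename, "rb") as f:
        return pickle.load(f)


# Stand-in for the fitted estimator (assumption: any picklable object works here).
placeholder_model = {"n_neighbors": 20, "weights": "uniform"}

# Write into a temporary directory so the sketch leaves the working directory alone.
path = os.path.join(tempfile.mkdtemp(), "KNN_model_20_720.sav")

save_model(placeholder_model, path)
restored = load_model(path)
print(restored == placeholder_model)
```

With the original variables this would be `save_model(knn, filename)` followed by `loaded_knn_model = load_model(filename)`.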