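GridSearchCV (whether from sklearn or from dask) seems to handle the hidden_layer_sizes parameter in a strange or wrong way, so that MLPRegressor ignores it. I demonstrate the behavior with a minimal working example. Assume the numeric arrays features and values have been initialized, with these shapes in my case: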
    print(features.shape)
    print(values.shape)

    (321278, 36)
    (321278,)
Then I run the following code:
    from dask_ml.model_selection import GridSearchCV as daskGridSearchCV
    from sklearn.model_selection import GridSearchCV as skGridSearchCV
    from sklearn.neural_network import MLPRegressor

    # Grid over two single-hidden-layer architectures
    myparams = {'hidden_layer_sizes': [(2, ), (4, )]}

    # dask-ml grid search
    daskgridCV = daskGridSearchCV(estimator=MLPRegressor(), n_jobs=-1, param_grid=myparams)
    daskbestfit = daskgridCV.fit(features, values)

    # scikit-learn grid search
    skgridCV = skGridSearchCV(estimator=MLPRegressor(), n_jobs=-1, param_grid=myparams, cv=3)
    skbestfit = skgridCV.fit(features, values)

    display(daskbestfit)
    display(skbestfit)
The result is:
    GridSearchCV(cache_cv=True, cv=None, error_score='raise',
                 estimator=MLPRegressor(activation='relu', alpha=0.0001, batch_size='auto',
                                        beta_1=0.9, beta_2=0.999, early_stopping=False,
                                        epsilon=1e-08, hidden_layer_sizes=(100,),
                                        learning_rate='constant', learning_rate_init=0.001,
                                        max_iter=200, momentum=0.9, n_iter_no_change=10,
                                        nesterovs_momentum=True, power_t=0.5,
                                        random_state=None, shuffle=True, solver='adam',
                                        tol=0.0001, validation_fraction=0.1, verbose=False,
                                        warm_start=False),
                 iid=True, n_jobs=-1, param_grid={'hidden_layer_sizes': [(2,), (4,)]},
                 refit=True, return_train_score=False, scheduler=None, scoring=None)

    GridSearchCV(cv=3, error_score='raise-deprecating',
                 estimator=MLPRegressor(activation='relu', alpha=0.0001, batch_size='auto',
                                        beta_1=0.9, beta_2=0.999, early_stopping=False,
                                        epsilon=1e-08, hidden_layer_sizes=(100,),
                                        learning_rate='constant', learning_rate_init=0.001,
                                        max_iter=200, momentum=0.9, n_iter_no_change=10,
                                        nesterovs_momentum=True, power_t=0.5,
                                        random_state=None, shuffle=True, solver='adam',
                                        tol=0.0001, validation_fraction=0.1, verbose=False,
                                        warm_start=False),
                 iid='warn', n_jobs=-1, param_grid={'hidden_layer_sizes': [(2,), (4,)]},
                 pre_dispatch='2*n_jobs', refit=True, return_train_score=False,
                 scoring=None, verbose=0)
So in both cases the hidden_layer_sizes parameter has the value (100,), which is not in the grid. Am I doing something wrong, or what is going on here?
Python version 3.6.9, sklearn version 0.21.2, dask_ml version 1.0.0.
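This is perfectly normal. When you write estimator=MLPRegressor(), the MLPRegressor instance handed to GridSearchCV is created with its default values, among them (100,) for hidden_layer_sizes. Displaying the GridSearchCV object prints that unmodified template estimator, not the parameter combinations tried during the search.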
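To make the distinction concrete, here is a minimal sketch using scikit-learn only. The small random arrays, max_iter=50, and random_state=0 are assumptions chosen to keep the run short; they are not the data or settings from the question. It compares the template estimator stored on the search object with the fitted attributes best_params_ and best_estimator_.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

# Assumption: small synthetic data standing in for the question's arrays.
rng = np.random.RandomState(0)
features = rng.rand(200, 5)
values = features.sum(axis=1)

myparams = {'hidden_layer_sizes': [(2,), (4,)]}
grid = GridSearchCV(estimator=MLPRegressor(max_iter=50, random_state=0),
                    param_grid=myparams, cv=3)

# Short training runs may not converge; silence that warning for the demo.
with warnings.catch_warnings():
    warnings.simplefilter('ignore', ConvergenceWarning)
    grid.fit(features, values)

# The estimator passed in is only a template: the search works on clones,
# so the template keeps its default hidden_layer_sizes.
template_size = grid.estimator.hidden_layer_sizes

# The selected setting is exposed on the fitted attributes instead.
best_size = grid.best_params_['hidden_layer_sizes']
refit_size = grid.best_estimator_.hidden_layer_sizes

print('template:', template_size)
print('best_params_:', best_size)
print('best_estimator_:', refit_size)
```

In the question's code, the equivalent check would be to inspect skbestfit.best_params_ or skbestfit.best_estimator_ rather than displaying the search object itself.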