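I'm trying to fit a random forest model, but I keep hitting an error that says my CV iterator is empty, even though it isn't.
Here are the code snippet and the error: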
%%time
rf_val.fit(X_train, y_train)
Fitting 0 folds for each of 32 candidates, totalling 0 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=-1)]: Done 0 out of 0 | elapsed: 0.0s finished
ValueError Traceback (most recent call last)
<timed eval> in <module>
/opt/conda/lib/python3.7/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
708 return results
709
--> 710 self._run_search(evaluate_candidates)
711
712 # For multi-metric evaluation, store the best_index_, best_params_ and
/opt/conda/lib/python3.7/site-packages/sklearn/model_selection/_search.py in _run_search(self, evaluate_candidates)
1149 def _run_search(self, evaluate_candidates):
1150 """Search all candidates in param_grid"""
-> 1151 evaluate_candidates(ParameterGrid(self.param_grid))
1152
1153
/opt/conda/lib/python3.7/site-packages/sklearn/model_selection/_search.py in evaluate_candidates(candidate_params)
690
691 if len(out) < 1:
--> 692 raise ValueError('No fits were performed. '
693 'Was the CV iterator empty? '
694 'Were there no candidates?')
ValueError: No fits were performed. Was the CV iterator empty? Were there no candidates?
I'm trying to tune and fit a random forest model and time the run, but I get this error and can't make sense of it.
In the sklearn source, inside the fit function:
def evaluate_candidates(candidate_params):
    candidate_params = list(candidate_params)
    n_candidates = len(candidate_params)
    if self.verbose > 0:
        print("Fitting {0} folds for each of {1} candidates,"
              " totalling {2} fits".format(
                  n_splits, n_candidates, n_candidates * n_splits))
    out = parallel(delayed(_fit_and_score)(clone(base_estimator),
                                           X, y,
                                           train=train, test=test,
                                           parameters=parameters,
                                           **fit_and_score_kwargs)
                   for parameters, (train, test)
                   in product(candidate_params,
                              cv.split(X, y, groups)))
    if len(out) < 1:
        raise ValueError('No fits were performed. '
                         'Was the CV iterator empty? '
                         'Were there no candidates?')
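The product(...) call is why zero splits means zero fits: the fit tasks are the Cartesian product of the candidates and the CV splits, so an empty sequence of splits leaves nothing to run no matter how many candidates there are. A minimal stdlib-only sketch of that pairing (the candidate dicts and the empty splits list are made-up stand-ins, not real sklearn objects):

```python
from itertools import product

# Hypothetical stand-ins: 32 parameter candidates and a CV splitter that yields no folds.
candidate_params = [{"max_depth": d, "n_estimators": n}
                    for d in range(4) for n in range(8)]
splits = []  # what cv.split(X, y, groups) effectively yields when there are 0 folds

# Same pairing that evaluate_candidates uses to build its list of fit tasks.
tasks = list(product(candidate_params, splits))

print(len(candidate_params), "candidates,", len(tasks), "fits")
# prints: 32 candidates, 0 fits
```

With an empty task list, out is empty too, so the len(out) < 1 check raises the ValueError.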
In your error message:
Fitting 0 folds for each of 32 candidates, totalling 0 fits
This means n_splits = 0. n_splits is defined as:
n_splits = cv.get_n_splits(X, y, groups)
In your case, groups is None, so the error is most likely caused by your input data X and y.
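One way to check this is to ask the splitter directly how many folds it produces for your data. The sketch below is only an illustration: count_splits is a hypothetical helper, the toy arrays stand in for X_train and y_train, and the PredefinedSplit with every entry set to -1 is just one way to end up with zero folds, not necessarily what happened in your code.

```python
import numpy as np
from sklearn.model_selection import PredefinedSplit, StratifiedKFold


def count_splits(cv, X, y, groups=None):
    """Hypothetical helper: return (reported, actual) fold counts for a splitter."""
    reported = cv.get_n_splits(X, y, groups)          # what GridSearchCV prints
    actual = sum(1 for _ in cv.split(X, y, groups))   # what it would iterate over
    return reported, actual


# Toy data standing in for X_train / y_train.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# A splitter that works: 5 stratified folds.
ok = count_splits(StratifiedKFold(n_splits=5), X, y)

# A splitter with no folds: -1 marks a sample as never being in a test set.
empty = count_splits(PredefinedSplit(test_fold=[-1] * len(X)), X, y)

print(ok, empty)
```

Running the same helper on the cv object you passed to rf_val, together with your real X_train and y_train, should show whether the splitter or the data is responsible for the zero.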