我想调整我使用Pyspark Databricks模型。
我收到以下错误:类型错误:类型“ParamGridBuilder”的对象没有LEN()
我的代码已被列出如下。
from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator
als = ALS(userCol = "userId",itemCol="movieId", ratingCol="rating", coldStartStrategy="drop", nonnegative = True, implicitPrefs = False)
# Imports ParamGridBuilder package
from pyspark.ml.tuning import ParamGridBuilder
# Creates a ParamGridBuilder, and adds hyperparameters
param_grid = ParamGridBuilder().addGrid(als.rank, [5,10,20,40]).addGrid(als.maxIter, [5,10,15,20]).addGrid(als.regParam,[0.01,0.001,0.0001,0.02])
evaluator = RegressionEvaluator(metricName="rmse", labelCol="rating",predictionCol="prediction")
# Imports CrossValidator package
from pyspark.ml.tuning import CrossValidator
# Creates cross validator and tells Spark what to use when training and evaluates
cv = CrossValidator(estimator = als,
estimatorParamMaps = param_grid,
evaluator = evaluator,
numFolds = 5)
model = cv.fit(training)
类型错误:类型“ParamGridBuilder”对象没有LEN()
完全错误日志:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<command-1952169986445972> in <module>()
----> 1 model = cv.fit(training)
2
3 # Extract best combination of values from cross validation
4
5 best_model = model.bestModel
/databricks/spark/python/pyspark/ml/base.py in fit(self, dataset, params)
130 return self.copy(params)._fit(dataset)
131 else:
--> 132 return self._fit(dataset)
133 else:
134 raise ValueError("Params must be either a param map or a list/tuple of param maps, "
/databricks/spark/python/pyspark/ml/tuning.py in _fit(self, dataset)
279 est = self.getOrDefault(self.estimator)
280 epm = self.getOrDefault(self.estimatorParamMaps)
--> 281 numModels = len(epm)
它简单意味着你的对象没有length属性(不像列表)。因此,在你行
param_grid = ParamGridBuilder()
.addGrid(als.rank, [5,10,20,40])
.addGrid(als.maxIter, [5,10,15,20])
.addGrid(als.regParam, [0.01,0.001,0.0001,0.02])
你应该在结尾处添加.build()
真正构建一个网格。