PySpark TypeError: object of type 'ParamGridBuilder' has no len()

Question

I am trying to tune a model with PySpark on Databricks.

I get the following error: TypeError: object of type 'ParamGridBuilder' has no len()

My code is listed below.

from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator



als = ALS(userCol = "userId",itemCol="movieId", ratingCol="rating",  coldStartStrategy="drop", nonnegative = True, implicitPrefs = False)

# Imports ParamGridBuilder package
from pyspark.ml.tuning import ParamGridBuilder 

# Creates a ParamGridBuilder, and adds hyperparameters
param_grid = ParamGridBuilder().addGrid(als.rank, [5,10,20,40]).addGrid(als.maxIter, [5,10,15,20]).addGrid(als.regParam,[0.01,0.001,0.0001,0.02]) 

evaluator = RegressionEvaluator(metricName="rmse", labelCol="rating",predictionCol="prediction")

# Imports CrossValidator package
from pyspark.ml.tuning import CrossValidator 

# Creates cross validator and tells Spark what to use when training and evaluates
cv = CrossValidator(estimator = als,
                    estimatorParamMaps = param_grid,
                    evaluator = evaluator,
                    numFolds = 5) 

model = cv.fit(training) 

TypeError: object of type 'ParamGridBuilder' has no len()

Full error log:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<command-1952169986445972> in <module>()
----> 1 model = cv.fit(training)
      2 
      3 # Extract best combination of values from cross validation
      4 
      5 best_model = model.bestModel

/databricks/spark/python/pyspark/ml/base.py in fit(self, dataset, params)
    130                 return self.copy(params)._fit(dataset)
    131             else:
--> 132                 return self._fit(dataset)
    133         else:
    134             raise ValueError("Params must be either a param map or a list/tuple of param maps, "

/databricks/spark/python/pyspark/ml/tuning.py in _fit(self, dataset)
    279         est = self.getOrDefault(self.estimator)
    280         epm = self.getOrDefault(self.estimatorParamMaps)
--> 281         numModels = len(epm)
pyspark apache-spark-ml
1 Answer

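It simply means that your object has no length (unlike a list). As the traceback shows, CrossValidator calls len() on estimatorParamMaps, and you passed it the ParamGridBuilder itself rather than a built grid. So in your line

param_grid = (ParamGridBuilder()
    .addGrid(als.rank, [5,10,20,40])
    .addGrid(als.maxIter, [5,10,15,20])
    .addGrid(als.regParam, [0.01,0.001,0.0001,0.02]))

you should add .build() after the last addGrid call to actually build the grid.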
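To make the difference between the builder and the built grid concrete, here is a minimal pure-Python sketch. `SketchGridBuilder` is a hypothetical stand-in written for this illustration, not the PySpark class, and it uses plain strings where PySpark uses `Param` objects; it only assumes that `build()` expands the recorded values into a list with one parameter map per combination.

```python
from itertools import product


class SketchGridBuilder:
    """Hypothetical stand-in for ParamGridBuilder; not the PySpark class."""

    def __init__(self):
        self._grid = {}

    def addGrid(self, param, values):
        # Record the candidate values and return self so calls can be chained.
        self._grid[param] = values
        return self

    def build(self):
        # Expand the recorded values into one dict per combination.
        keys = list(self._grid)
        return [dict(zip(keys, combo)) for combo in product(*self._grid.values())]


builder = (
    SketchGridBuilder()
    .addGrid("rank", [5, 10, 20, 40])
    .addGrid("maxIter", [5, 10, 15, 20])
    .addGrid("regParam", [0.01, 0.001, 0.0001, 0.02])
)

try:
    len(builder)  # the builder defines no __len__, as in the question
    builder_error = None
except TypeError as exc:
    builder_error = str(exc)

param_maps = builder.build()  # a plain list, so len() works on it
```

With the values from the question, the built grid holds 4 × 4 × 4 = 64 parameter maps, so with numFolds = 5 the cross-validation would fit 320 models before refitting the best one. That may be worth keeping in mind when choosing the grid size.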
