How to run a random forest grid search with the Apache Spark ML library

Question (votes: 0, answers: 1)

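I want to run a grid search on my random forest model in Apache Spark, but I have not been able to find an example of how to do it. Is there an example, with sample data, that uses grid search for hyperparameter tuning?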

apache-spark apache-spark-mllib
1 answer
Votes: 1
from pyspark.ml import Pipeline
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder


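# Random forest estimator; the column names must match the training DataFrame.
rf = RandomForestClassifier(labelCol="indexedLabel", featuresCol="indexedFeatures", numTrees=10)
pipeline = Pipeline(stages=[rf])
# Candidate values to search over; these override the numTrees=10 set above.
paramGrid = ParamGridBuilder().addGrid(rf.numTrees, [10, 30]).build()

# The evaluator has to read the same label column the classifier is trained on.
crossval = CrossValidator(estimator=pipeline,
                          estimatorParamMaps=paramGrid,
                          evaluator=BinaryClassificationEvaluator(labelCol="indexedLabel"),
                          numFolds=2)

# training_df: a DataFrame with "indexedLabel" and "indexedFeatures" columns.
cvModel = crossval.fit(training_df)
# cvModel.bestModel is the best pipeline; cvModel.avgMetrics has one score per grid entry.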

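The hyperparameters and their candidate values are defined with the addGrid method. Each addGrid call adds one parameter to the search, and build() returns every combination of the listed values.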
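
To get a feel for how the grid grows once a second parameter is added, here is a rough plain-Python sketch of the expansion. It is not Spark's implementation: build_param_grid is a hypothetical helper written only for illustration, and the maxDepth values are assumed for the example (maxDepth itself is a real RandomForestClassifier parameter).

```python
from itertools import product


def build_param_grid(grid):
    """Expand {name: [values]} into a list of {name: value} dicts.

    Hypothetical helper: mimics the idea behind ParamGridBuilder.build(),
    which returns one param map per combination of the listed values.
    """
    names = list(grid)
    return [dict(zip(names, combo))
            for combo in product(*(grid[name] for name in names))]


# Two parameters with two candidate values each (maxDepth values are assumed).
param_maps = build_param_grid({"numTrees": [10, 30], "maxDepth": [5, 10]})

# CrossValidator fits every param map once per fold,
# then refits the best one on the full training data.
num_folds = 2
num_cv_fits = len(param_maps) * num_folds

for param_map in param_maps:
    print(param_map)
print("fits during cross-validation:", num_cv_fits)
```

With real Spark code the equivalent would be chaining a second addGrid(rf.maxDepth, [...]) call before build(). The number of fits multiplies quickly, so it is usually worth keeping the grid small at first.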
