'LogisticRegressionTrainingSummary' object has no attribute 'fMeasureByThreshold'


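I am new to PySpark and Databricks and am trying to build a logistic regression model (following the Spark_DS&ML_exercise provided by Databricks itself). After fitting the model to my training data, I am trying to get the F-measure by threshold from the training summary.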

I ran the following code:

from pyspark.ml.classification import LogisticRegression

# Create initial LogisticRegression model
lr = LogisticRegression(labelCol="label", featuresCol="features", maxIter=10)

# set threshold for the probability above which to predict a 1
lr.setThreshold(train_positive_rate)
# lr.setThreshold(0.5) # could use this if knew you had balanced data

# Train model with Training Data
lrModel = lr.fit(train)

# get training summary used for eval metrics and other params
lrTrainingSummary = lrModel.summary

# Find the best model threshold if you would like to use that instead of the empirical positive rate
fMeasure = lrTrainingSummary.fMeasureByThreshold

But I got this AttributeError:

AttributeError: 'LogisticRegressionTrainingSummary' object has no attribute 'fMeasureByThreshold'

It seems that fMeasureByThreshold no longer exists. Is that the case?

machine-learning pyspark databricks logistic-regression apache-spark-mllib
1 Answer

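So, I found the answer. The attribute I was looking for lives on BinaryLogisticRegressionTrainingSummary, not on LogisticRegressionTrainingSummary. The former is the summary for a binomial logistic regression model (mine is multinomial). For the latter, the only way to get the F-measure is per label, using fMeasureByLabel.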
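As a minimal sketch of the distinction described in the answer, the helper below picks whichever F-measure the summary actually offers. The function name f_measure_from_summary and its return format are hypothetical, introduced only for illustration. It assumes that fMeasureByThreshold is a property available only on binary summaries and that fMeasureByLabel is a method taking an optional beta, which matches the PySpark API as far as I know.

```python
def f_measure_from_summary(summary, beta=1.0):
    """Return (kind, result) for a logistic regression training summary.

    Hypothetical helper, not part of PySpark. It inspects the summary's
    class so that the property is not evaluated just to test for it.
    """
    if hasattr(type(summary), "fMeasureByThreshold"):
        # Binary summary: F-measure for each candidate threshold
        return "by_threshold", summary.fMeasureByThreshold
    # Multiclass summary: only a per-label F-measure is available
    return "by_label", summary.fMeasureByLabel(beta)


# Assumed usage with the model from the question:
# kind, result = f_measure_from_summary(lrModel.summary)
```

For a multinomial model this would fall through to the per-label branch instead of raising the AttributeError shown in the question. If a single number is needed there, the summary also appears to expose weightedFMeasure, which averages the per-label values.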
