3D plot for logistic regression

Question

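I have been able to draw four predicted-probability plots (binary outcome, 0 or 1), one for each of four variables: Pregnancies, Age, BMI and Glucose. However, I have not managed to apply something similar to get a 3D plot (x: one variable, y: another variable, z: the predicted probability as a surface, plus the actual 0/1 outcomes from the data). I would like to draw the model's coordinates as a surface together with the actual data coordinates as points.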

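The 2D version already works. Is it possible to do something similar for a 3D plot, or even better a 4D plot (three variables plotted from the real data values as scatter points and from the modelled values as a surface, with the predicted probability shown as a colour gradient)? This is the code I am using: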

library(ggplot2)

# Assuming 'final_model' is your trained logistic regression model
exclude_vars <- c("BloodPressure", "Insulin", "SkinThickness", "DiabetesPedigreeFunction")
dataset$PP_model <- predict(final_model, newx = as.matrix(dataset[, !(names(dataset) %in% c("Outcome", exclude_vars))]), type = "response")

# aes_string() is deprecated; .data[[variable]] looks the column up by name
create_logistic_plot <- function(variable, data) {
  ggplot(data, aes(x = .data[[variable]], y = Outcome)) +
    geom_point(aes(color = factor(Outcome)), position = position_identity(), size = 2) +
    geom_smooth(aes(y = PP_model), method = "glm", method.args = list(family = "binomial"), se = FALSE) +
    labs(title = paste("Logistic Regression Plot for", variable),
         x = variable,
         y = "Outcome/Predicted Probability",
         color = "Outcome") +
    theme_minimal()
}
# Create logistic regression plots for each variable without jitter
plots <- lapply(c("Pregnancies", "Glucose", "BMI", "Age"), function(var) create_logistic_plot(var, dataset))

# Print the plots
plots

Answer 1

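You can certainly use persp to draw a 3D surface plot that shows the "marginal effect" of two predictors at once. That is, if your model has four predictors, you can view the predicted probability as any two of them vary while the other two are held constant at their mean values.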

The following function lets you draw such a plot:

plot_probs <- function(model, var1, var2, n = 50, ...) {
  nm1 <- deparse(substitute(var1))
  nm2 <- deparse(substitute(var2))
  df <- model.frame(model)
  var1 <- df[[nm1]]
  var2 <- df[[nm2]]
  df <- df[-match(c(nm1, nm2), names(df))]
  preds <- list(x = seq(min(var1), max(var1), length = n),
                y = seq(min(var2), max(var2), length = n))
  predlist <- setNames(c(preds, lapply(df, mean)), c(nm1, nm2, names(df)))
  pred_df <- do.call("expand.grid", predlist)
  Probability <- matrix(predict(model, newdata = pred_df, type = "response"),
                        nrow = n)
  persp(x = preds$x, y = preds$y, z = Probability, xlab = nm1, ylab = nm2,
        ticktype = "detailed", zlim = c(0, 1), ...)
}

For example, if we have the following model:

model <- glm(Outcome ~ Age + Glucose + Pregnancies + BMI, binomial, dataset)

we can call:

plot_probs(model, Glucose, BMI, theta = 45, phi = 30, col = "gold")

and

plot_probs(model, Pregnancies, Age, theta = -45, phi = 30, col = "lightblue")
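The question also asks for the observed 0/1 outcomes to be drawn as points alongside the surface. One possible way to do this in base R, sketched here as an assumption rather than as part of the original answer, relies on the fact that persp() returns its viewing transformation matrix, which trans3d() can use to project 3D coordinates onto the plot. The block is self-contained, so it re-creates simulated stand-in data and builds the Glucose/BMI grid directly instead of calling plot_probs; the point colours are arbitrary choices.

```r
# Simulated stand-in data (assumption: not the asker's real dataset)
set.seed(1)
dataset <- data.frame(
  Pregnancies = rpois(200, 1) + 1,
  BMI         = round(rnorm(200, 25, 2.5), 1),
  Age         = sample(18:45, 200, TRUE),
  Glucose     = round(runif(200, 3.5, 12), 1)
)
p <- plogis(with(dataset, Age + Glucose + Pregnancies + BMI - 65) / 5)
dataset$Outcome <- rbinom(200, 1, p)

model <- glm(Outcome ~ Age + Glucose + Pregnancies + BMI, binomial, dataset)

# Grid over the two plotted predictors; the other two are held at their means
n  <- 50
gx <- seq(min(dataset$Glucose), max(dataset$Glucose), length.out = n)
gy <- seq(min(dataset$BMI), max(dataset$BMI), length.out = n)
grid <- expand.grid(Glucose = gx, BMI = gy)
grid$Age         <- mean(dataset$Age)
grid$Pregnancies <- mean(dataset$Pregnancies)
prob <- matrix(predict(model, newdata = grid, type = "response"), nrow = n)

# persp() draws the surface and returns the viewing transformation matrix
pmat <- persp(gx, gy, prob, xlab = "Glucose", ylab = "BMI",
              zlab = "Probability", zlim = c(0, 1),
              theta = 45, phi = 30, col = "gold", ticktype = "detailed")

# Project the observed (Glucose, BMI, Outcome) triples into plot coordinates
pts <- trans3d(dataset$Glucose, dataset$BMI, dataset$Outcome, pmat)
points(pts, pch = 16, col = ifelse(dataset$Outcome == 1, "red", "blue"))
```

Note that points() simply draws on top of the finished plot, so points that would lie behind the surface are not hidden. The observed points also sit at z = 0 or z = 1 regardless of Age and Pregnancies, whereas the surface is computed with those two variables fixed at their means.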

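Note that a 4D plot would be almost useless, at least in this case. With three predictors on the x, y and z axes, the probability would no longer be a surface but a volume filling the whole space. Yes, it could be made partially transparent and coloured by probability, but it would be hard to see what is going on, and it would not work at all on a printed page. It would also still have to be a "marginal effects" plot, because one of the four predictors would need to be held constant for any prediction to be made.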
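If a third predictor still needs to be shown, one alternative to a filled volume (a suggestion that goes beyond the original answer) is a row of 2D slices: hold the third predictor at a few fixed values and draw the predicted probability as a colour gradient in each panel. The sketch below uses only base R and simulated stand-in data. The choice of Age as the sliced variable, the three quantiles and the 3-year window for the overlaid points are all illustrative assumptions.

```r
# Simulated stand-in data (assumption: not the asker's real dataset)
set.seed(1)
dataset <- data.frame(
  Pregnancies = rpois(200, 1) + 1,
  BMI         = round(rnorm(200, 25, 2.5), 1),
  Age         = sample(18:45, 200, TRUE),
  Glucose     = round(runif(200, 3.5, 12), 1)
)
p <- plogis(with(dataset, Age + Glucose + Pregnancies + BMI - 65) / 5)
dataset$Outcome <- rbinom(200, 1, p)

model <- glm(Outcome ~ Age + Glucose + Pregnancies + BMI, binomial, dataset)

n  <- 40
gx <- seq(min(dataset$Glucose), max(dataset$Glucose), length.out = n)
gy <- seq(min(dataset$BMI), max(dataset$BMI), length.out = n)

# Illustrative slices through the third predictor (low, median, high Age)
age_slices <- quantile(dataset$Age, c(0.1, 0.5, 0.9))
pal <- hcl.colors(20)  # default sequential palette; needs R >= 3.6.0

old_par <- par(mfrow = c(1, 3))
for (a in age_slices) {
  grid <- expand.grid(Glucose = gx, BMI = gy)
  grid$Age         <- a
  grid$Pregnancies <- mean(dataset$Pregnancies)  # fourth predictor held at its mean
  prob <- matrix(predict(model, newdata = grid, type = "response"), nrow = n)

  # Colour encodes predicted probability on a common 0-1 scale across panels
  image(gx, gy, prob, zlim = c(0, 1), col = pal,
        xlab = "Glucose", ylab = "BMI", main = paste("Age =", round(a)))

  # Overlay observations whose Age is within 3 years of the slice:
  # filled circle = Outcome 1, open circle = Outcome 0
  near <- abs(dataset$Age - a) <= 3
  points(dataset$Glucose[near], dataset$BMI[near],
         pch = ifelse(dataset$Outcome[near] == 1, 16, 1))
}
par(old_par)
```

Each panel is still a marginal-effects view, since Pregnancies stays at its mean throughout, but the shared colour scale makes the panels directly comparable and the result works on a printed page.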

Data used

In the absence of a reproducible example, the following dataset should approximate the one the OP is using:

set.seed(1)
dataset <- data.frame(Pregnancies = rpois(200, 1) + 1,
                      BMI = round(rnorm(200, 25, 2.5), 1),
                      Age = sample(18:45, 200, TRUE),
                      Glucose = round(runif(200, 3.5, 12), 1))
p <- plogis(with(dataset, Age + Glucose + Pregnancies + BMI - 65)/5)
dataset$Outcome <- rbinom(200, 1, p)
