如何使用ggplot2在对数奇数尺度上绘制逻辑回归

问题描述 投票:0回答:4

我正在尝试以对数奇数比例绘制逻辑回归的结果。

load(url("https://github.com/bossaround/question/raw/master/logisticregressdata.RData"))

ggplot(D, aes(Year, as.numeric(Vote), color = as.factor(Male))) +
  stat_smooth( method="glm", method.args=list(family="binomial"), formula = y~x + I(x^2), alpha=0.5, size = 1, aes(fill=as.factor(Male))) +
  xlab("Year") 

但是这个图是在 0~1 范围内的。我想这就是概率尺度(如果我错了请纠正我)?

我真正想要的是在将其转换为概率之前将其绘制在对数奇数刻度上,就像逻辑回归报告一样。

理想情况下,我想在控制外国后,在这样的模型中绘制男性的投票和年份之间的关系:

   Model <-  glm(Vote ~ Year + I(Year^2) + Male + Foreign, family="binomial", data=D)

我可以根据

summary(Model)
手动绘制线条,但我也想绘制置信区间。

类似于我在网上找到的本文档第 44 页上的图片:http://www.datavis.ca/papers/CARME2015-2x2.pdf。我的会有二次曲线。

谢谢!

r ggplot2 logistic-regression
4个回答
3
投票

要绘制具有多个变量的模型的预测,应该创建模型,预测新数据以生成预测并绘制该预测

Model <-  glm(Vote ~ Year + I(Year^2) + Male + Foreign, family="binomial", data=D)
for_pred = expand.grid(Year = seq(from = 2, to = 10, by = 0.1), Male = c(0,1), Foreign = c(0,1)) #generate data to get a smooth line

for_pred = cbind(for_pred, predict(Model, for_pred, type = "link", se.fit= T)) 
#if the probability scale was needed: `type = "response`

library(ggplot2)
ggplot(for_pred, aes(Year, fit, color = as.factor(Male))) +
  geom_line() +
  xlab("Year")+
  facet_wrap(~Foreign)  + #important step - check also how it looks without it
  geom_ribbon(aes(ymax = fit + se.fit, ymin = fit - se.fit, fill = as.factor(Male)), alpha = 0.2) 

#omit the color by `color = NA` or by `inherit.aes = F` (if like this, one should provide the data and full `aes` mapping for  geom_ribbon). 
#If geom_ribbon should not have a mapping, specify `fill` outside of `aes` like: `fill = grey80`.

查看库 sjPlot


2
投票

使用

augmented()
中的
broom()
进一步回答:

Model <-  glm(Vote ~ Year + I(Year^2) + Male + Foreign, family="binomial", data=D)
summary(Model)


# augmented data frame

model.df = augment(Model) %>% rename(log_odds = `.fitted`, 
                                       Sex = Male)

glimpse(model.df.1)
Observations: 46,398
Variables: 13
$ .rownames  <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21"...
$ Vote       <dbl> 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, ...
$ Year       <int> 2, 3, 4, 5, 2, 3, 2, 3, 4, 5, 6, 2, 2, 2, 3, 4, 2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5, 2, 3, 2, 3, 4, 5, 2, 3, 4, 5, ...
$ I.Year.2.  <S3: AsIs>  4,  9, 16, 25,  4,  9,  4,  9, 16, 25, 36,  4,  4,  4,  9, 16,  4,  9, 16, 25, 36, 49, 64, 81,  4,  9, 16, 2...
$ Sex        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
$ Foreign    <int> 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ log_odds   <dbl> -0.01910985, -0.68184753, -1.14317053, -1.40307885, 0.26930939, -0.39342829, -0.01910985, -0.68184753, -1.14317053...
$ .se.fit    <dbl> 0.01675017, 0.01466136, 0.01790972, 0.02058514, 0.01931826, 0.01777691, 0.01675017, 0.01466136, 0.01790972, 0.0205...
$ .resid     <dbl> -1.1693057, -0.9047053, -0.7439452, 1.8016037, -1.2937083, 1.3483961, -1.1693057, -0.9047053, -0.7439452, -0.66303...
$ .hat       <dbl> 7.013561e-05, 4.794678e-05, 5.879536e-05, 6.711744e-05, 9.162739e-05, 7.602458e-05, 7.013561e-05, 4.794678e-05, 5....
$ .sigma     <dbl> 1.124879, 1.124884, 1.124886, 1.124861, 1.124876, 1.124874, 1.124879, 1.124884, 1.124886, 1.124887, 1.124860, 1.12...
$ .cooksd    <dbl> 1.376354e-05, 4.849628e-06, 3.749311e-06, 5.461011e-05, 2.399355e-05, 2.253792e-05, 1.376354e-05, 4.849628e-06, 3....
$ .std.resid <dbl> -1.1693467, -0.9047270, -0.7439671, 1.8016642, -1.2937676, 1.3484474, -1.1693467, -0.9047270, -0.7439671, -0.66305...


#visualise

        ggplot(model.df.1, aes(Year, log_odds, colour = Sex)) + 
        geom_line() + 
        geom_smooth(se = TRUE) +
       facet_wrap( ~ Foreign)

这给出了:


0
投票

您的方法是正确的,但您需要使用您构建的模型来预测值,如下所示:

ModelPredictions <- predict(Model , type="response")

之后,您可以使用 ggplot 进行绘图:

ggplot(D, aes(x=ModelPredictions , y=D$Vote )) +
  geom_point()  +  stat_smooth(method="glm", se=FALSE, method.args = list(family=binomial)) +  facet_wrap( ~ Foreign)

0
投票

查看 visreg 包:https://pbreheny.github.io/visreg/index.html

它正是您正在寻找的东西,并且语法非常简单,您只需将 glm 插入其中并告诉它您想要显示哪些变量

© www.soinside.com 2019 - 2024. All rights reserved.