我正在尝试以对数奇数比例绘制逻辑回归的结果。
load(url("https://github.com/bossaround/question/raw/master/logisticregressdata.RData"))
ggplot(D, aes(Year, as.numeric(Vote), color = as.factor(Male))) +
stat_smooth( method="glm", method.args=list(family="binomial"), formula = y~x + I(x^2), alpha=0.5, size = 1, aes(fill=as.factor(Male))) +
xlab("Year")
但是这个图是在 0~1 范围内的。我想这就是概率尺度(如果我错了请纠正我)?
我真正想要的是在将其转换为概率之前将其绘制在对数奇数刻度上,就像逻辑回归报告一样。
理想情况下,我想在控制外国后,在这样的模型中绘制男性的投票和年份之间的关系:
Model <- glm(Vote ~ Year + I(Year^2) + Male + Foreign, family="binomial", data=D)
我可以根据
summary(Model)
手动绘制线条,但我也想绘制置信区间。
类似于我在网上找到的本文档第 44 页上的图片:http://www.datavis.ca/papers/CARME2015-2x2.pdf。我的会有二次曲线。
谢谢!
要绘制具有多个变量的模型的预测,应该创建模型,预测新数据以生成预测并绘制该预测
Model <- glm(Vote ~ Year + I(Year^2) + Male + Foreign, family="binomial", data=D)
for_pred = expand.grid(Year = seq(from = 2, to = 10, by = 0.1), Male = c(0,1), Foreign = c(0,1)) #generate data to get a smooth line
for_pred = cbind(for_pred, predict(Model, for_pred, type = "link", se.fit= T))
#if the probability scale was needed: `type = "response`
library(ggplot2)
ggplot(for_pred, aes(Year, fit, color = as.factor(Male))) +
geom_line() +
xlab("Year")+
facet_wrap(~Foreign) + #important step - check also how it looks without it
geom_ribbon(aes(ymax = fit + se.fit, ymin = fit - se.fit, fill = as.factor(Male)), alpha = 0.2)
#omit the color by `color = NA` or by `inherit.aes = F` (if like this, one should provide the data and full `aes` mapping for geom_ribbon).
#If geom_ribbon should not have a mapping, specify `fill` outside of `aes` like: `fill = grey80`.
查看库 sjPlot
使用
augmented()
中的 broom()
进一步回答:
Model <- glm(Vote ~ Year + I(Year^2) + Male + Foreign, family="binomial", data=D)
summary(Model)
# augmented data frame
model.df = augment(Model) %>% rename(log_odds = `.fitted`,
Sex = Male)
glimpse(model.df.1)
Observations: 46,398
Variables: 13
$ .rownames <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21"...
$ Vote <dbl> 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, ...
$ Year <int> 2, 3, 4, 5, 2, 3, 2, 3, 4, 5, 6, 2, 2, 2, 3, 4, 2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5, 2, 3, 2, 3, 4, 5, 2, 3, 4, 5, ...
$ I.Year.2. <S3: AsIs> 4, 9, 16, 25, 4, 9, 4, 9, 16, 25, 36, 4, 4, 4, 9, 16, 4, 9, 16, 25, 36, 49, 64, 81, 4, 9, 16, 2...
$ Sex <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
$ Foreign <int> 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ log_odds <dbl> -0.01910985, -0.68184753, -1.14317053, -1.40307885, 0.26930939, -0.39342829, -0.01910985, -0.68184753, -1.14317053...
$ .se.fit <dbl> 0.01675017, 0.01466136, 0.01790972, 0.02058514, 0.01931826, 0.01777691, 0.01675017, 0.01466136, 0.01790972, 0.0205...
$ .resid <dbl> -1.1693057, -0.9047053, -0.7439452, 1.8016037, -1.2937083, 1.3483961, -1.1693057, -0.9047053, -0.7439452, -0.66303...
$ .hat <dbl> 7.013561e-05, 4.794678e-05, 5.879536e-05, 6.711744e-05, 9.162739e-05, 7.602458e-05, 7.013561e-05, 4.794678e-05, 5....
$ .sigma <dbl> 1.124879, 1.124884, 1.124886, 1.124861, 1.124876, 1.124874, 1.124879, 1.124884, 1.124886, 1.124887, 1.124860, 1.12...
$ .cooksd <dbl> 1.376354e-05, 4.849628e-06, 3.749311e-06, 5.461011e-05, 2.399355e-05, 2.253792e-05, 1.376354e-05, 4.849628e-06, 3....
$ .std.resid <dbl> -1.1693467, -0.9047270, -0.7439671, 1.8016642, -1.2937676, 1.3484474, -1.1693467, -0.9047270, -0.7439671, -0.66305...
#visualise
ggplot(model.df.1, aes(Year, log_odds, colour = Sex)) +
geom_line() +
geom_smooth(se = TRUE) +
facet_wrap( ~ Foreign)
这给出了:
您的方法是正确的,但您需要使用您构建的模型来预测值,如下所示:
ModelPredictions <- predict(Model , type="response")
之后,您可以使用 ggplot 进行绘图:
ggplot(D, aes(x=ModelPredictions , y=D$Vote )) +
geom_point() + stat_smooth(method="glm", se=FALSE, method.args = list(family=binomial)) + facet_wrap( ~ Foreign)
查看 visreg 包:https://pbreheny.github.io/visreg/index.html
它正是您正在寻找的东西,并且语法非常简单,您只需将 glm 插入其中并告诉它您想要显示哪些变量