为什么我用lme4和效果包生成的效果图看起来是错误的?

问题描述 投票:0回答:1

我使用 R 中的 lme4 包将广义线性混合模型拟合到我的数据集。模型公式和输出如下:

因为我发现“句号”和“附加语”之间存在显着的相互作用,所以我尝试通过提取它们的效果并绘制与包效果的相互作用来进行更深入的研究。我尝试使用以下脚本获得效果:

var.adSem <- data.frame(Effect(c("period", "AdjunctSem"), m7))
head(var.adSem)
var.adSem $ period <- droplevels(var.adSem $ period)
levels(var.adSem $ period)
var.adSem $ AdjunctSem <- droplevels(var.adSem $ AdjunctSem)
levels(var.adSem $ AdjunctSem)

这些是我提取的效果:

我尝试使用下面的脚本绘制效果:

perc <- function(x) {paste(100 * x, "%")}
Plot1<-(ggplot(var.adSem,aes(x=factor(period, level=c('period_1', 'period_2','period_3')),fit, shape=AdjunctSem))
        + geom_hline(yintercept=.5, linetype="dashed", alpha=.5)  
        + geom_errorbar(aes(ymin=fit-se,ymax=fit+se),width=.2,position = position_dodge(0.3))
        #    + annotate("rect", xmin = c(4.5, 6.5), xmax = c(5.5, 9.5),
        #           ymin = -Inf, ymax = Inf,
        #           alpha = 0.2, fill = c("#6ac2ee"))
        + geom_point(size=4, position = position_dodge(0.3))
        + labs(y="Probability of the ba-construction", x="period")
        + theme(#plot.title = element_text(size=20, face="bold", vjust=2),
          plot.title   = element_blank(),
          axis.title.x = element_text(size=14, vjust=-0.5),
          axis.text.x  = element_text(angle=90, hjust=1, vjust=0.5),
          axis.title.y = element_text(size=14, vjust=1.5),
          legend.position = "top")
        + scale_y_continuous(label=function(x){paste(100*x,"%")}, 
                             breaks=seq(0,1,.1),
                             limits=c(0,1)))

结果图如下所示:

从效果图来看,根本没有明显的交互作用。但是,模型输出清楚地表明存在显着的交互(如果您想访问数据集,请p<0.001). Is it because I made a mistake somewhere in the scripts? Please click this 链接

r ggplot2 regression logistic-regression effects
1个回答
0
投票

您使用的是二项式模型,这意味着模型摘要中报告的显着差异是“对数赔率”的差异。但是,这些已在 var.adSem 中转换为 概率

如果对数赔率很大,则等效概率接近于 1。大的负对数赔率将接近 0。这种关系不是线性的,因此,如果您将对数赔率视为概率,则高对数赔率将全部显示为接近 1 的压缩概率,而低对数赔率将显示为压缩概率接近于0。

我们可以通过采用代表对数赔率的向量并看看将它们转换为概率时会发生什么来说明这一点。首先,对数赔率:

library(ggplot2) df <- data.frame(x = seq(-10, 10), log_odds = seq(-10, 10)) ggplot(df, aes(x, log_odds)) + geom_line() + geom_point()

现在我们使用

plogis

将对数赔率转换为概率:

ggplot(df, aes(x, plogis(log_odds))) + geom_line() + geom_point()

请注意,即使整个范围内对数赔率不断变化,但在对数赔率高于 5 或低于对数赔率 -5 时,概率的变化变得很难看到,因为概率接近 0 或接近 1。

使用您自己的数据,这意味着如果您想证明组之间的差异,您需要将结果显示为对数赔率。您可以转换 y 轴,使其仍然显示概率,尽管概率非常接近 0 和 1,以至于百分比成为非常规选择:

ggplot(var.adSem, aes(x = period, y = qlogis(fit), shape = AdjunctSem)) + geom_hline(yintercept = qlogis(0.5), linetype = "dashed", alpha = 0.5) + geom_errorbar(aes(ymin = qlogis(lower), ymax = qlogis(upper)), width = 0.2, position = position_dodge(0.3)) + geom_point(size = 4, position = position_dodge(0.3)) + labs(x = "period") + theme(plot.title = element_blank(), axis.title.x = element_text(size = 14, vjust = -0.5), axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1), axis.title.y = element_text(size = 14, vjust = 1.5), legend.position = "top") + scale_y_continuous("Probability of the ba-construction", labels = ~scales::percent(plogis(.x)), breaks = qlogis(c(1e-9, 1e-6, 1e-3, 0.5, 1 - 1e-3, 1 - 1e-6, 1 - 1e-9)))

创建于 2023-08-06,使用

reprex v2.0.2


从相关表的 OCR 中获取的数据

var.adSem <- structure(list(period = c("period_2", "period_1", "period_3", "period_2", "period_1", "period_3", "period_2", "period_1", "period_3" ), AdjunctSem = c("adverbial", "adverbial", "adverbial", "complement", "complement", "complement", "withoutAdjunct", "withoutadjunct", "withoutadjunct"), fit = c(3.51860434e-05, 2.737536e-07, 0.9999899965756, 1.97090038e-05, 4.566828e-07, 0.9999909905965, 3.5261392e-05, 2.81571e-07, 0.9999885020845), se = c(3.46701653e-05, 5.871505e-07, 2.82682182e-05, 1.96541803e-05, 9.787531e-07, 2.56165218e-05, 3.48405297e-05, 6.051007e-07, 3.28484263e-05), lower = c(5.100729398e-06, 4.089689e-09, 0.997462272508378, 2.79139626e-06, 6.844419e-09, 0.997634501991402, 5.084413615e-06, 4.171972e-09, 0.996901835438947 ), upper = c(0.00024267862, 1.832405e-05, 0.99999996067, 0.00013914363, 3.047052e-05, 0.99999996577, 0.00024450078, 1.900319e-05, 0.99999995746 )), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9"))

© www.soinside.com 2019 - 2024. All rights reserved.