我使用 R 中的 lme4 包将广义线性混合模型拟合到我的数据集。模型公式和输出如下:
因为我发现“句号”和“附加语”之间存在显着的相互作用,所以我尝试通过提取它们的效果并绘制与包效果的相互作用来进行更深入的研究。我尝试使用以下脚本获得效果:
var.adSem <- data.frame(Effect(c("period", "AdjunctSem"), m7))
head(var.adSem)
var.adSem $ period <- droplevels(var.adSem $ period)
levels(var.adSem $ period)
var.adSem $ AdjunctSem <- droplevels(var.adSem $ AdjunctSem)
levels(var.adSem $ AdjunctSem)
这些是我提取的效果:
我尝试使用下面的脚本绘制效果:
perc <- function(x) {paste(100 * x, "%")}
Plot1<-(ggplot(var.adSem,aes(x=factor(period, level=c('period_1', 'period_2','period_3')),fit, shape=AdjunctSem))
+ geom_hline(yintercept=.5, linetype="dashed", alpha=.5)
+ geom_errorbar(aes(ymin=fit-se,ymax=fit+se),width=.2,position = position_dodge(0.3))
# + annotate("rect", xmin = c(4.5, 6.5), xmax = c(5.5, 9.5),
# ymin = -Inf, ymax = Inf,
# alpha = 0.2, fill = c("#6ac2ee"))
+ geom_point(size=4, position = position_dodge(0.3))
+ labs(y="Probability of the ba-construction", x="period")
+ theme(#plot.title = element_text(size=20, face="bold", vjust=2),
plot.title = element_blank(),
axis.title.x = element_text(size=14, vjust=-0.5),
axis.text.x = element_text(angle=90, hjust=1, vjust=0.5),
axis.title.y = element_text(size=14, vjust=1.5),
legend.position = "top")
+ scale_y_continuous(label=function(x){paste(100*x,"%")},
breaks=seq(0,1,.1),
limits=c(0,1)))
结果图如下所示:
从效果图来看,根本没有明显的交互作用。但是,模型输出清楚地表明存在显着的交互(如果您想访问数据集,请p<0.001). Is it because I made a mistake somewhere in the scripts? Please click this 链接。
您使用的是二项式模型,这意味着模型摘要中报告的显着差异是“对数赔率”的差异。但是,这些已在 var.adSem
中转换为 概率。
我们可以通过采用代表对数赔率的向量并看看将它们转换为概率时会发生什么来说明这一点。首先,对数赔率:
library(ggplot2)
df <- data.frame(x = seq(-10, 10), log_odds = seq(-10, 10))
ggplot(df, aes(x, log_odds)) + geom_line() + geom_point()
现在我们使用
plogis
将对数赔率转换为概率:
ggplot(df, aes(x, plogis(log_odds))) + geom_line() + geom_point()
请注意,即使整个范围内对数赔率不断变化,但在对数赔率高于 5 或低于对数赔率 -5 时,概率的变化变得很难看到,因为概率接近 0 或接近 1。
使用您自己的数据,这意味着如果您想证明组之间的差异,您需要将结果显示为对数赔率。您可以转换 y 轴,使其仍然显示概率,尽管概率非常接近 0 和 1,以至于百分比成为非常规选择:
ggplot(var.adSem, aes(x = period, y = qlogis(fit), shape = AdjunctSem)) +
geom_hline(yintercept = qlogis(0.5), linetype = "dashed", alpha = 0.5) +
geom_errorbar(aes(ymin = qlogis(lower), ymax = qlogis(upper)), width = 0.2,
position = position_dodge(0.3)) +
geom_point(size = 4, position = position_dodge(0.3)) +
labs(x = "period") +
theme(plot.title = element_blank(),
axis.title.x = element_text(size = 14, vjust = -0.5),
axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
axis.title.y = element_text(size = 14, vjust = 1.5),
legend.position = "top") +
scale_y_continuous("Probability of the ba-construction",
labels = ~scales::percent(plogis(.x)),
breaks = qlogis(c(1e-9, 1e-6, 1e-3, 0.5,
1 - 1e-3, 1 - 1e-6, 1 - 1e-9)))
var.adSem <- structure(list(period = c("period_2", "period_1", "period_3",
"period_2", "period_1", "period_3", "period_2", "period_1", "period_3"
), AdjunctSem = c("adverbial", "adverbial", "adverbial", "complement",
"complement", "complement", "withoutAdjunct", "withoutadjunct",
"withoutadjunct"), fit = c(3.51860434e-05, 2.737536e-07, 0.9999899965756,
1.97090038e-05, 4.566828e-07, 0.9999909905965, 3.5261392e-05,
2.81571e-07, 0.9999885020845), se = c(3.46701653e-05, 5.871505e-07,
2.82682182e-05, 1.96541803e-05, 9.787531e-07, 2.56165218e-05,
3.48405297e-05, 6.051007e-07, 3.28484263e-05), lower = c(5.100729398e-06,
4.089689e-09, 0.997462272508378, 2.79139626e-06, 6.844419e-09,
0.997634501991402, 5.084413615e-06, 4.171972e-09, 0.996901835438947
), upper = c(0.00024267862, 1.832405e-05, 0.99999996067, 0.00013914363,
3.047052e-05, 0.99999996577, 0.00024450078, 1.900319e-05, 0.99999995746
)), class = "data.frame", row.names = c("1", "2", "3", "4", "5",
"6", "7", "8", "9"))