Predicted SEs differ from ggeffects::ggpredict: what am I doing wrong? "object not found" in the predict function


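I am running an analysis in R of how overhead cover (OverheadCover, a proportion bounded between 0 and 1) and the number of carcasses placed at the same location (CarcassNumber, a factor with 2 levels) affect the proportion of carrion eaten by birds (ProportionBirdsScavenging, a proportion bounded between 0 and 1). I plot this interaction by modelling the effect of OverheadCover on ProportionBirdsScavenging separately for each value of CarcassNumber and then drawing both fits in the same graph. After doing this, I noticed that the SEs calculated by plot_model(glmm_interaction, type = "int") differ from the SEs I calculated myself. That is where the investigation started. It led me through the source code of plot_model, plot_type_int and ggpredict, and stopped at ggpredict_helper. Unfortunately I did not find where the SE is actually calculated, but I did find proof that the SEs differ. I used to be confident that I was calculating the SEs correctly, but now I am not so sure. Please see my code below.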

#rm(list = ls())

library(glmmTMB)
library(dplyr)

data_both <- data.frame(ProportionBirdsScavenging = c(0.406192519926425, 0.871428571428571, 0.452995391705069, 0.484821428571429, 0.795866569978245, 0.985714285714286, 0.208571428571429, 0.573982970671712, 0.694285714285714, 0.930204081632653, 0.0483709273182957, 0.0142857142857143, 0.661904761904762, 0.985714285714286, 0.0142857142857143, 0.0142857142857143),
                        pointWeight = c(233, 17, 341, 128, 394, 46, 5, 302, 10, 35, 57, 39, 12, 229, 28, 116),
                        OverheadCover = c(0.671, 0.04, 0.46, 0.65, 0.02, 0, 0.8975, 0.585, 0.6795, 0.0418, 0.5995, 0.6545, 0.02, 0, 0.92, 0.585),
                        CarcassNumber = as.factor(c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2)),
                        Area = c("Hamert", "KempenBroek", "KempenBroek", "KempenBroek", "Markiezaat", "Markiezaat", "Meinweg", "Valkenhorst", "Hamert", "KempenBroek", "KempenBroek", "KempenBroek", "Markiezaat", "Markiezaat", "Meinweg", "Valkenhorst"))
data_both$pointWeight_scaled <- scales::rescale(data_both$pointWeight, to = c(0.0001,1)) # rescale weights

glmm_interaction <- glmmTMB(ProportionBirdsScavenging ~ OverheadCover * CarcassNumber + (1|Area), data = data_both, beta_family(link = "logit"), weights = pointWeight_scaled)
ggeffects::ggpredict(glmm_interaction, terms = c("OverheadCover", "CarcassNumber [1:2]")) # SE's calculated by r

# Calculate the SE's for CarcassNumber 1
df_first_carcasses <- filter(data_both, CarcassNumber == 1) # create df with only first carcasses
myglmm <- glmmTMB(ProportionBirdsScavenging ~ OverheadCover, data = df_first_carcasses, beta_family(link = "logit"), weights = pointWeight_scaled)
new.xglmm <- expand.grid(OverheadCover = seq(min(data_both$OverheadCover), max(data_both$OverheadCover), length.out = 1000)) %>%
  mutate(Area = "Hamert", pointWeight_scaled = 1) # pad new.xglmm with an arbitrary value for Area and pointWeight_scaled, then exclude them in predict, otherwise error -> https://stackoverflow.com/questions/54411851/mgcv-how-to-use-exclude-argument-in-predict-gam
new.yglmm <- data.frame(predict(myglmm, new.xglmm, type = "link", exclude = c("Area","pointWeight_scaled"), se.fit = TRUE)) %>% # exclude Area and pointWeight_scaled from the prediction
  mutate(ProportionBirdsScavenging = plogis(fit)) %>% # calculate the ProportionBirdsScavenging on response scale
  rename(SE.untransformed = se.fit, untransformed.predictions = fit)
addTheseglmm1 <- mutate(data.frame(new.xglmm, new.yglmm), # calculate the lwr and upr bounds using the untransformed predictions and SE, then transformed to probability
                        lwr = plogis(untransformed.predictions - SE.untransformed),
                        upr = plogis(untransformed.predictions + SE.untransformed))
addTheseglmm1[c(1,45,501,653,707,1000),c(1,6,5)] # compare my SE's to the ggpredict SE's

#  calculated by ggpredict   >   manually calculated
#    x | Predicted |   SE |  >      SE.untransformed
# -------------------------  >   -------------------
# 0.00 |      0.82 | 0.44 |  >                  0.43
# 0.04 |      0.80 | 0.41 |  >                  0.40
# 0.46 |      0.56 | 0.22 |  >                  0.21
# 0.60 |      0.47 | 0.26 |  >                  0.25
# 0.65 |      0.44 | 0.29 |  >                  0.28
# 0.92 |      0.28 | 0.47 |  >                  0.46
#  ( CarcassNumber = 1 )

# Same for CarcassNumber 2
df_second_carcasses <- filter(data_both, CarcassNumber == 2) # create df with only second carcasses
myglmm <- glmmTMB(ProportionBirdsScavenging ~ OverheadCover, data = df_second_carcasses, beta_family(link = "logit"), weights = pointWeight_scaled)
new.xglmm <- expand.grid(OverheadCover = seq(min(data_both$OverheadCover), max(data_both$OverheadCover), length.out = 1000)) %>%
  mutate(Area = "Hamert", pointWeight_scaled = 1) 
new.yglmm <- data.frame(predict(myglmm, new.xglmm, type = "link", exclude = c("Area","pointWeight_scaled"), se.fit = TRUE)) %>% 
  mutate(ProportionBirdsScavenging = plogis(fit)) %>%
  rename(SE.untransformed = se.fit, untransformed.predictions = fit)
addTheseglmm2 <- mutate(data.frame(new.xglmm, new.yglmm),
                        lwr = plogis(untransformed.predictions - SE.untransformed),
                        upr = plogis(untransformed.predictions + SE.untransformed))

addTheseglmm2[c(1,45,501,653,707,1000),c(1,6,5)]

#  calculated by ggpredict   >   manually calculated
#    x | Predicted |   SE |  >      SE.untransformed
# -------------------------  >   -------------------
# 0.00 |      0.96 | 1.01 |  >                  1.21
# 0.04 |      0.95 | 0.94 |  >                  1.10
# 0.46 |      0.16 | 0.86 |  >                  0.96
# 0.60 |      0.04 | 1.11 |  >                  1.32
# 0.65 |      0.02 | 1.22 |  >                  1.46
# 0.92 |      0.00 | 1.83 |  >                  2.31
#  ( CarcassNumber = 2 )

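The difference is most obvious for CarcassNumber 2. Is my calculation incorrect? I suspect the difference may have something to do with including and then excluding pointWeight_scaled in the predict function. If I do not do this, predict returns the error Error in eval(extras, data, env) : object 'pointWeight_scaled' not found. Is this a common problem? I read in "mgcv: How to use 'exclude' argument in predict.gam?" (https://stackoverflow.com/questions/54411851/mgcv-how-to-use-exclude-argument-in-predict-gam) that including the variable and then excluding it solves the problem. Is that not the right approach?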
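One thing that may be worth checking, offered as a possibility rather than a diagnosis: the per-subset models in the code above (ProportionBirdsScavenging ~ OverheadCover, fitted to one CarcassNumber level at a time) are not the same model as glmm_interaction, which also contains (1|Area) and is fitted to all rows at once. Even without random effects, a joint interaction model and separate per-group models generally share point predictions but not standard errors, because the joint model pools its variance estimate. The sketch below illustrates this with plain lm on simulated data. The data frame sim, its columns and all numbers in it are hypothetical and unrelated to the carcass data.

```r
# Hypothetical simulated data (not the carcass data): two groups with
# different slopes and different residual spread.
set.seed(1)
n <- 40
sim <- data.frame(
  x = runif(2 * n),
  g = factor(rep(c(1, 2), each = n))
)
sim$y <- ifelse(sim$g == 1, 1 - 1.0 * sim$x, 2 - 3.0 * sim$x) +
  rnorm(2 * n, sd = ifelse(sim$g == 1, 0.2, 1.0))

# One joint model with an interaction vs. a separate model for group 2 only.
fit_joint  <- lm(y ~ x * g, data = sim)
fit_subset <- lm(y ~ x, data = subset(sim, g == 2))

# Predict for group 2 on a small grid from both models.
grid <- data.frame(x = c(0, 0.5, 1), g = factor(2, levels = c(1, 2)))
p_joint  <- predict(fit_joint,  newdata = grid, se.fit = TRUE)
p_subset <- predict(fit_subset, newdata = grid, se.fit = TRUE)

# Side-by-side comparison of fitted values and standard errors.
comparison <- data.frame(
  x          = grid$x,
  fit_joint  = p_joint$fit,
  fit_subset = p_subset$fit,
  se_joint   = p_joint$se.fit,
  se_subset  = p_subset$se.fit
)
print(comparison)
```

In this linear-model analogue the fitted values agree while the standard errors do not, since the joint fit uses one residual variance for both groups. Whether the same mechanism explains the glmmTMB discrepancy here is an assumption that would need checking against the actual models.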

I hope someone can shed some light on this.
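As a side note on the lwr and upr columns computed in the code above: plogis(fit - SE) and plogis(fit + SE) span one standard error on the link scale, which under a normal approximation covers roughly 68% rather than 95%. The sketch below uses the first row of the CarcassNumber = 1 table (x = 0.00, predicted 0.82, SEs 0.44 and 0.43) to show both versions. It assumes both SE columns are on the logit scale, as the side-by-side comparison implies, and it starts from the rounded prediction, so the results are approximate.

```r
# Values read from the first row of the CarcassNumber = 1 table (x = 0.00).
predicted <- 0.82                                # rounded response-scale prediction
se_link   <- c(ggpredict = 0.44, manual = 0.43)  # SEs, assumed to be on the logit scale

# Move the prediction back to the logit (link) scale.
eta <- qlogis(predicted)

# fit +/- 1 SE, as used for lwr and upr in the code above.
one_se <- cbind(lwr = plogis(eta - se_link), upr = plogis(eta + se_link))

# fit +/- z * SE for a conventional 95% Wald interval.
z <- qnorm(0.975)
wald95 <- cbind(lwr = plogis(eta - z * se_link), upr = plogis(eta + z * se_link))

print(round(one_se, 3))
print(round(wald95, 3))
```

Back-transforming the bounds, rather than the SE itself, keeps the interval inside (0, 1), which is why the bounds are built on the link scale first.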

r prediction confidence-interval
1 Answer

Hm, when I run your example, I get the same results you expected:
