我的主要目标是进行调解(使用
mediation
R 包),同时考虑抽样权重(使用 survey
包)。这让我发现,适合 survey::svyglm
的模型和适合 survey::svyolr
的模型的权重存储方式不同。
set.seed(3459823)
library(survey)
library(srvyr)
library(tidyverse)
library(mediation)
n <- 100
# simulate data
dat <- tibble::tibble(
id = 1:100,
y_binary = sample(c(0,1), size = n, replace = TRUE),
m_binary = sample(c(0,1), size = n, replace = TRUE),
m_ordinal = sample(c(1, 2, 3), size = n, replace = TRUE, prob = c(0.2, 0.3, 0.5)) |>
as.factor() |>
as.ordered(),
pred_normal = rnorm(n = n),
weight = rnorm(n = n, mean = 15),
weight_rep1 = rnorm(n = n, mean = weight),
weight_rep2 = rnorm(n = n, mean = weight),
weight_rep3 = rnorm(n = n, mean = weight),
weight_rep4 = rnorm(n = n, mean = weight),
weight_rep5 = rnorm(n = n, mean = weight)
)
# create design object
dat_design <- dat |>
srvyr::as_survey_rep(
repweights = dplyr::contains("_rep"),
weights = weight,
combined_weights = TRUE
)
# ordinal mediator models
model_y_binary <- survey::svyglm(
formula = y_binary ~ m_ordinal + pred_normal,
design = dat_design,
family = "binomial"
)
model_m_ordinal <- survey::svyolr(
formula = m_ordinal ~ pred_normal,
design = dat_design
)
mediation_ordinal <- mediation::mediate(
model.m = model_m_ordinal,
model.y = model_y_binary,
sims = 50,
treat = "pred_normal",
mediator = "m_ordinal"
)
# Won't run!
# Error in mediation::mediate(model.m = model_m_ordinal, model.y = model_y_binary, :
# weights on outcome and mediator models not identical
然后,我使用
model.frame()
函数检查了模型对象中保存的权重,这也是 mediation::mediate
似乎在幕后用来提取权重的方法。
# Inspecting the weights saved by the survey models:
model.frame(model_y_binary) |> dplyr::slice_head()
# Result:
# y_binary m_ordinal pred_normal (weights)
# 1 1 3 -0.09479119 1.004946
# so, the binomial model stores the weights in a column in the model.frame
model.frame(model_m_ordinal) |> dplyr::slice_head()
# Result:
# m_ordinal pred_normal (weights)
# 1 3 -0.09479119 14.92443
# so, the ordinal model stores the weight as well, but they are different!
weight_comparison <- tibble::tibble(
binomial_weights = model.frame(model_y_binary)$`(weights`,
ordinal_weights = model.frame(model_m_ordinal)$`(weights)`,
weight_quotient = binomial_weights / ordinal_weights
)
weight_comparison |> dplyr::slice_head(n = 5)
# Result:
# binomial_weights ordinal_weights weight_quotient
# <dbl> <dbl> <dbl>
# 1 1.00 14.9 0.0673
# 2 1.03 15.2 0.0673
# 3 0.992 14.7 0.0673
# 4 1.01 14.9 0.0673
# 5 0.927 13.8 0.0673
# so, all the binomial weights seem to be equal to the ordinal weights times ~0.0673
为什么会出现这种情况?
有没有办法改变拟合
survey::svyolr
模型对象并强制权重等于存储在svyglm
模型对象中的权重?
是否可以更改
mediation
包以考虑到这一点?例如,仅从 model.m 中获取权重而不将它们与 model.y 进行比较? (当然这也可能导致其他问题。)
调查包中的某些函数重新调整权重以具有单位均值,因为某些国家调查中的大权重(数千或数万)可能会导致收敛问题。
由于结果根本不受影响,因此
mediate
包可能可以相当轻松地解决这个问题。但是,出于此类目的,有一个 rescale=FALSE
选项可以不重新调整权重。如果您在 svyglm
中遇到收敛问题,您可以在进行任何分析之前手动重新调整权重以获得单位均值。