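Is it possible to extract pooled estimates from several models fitted to multiply imputed data?
Below is how I do this for a complete-case data frame (i.e. with no missing data). I would like a similar procedure that extracts the equivalent results from several models fitted to imputed data: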
library(tidyverse)
library(broom)
library(mice)
data <- nhanes
sapply(data, function(x) sum(is.na(x))) # count missing values per column
data <- data %>% filter(!is.na(bmi), !is.na(hyp), !is.na(chl)) # keep complete cases only
out <-c("bmi")
exp <- c("chl","age","factor(hyp)")
#run models and extract to tidy data frame
models <- expand.grid(out, exp) %>%
  group_by(Var1) %>% rowwise() %>%
  summarise(frm = paste0(Var1, "~", Var2)) %>%
  group_by(model_id = row_number(),frm) %>%
  do(tidy(lm(.$frm, data = data))) %>%
  mutate(lci = estimate-(1.96*std.error),
         uci = estimate+(1.96*std.error))
Below is an example that uses mice to impute the missing data and fit only a single regression model:
# Impute missing data using mice
data <- nhanes
imp <- mice(data, print = F)
#Fit single model
fit <- with(imp, lm(bmi ~ chl))
#Get pooled estimates
a <- pool(fit)
summary(a)
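For context, pool() combines the per-imputation fits using Rubin's rules. The following is a minimal sketch of that calculation for the chl coefficient, assuming the same nhanes example; the seed value and the variable names (est, se2, qbar, ubar, b, tvar) are illustrative choices, not part of the original code.

```r
library(mice)

# Seed chosen arbitrarily so the sketch is reproducible
imp <- mice(nhanes, printFlag = FALSE, seed = 1)
fit <- with(imp, lm(bmi ~ chl))

m   <- length(fit$analyses)                                    # number of imputations
est <- sapply(fit$analyses, function(f) coef(f)["chl"])        # per-imputation estimates
se2 <- sapply(fit$analyses, function(f) vcov(f)["chl", "chl"]) # per-imputation variances

qbar <- mean(est)                 # pooled estimate
ubar <- mean(se2)                 # within-imputation variance
b    <- var(est)                  # between-imputation variance
tvar <- ubar + (1 + 1 / m) * b    # total variance

c(estimate = qbar, std.error = sqrt(tvar))
```

The two numbers should agree with the chl row of summary(pool(fit)).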
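The key point here is to start from complete(imp, "long"), because it returns all the imputed datasets stacked together. After that, you need to combine a few tidyverse and broom functions, especially nest() and tidy(), which are very useful here. Try this: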
library(tidyverse)
library(broom)
library(mice)
data <- nhanes # data
imp <- mice(data, print = F) # imputation
# complete data
data.complete <- complete(imp, "long")
glimpse(data.complete) # all 5 imputed datasets are stacked here
data.complete %>%
  select(-.id) %>%
  nest(data = -.imp) %>% # one nested data frame per imputation
  mutate(model = map(data, ~lm(bmi ~ chl, data = .)),
         tidied = map(model, tidy)) %>%
  unnest(tidied) %>%
  filter(term == "chl") %>%
  mutate(adjusted = p.adjust(p.value),
         lci = estimate-(1.96*std.error),
         uci = estimate+(1.96*std.error))
# output
.imp term estimate std.error statistic p.value adjusted lci uci
1 1 chl 0.01972747 0.01755024 1.124057 0.272584078 0.45430932 -0.0146709916 0.05412594
2 2 chl 0.02133664 0.01719462 1.240891 0.227154661 0.45430932 -0.0123648105 0.05503808
3 3 chl 0.03070542 0.01534959 2.000407 0.057397701 0.22512674 0.0006202261 0.06079062
4 4 chl 0.04109955 0.02044568 2.010183 0.056281686 0.22512674 0.0010260220 0.08117308
5 5 chl 0.05448964 0.01585764 3.436175 0.002251967 0.01125984 0.0234086522 0.08557062
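The table above has one row per imputation rather than a single pooled row. If one pooled estimate per model is what is needed, a possible sketch is to fit each formula on every completed dataset and pass the list of fits to pool(). The helper pool_one, the seed, and the formula column are hypothetical names introduced for illustration; the outcome and predictors are taken from the question.

```r
library(mice)

# Seed chosen arbitrarily so the sketch is reproducible
imp <- mice(nhanes, printFlag = FALSE, seed = 2024)

outcome    <- "bmi"
predictors <- c("chl", "age", "factor(hyp)")
formulas   <- paste(outcome, "~", predictors)

# Hypothetical helper: fit one formula on each imputed dataset, then pool
pool_one <- function(frm) {
  fits <- lapply(seq_len(imp$m), function(i) {
    lm(as.formula(frm), data = complete(imp, i))
  })
  res <- as.data.frame(summary(pool(as.mira(fits)), conf.int = TRUE))
  cbind(formula = frm, res)
}

# One block of pooled coefficients per model, stacked into a single data frame
pooled <- do.call(rbind, lapply(formulas, pool_one))
pooled
```

With conf.int = TRUE the interval comes from the pooled standard error and degrees of freedom, so it will generally differ from the estimate ± 1.96 × std.error bounds computed per imputation.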