我曾经用lapply来模拟各种大小和“形状”的对数伽玛数据,然后将伽玛和对数正态分布拟合到这些模拟数据。
这是我的代码:
set.seed(100)
#One Predictor
b_0 <- 0.1
b_1 <- 0.2
#produce design matrix to enumerate every combination of size and shape parameter for underlying simulated data
design.matrix <- expand.grid("size" = c(5,10,15,30), "alpha" = seq(0.1, 1, by=0.1))
#Log-Gamma GLMs
gm_glog <- lapply(1:nrow(design.matrix), function(row.i) {
# Get values for current row
size.i <- design.matrix$size[row.i] #size specified in current row of design matrix
alpha.i <- design.matrix$alpha[row.i] #alpha specified in current row of design matrix
x_i <- runif(size.i, 0, 1) #draw a sample of size 'size'
y.true <- exp(b_0 + b_1*x_i) #produce log gamma data
y_i <- rgamma(size.i, rate = alpha.i/y.true, shape = alpha.i) #random gamma sample produced according to shape and size values of current row
#Gamma Model
glm(y_i ~ x_i,
family = Gamma(link = "log"),
control = glm.control(maxit=100,
trace = TRUE),
start = c(0.1, 0.2))
} )
#Log-Normal GLMs
lnorm_glog <- lapply(1:nrow(design.matrix), function(row.i) {
# Get values for current row
size.i <- design.matrix$size[row.i] #size specified in current row of design matrix
alpha.i <- design.matrix$alpha[row.i] #alpha specified in current row of design matrix
x_i <- runif(size.i, 0, 1) #draw a sample of size 'size'
y.true <- exp(b_0 + b_1*x_i) #produce log gamma data
y_i <- rgamma(size.i, rate = alpha.i/y.true, shape = alpha.i) #random gamma sample produced according to shape and size values of current row
#Lognormal Model
glm(y_i ~ x_i, family = gaussian(link = "log"), control = glm.control(maxit=500, trace = TRUE), start = c(0.1, 0.2))
} )
我跑步时
lapply(gm_glog, summary)
和lapply(lnorm_glog, summary)
我获得了基础模拟数据中样本大小和alpha的每种组合的glm拟合摘要列表(每个拟合40项摘要)。
[我现在的问题是,我想在单个表中对这些回归结果进行并排比较,其中设计矩阵的每一行[1]对应于lapply的第一个元素(gm_glog,摘要)和lapply(lnorm_glog,摘要),然后对于row [2],一直到row [40]。
理想情况下,看起来像
大小| alpha |摘要gamma glm |摘要lognormal glm
共有40行,每行分别对应大小和alpha。
基本上,我只想合并design.matrix,gm_glog的摘要和lnorm_glog的摘要。
不幸的是,处理glm摘要列表很困难,我找不到一种像想要的那样逐行合并这些结果的方法。
我已经看到使用lapply(model, tidy)
和lapply(model, glance)
为我提供所有这些摘要所需的所有信息,但是这两项都给我留下了数据帧列表,并且逐行组合它们也使我难以理解。如果要使用此方法,我仍然想将lapply(model,tidy)的row [1]与lapply(model,glance)的row [1],lapply(model,tidy)的row [2]与lapply(model,glance)等的row [2],即使每个列表的行都是不同尺寸的小对象。
我如何最好地做到这一点?有没有更简单的方法可以实现我想要的?
编辑:
根据建议,我需要指定要保留摘要的哪些部分,我想保留
第二次编辑:我已经设法获得了偏差残差
b <- lapply(gm_glog, glance)
lapply(b, deviance)
但这仍然给我留下了一个单元素列表的列表。仍然不确定如何将它们合并到AIC值等。
考虑使用Map
的元素明智循环(包装到mapply
来构建数据帧列表,并在每次迭代中运行两个模型,然后将summary
的所需组件提取到数据帧:
定义的方法
log_models <- function(size.i, alpha.i) {
x_i <- runif(size.i, 0, 1) # draw a sample of size 'size'
y.true <- exp(b_0 + b_1*x_i) # produce log gamma data
y_i <- rgamma(size.i, rate = alpha.i/y.true, shape = alpha.i) # random gamma sample
# Gamma Model
log_gamma_model <- glm(y_i ~ x_i, family = Gamma(link = "log"),
control = glm.control(maxit=100, trace = TRUE),
start = c(0.1, 0.2))
log_gamma_summ <- summary(log_gamma_model)
# Lognormal Model
log_norm_model <- glm(y_i ~ x_i, family = gaussian(link = "log"),
control = glm.control(maxit=500, trace = TRUE),
start = c(0.1, 0.2))
log_norm_summ <- summary(log_norm_model)
# DATA FRAME BUILD
data.frame(size = size.i,
alpha = alpha.i,
gamma_mod_int = log_gamma_summ$coefficients["(Intercept)", "Estimate"],
gamma_mod_est = log_gamma_summ$coefficients["x_i", "Estimate"],
gamma_mod_aic = log_gamma_summ$aic,
gamma_mod_dev = log_gamma_summ$deviance.resid[length(log_gamma_summ$deviance.resid)],
gamma_mod_shape = MASS::gamma.shape(log_gamma_model)$alpha,
norm_mod_int = log_norm_summ$coefficients["(Intercept)", "Estimate"],
norm_mod_est = log_norm_summ$coefficients["x_i", "Estimate"],
norm_mod_aic = log_norm_summ$aic,
norm_mod_dev = log_norm_summ$deviance.resid[length(log_norm_summ$deviance.resid)]
)
}
Map
/ mapply
通话
df_list <- Map(log_models, design.matrix$size, design.matrix$alpha)
# df_list <- mapply(log_models, design.matrix$size, design.matrix$alpha, SIMPLIFY=FALSE)
final_df <- do.call(rbind, df_list)
输出
final_df
# size alpha gamma_mod_int gamma_mod_est gamma_mod_aic gamma_mod_dev gamma_mod_shape norm_mod_int norm_mod_est norm_mod_aic norm_mod_dev
# 5 5 0.1 -2.39484838 3.808953 2.349387 1.6062347 0.25294152 -0.3943182 0.4366572 21.50163 2.2462398978
# 10 10 0.1 -0.03146698 -1.752435 -48.768787 -2.4685411 0.15839450 -769.8179792 797.7937171 16.72900 0.0073639677
# 15 15 0.1 -6.22434742 11.420125 -146.836144 2.7585789 0.11692945 -0.1601247 1.6135214 102.27202 22.0098432208
# 30 30 0.1 0.26381051 1.067361 -298.873575 -4.7725793 0.08641668 0.2565112 1.0687070 195.59417 -1.7643885736
# 51 5 0.2 -12.23809196 12.760998 -52.109115 0.0412409 0.31666275 -11.1636898 11.2453833 -48.17426 0.0006702163
# 101 10 0.2 1.51817293 -6.261376 -91.417016 -0.7455693 0.12372107 -0.4463434 -1.1394914 31.86825 -0.1580558441
# 151 15 0.2 -0.54878568 3.672312 -17.724359 -1.0910863 0.14922850 -2.7737690 6.2481058 101.48735 0.0621486528
# 301 30 0.2 0.84636917 -1.208503 -25.603596 0.1811917 0.19949756 0.6339933 -0.6533998 168.03056 0.0819567624
# 52 5 0.3 -0.45653740 -2.541001 4.907533 0.8486617 0.66655843 -0.7883221 -0.7289522 10.27774 0.4708082262
# 102 10 0.3 0.70548641 -2.790209 13.450575 0.3375955 0.54226062 1.3245745 -9.0701981 24.19732 -0.8978180162
...
解决此问题的方法是先查看:
str(gm_glog[[1]]
...。并确定所需项目的名称:
对于截距和坡度:
do.call( rbind, sapply(gm_glog, function(x){ x[c("coefficients")]}) )
(Intercept) x_i
coefficients 2.33991821 -20.7836582
coefficients 13.33466647 -31.4034737
coefficients 2.24020883 -3.1949161
coefficients -1.41151531 1.0243415
coefficients -0.81649523 1.2787418
coefficients -1.53695481 0.7518618
coefficients -4.86985066 7.5985577
snipped the rest
关于AIC和偏差残差值:
这是一种用于返回矩阵的列的方法,该列是AIC(在列表项“ aic”中找到)和剩余偏差(在列表项“ deviance”中找到)值。一如既往,R从sapply
调用返回的元素数量相同,结果在结果矩阵的列中,您需要进行转置以获得与设计矩阵对齐的部分:
sapply(gm_glog, function(x){ x[c("deviance", "aic")]})
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
deviance 17.52917 78.81847 239.01 553.7603 29.27955 58.71526 77.9131 147.4969 29.97461 39.20052 40.13341
aic -33.49309 -77.06459 -117.259 -389.6077 2.919589 -21.44068 11.57039 67.40446 -10.87137 31.8441 19.54028
[,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23]
deviance 198.8519 7.185649 38.97136 47.7754 80.16326 6.465192 11.35418 22.99457 83.80098 5.192405 8.945869 39.36833
aic -23.23857 7.554898 -16.0006 27.28793 63.8827 11.50956 43.5854 33.28914 58.52796 26.10081 28.88124 33.08681
[,24] [,25] [,26] [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35]
deviance 54.99003 7.045267 14.42835 26.74579 31.64986 1.670572 3.71758 24.23743 47.28533 0.2497075 12.76083 17.40761
aic 72.41119 3.920895 34.28885 24.2481 55.23406 15.1922 28.20926 44.49589 83.13905 11.19624 41.62632 37.05153
[,36] [,37] [,38] [,39] [,40]
deviance 35.25456 12.10367 9.070027 34.15762 29.88891
aic 65.23201 19.17986 34.25908 33.74274 71.36175