Confidence intervals for a tibble [duplicate]

Question  Votes: 1  Answers: 2

I have a tibble; an example is shown below. It has seven predictors (V4 to V10) and nine outcomes (w1, w2, w3, mw, i1, i2, i3, mi, p2). I am trying to create confidence intervals for the outcomes in columns 2 (w1) through 10 (p2).

vars     w1     w2     w3     mw      i1      i2      i3      mi     p2
V4    0.084  0.017  0.061  0.054  22.800   4.570  16.700  14.700  0.367
V5    0.032  0.085  0.039  0.052   8.840  23.100  10.700  14.200  0.367
V6    0.026  0.066  0.022  0.038   7.030  18.000   6.070  10.400  0.367
V7    0.097  0.020  0.066  0.061  26.300   5.420  18.100  16.600  0.367
V8    0.048  0.071  0.043  0.054  13.100  19.300  11.800  14.700  0.367
V9    0.018  0.111  0.020  0.050   4.800  30.300   5.440  13.500  0.367
V10   0.053  0.020  0.103  0.058  14.300   5.330  28.000  15.900  0.367
V4    0.084  0.017  0.060  0.054  22.400   4.420  16.200  14.300  0.373
V5    0.032  0.072  0.036  0.047   8.630  19.300   9.760  12.500  0.373
V6    0.030  0.076  0.023  0.043   8.080  20.500   6.070  11.500  0.373
V7    0.080  0.021  0.087  0.063  21.500   5.720  23.300  16.800  0.373
V8    0.053  0.090  0.034  0.059  14.100  24.000   9.110  15.700  0.373
V9    0.016  0.101  0.025  0.048   4.410  27.100   6.790  12.800  0.373
V10   0.060  0.022  0.100  0.061  16.000   5.950  26.800  16.300  0.373

When I use group_by in dplyr on the grouping variable (vars) and run quantile on three of the outcomes (as a test), it does not do what I need. Instead of a confidence interval for each of the three outcomes, I get a single confidence interval per group, as shown below:

df1 %>%
  group_by(vars) %>%
  do(data.frame(t(quantile(c(.$w1, .$w2, .$w3), probs = c(0.025, 0.975)))))
# A tibble: 7 x 3
# Groups:   variables [7]
  variables  X2.5 X97.5
1 V10       0.0202 0.103 
2 V4        0.017  0.084 
3 V5        0.032  0.0834
4 V6        0.0221 0.0748
5 V7        0.0201 0.0958
6 V8        0.0351 0.0876
7 V9        0.0162 0.110 

In short, what I am looking for is something like the table below, where I get a confidence interval for each outcome.

          w1                    w2                    w3
vars  X2.5   X97.5     vars  X2.5   X97.5     vars  X2.5   X97.5
V10   0.020  0.103     V10   0.020  0.103     V10   0.020  0.103
V4    0.017  0.084     V4    0.017  0.084     V4    0.017  0.084
V5    0.032  0.083     V5    0.032  0.083     V5    0.032  0.083
V6    0.022  0.075     V6    0.022  0.075     V6    0.022  0.075
V7    0.020  0.096     V7    0.020  0.096     V7    0.020  0.096
V8    0.035  0.088     V8    0.035  0.088     V8    0.035  0.088
V9    0.016  0.110     V9    0.016  0.110     V9    0.016  0.110

Any pointers in the right direction would be greatly appreciated. I have searched StackOverflow but cannot find an answer that addresses what I am trying to do.

r dplyr confidence-interval tibble
2 Answers
1
vote

Here are two ways.

Base R.

aggregate(df1[-1], list(df1[[1]]), quantile, probs = c(0.025, 0.975))

Using the tidyverse.

library(dplyr)

df1 %>%
  group_by(vars) %>%
  mutate_at(vars(w1:p2), quantile, probs = c(0.025, 0.975))

Note that in the second way the output format is different: within each group, the first quantile (0.025) is in the first row and the second quantile (0.975) is in the last row.
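One caveat about the tidyverse version: mutate_at only accepts the two quantiles here because every group happens to have exactly two rows. A minimal sketch of an alternative that does not depend on group size is below. It assumes dplyr 1.0.0 or later (for across), and the names toy, ci, lwr and upr are made up for illustration; toy is a four-row subset of the question's data, not the full data set.

```r
library(dplyr)

# Toy subset of the question's data: groups V4 and V5, outcomes w1 and w2 only.
toy <- data.frame(
  vars = c("V4", "V5", "V4", "V5"),
  w1   = c(0.084, 0.032, 0.084, 0.032),
  w2   = c(0.017, 0.085, 0.017, 0.072)
)

# One row per group; each outcome gets its own lower and upper quantile column.
ci <- toy %>%
  group_by(vars) %>%
  summarise(
    across(
      w1:w2,
      list(
        lwr = ~ quantile(.x, 0.025, names = FALSE),
        upr = ~ quantile(.x, 0.975, names = FALSE)
      )
    ),
    .groups = "drop"
  )

ci
```

The result has one row per value of vars and columns named w1_lwr, w1_upr, w2_lwr, w2_upr, which is close to the layout asked for in the question. On the full data, across(w1:p2, ...) should cover all nine outcomes.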

Data.

df1 <-
structure(list(vars = structure(c(2L, 3L, 4L, 
5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L), 
.Label = c("V10", "V4", "V5", "V6", "V7", "V8", 
"V9"), class = "factor"), w1 = c(0.084, 0.032, 
0.026, 0.097, 0.048, 0.018, 0.053, 0.084, 
0.032, 0.03, 0.08, 0.053, 0.016, 0.06), 
w2 = c(0.017, 0.085, 0.066, 0.02, 0.071, 0.111, 
0.02, 0.017, 0.072, 0.076, 0.021, 0.09, 0.101, 
0.022), w3 = c(0.061, 0.039, 0.022, 0.066, 
0.043, 0.02, 0.103, 0.06, 0.036, 0.023, 0.087, 
0.034, 0.025, 0.1), mw = c(0.054, 0.052, 0.038, 
0.061, 0.054, 0.05, 0.058, 0.054, 0.047, 0.043, 
0.063, 0.059, 0.048, 0.061), i1 = c(22.8, 8.84, 
7.03, 26.3, 13.1, 4.8, 14.3, 22.4, 8.63, 8.08, 
21.5, 14.1, 4.41, 16), i2 = c(4.57, 23.1, 18, 5.42, 
19.3, 30.3, 5.33, 4.42, 19.3, 20.5, 5.72, 24, 27.1, 
5.95), i3 = c(16.7, 10.7, 6.07, 18.1, 11.8, 5.44, 
28, 16.2, 9.76, 6.07, 23.3, 9.11, 6.79, 26.8), 
mi = c(14.7, 14.2, 10.4, 16.6, 14.7, 13.5, 15.9, 
14.3, 12.5, 11.5, 16.8, 15.7, 12.8, 16.3), 
p2 = c(0.367, 0.367, 0.367, 0.367, 0.367, 0.367, 
0.367, 0.373, 0.373, 0.373, 0.373, 0.373, 0.373, 
0.373)), class = "data.frame", 
row.names = c(NA, -14L))

0
votes

Another possibility: melt/pivot to long format; compute the summaries; then cast/pivot back to wide format.

library(tidyverse)

# Long format: one row per (vars, outcome, value)
df2 <- (df1
  %>% pivot_longer(-vars, names_to = "outcome", values_to = "value")
  %>% group_by(vars, outcome)
  %>% summarise(lwr = quantile(value, 0.025), upr = quantile(value, 0.975))
)

# Back to wide format: one lwr_* and one upr_* column per outcome
df2 %>% pivot_wider(names_from = outcome, values_from = c(lwr, upr))

Unfortunately, the columns do not come out in the ideal order; I can't think of a quick fix (you can select the variables in the order you want ...
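On the column order: a possible fix, sketched here under the assumption that tidyr 1.2.0 or later is installed, is the names_vary argument of pivot_wider, which should keep the lower and upper bound of each outcome next to each other. The data frame df2_toy below is a hypothetical stand-in for df2 with illustrative values, and the names_glue pattern is just one naming choice.

```r
library(tidyr)

# Hypothetical long-format summary standing in for df2 (values are illustrative).
df2_toy <- data.frame(
  vars    = c("V4", "V4", "V5", "V5"),
  outcome = c("w1", "w2", "w1", "w2"),
  lwr     = c(0.084, 0.017, 0.032, 0.072),
  upr     = c(0.084, 0.017, 0.032, 0.085)
)

wide <- pivot_wider(
  df2_toy,
  names_from  = outcome,
  values_from = c(lwr, upr),
  names_glue  = "{outcome}_{.value}",  # name columns w1_lwr rather than lwr_w1
  names_vary  = "slowest"              # keep lwr/upr of the same outcome adjacent
)

names(wide)
```

With the default names_vary = "fastest", all lwr columns come first and all upr columns after them, which is the ordering the answer describes as not ideal.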
