Calculating confidence intervals for two variables and their mean difference within another categorical variable

Question

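I would like to compute confidence intervals for two variables, and for the mean difference between them, within each level of another categorical variable.

Specifically, I am interested in confidence intervals for p1, p2 and pdiff.

Thanks very much.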

library(tidyverse)

iris %>% 
  mutate(out1 = Sepal.Length < 6,
         out2 = Sepal.Length < 5) %>% 
  group_by(Species) %>%
  summarise(p1 = mean(out1),
            p2 = mean(out2),
            pdiff = p1 - p2)

# A tibble: 3 x 4
  Species       p1    p2 pdiff
  <fct>      <dbl> <dbl> <dbl>
1 setosa      1     0.4   0.6 
2 versicolor  0.52  0.02  0.5 
3 virginica   0.14  0.02  0.12

Answer 1

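One way to get the confidence intervals is with prop.test. You can run the test for each metric (p1, p2, pdiff), store each result in a list column, and then use map_dbl to extract the lower and upper bounds.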

library(tidyverse)

iris %>% 
  mutate(out1 = Sepal.Length < 6,
         out2 = Sepal.Length < 5) %>% 
  group_by(Species) %>%
  summarise(p1 = mean(out1),
            p2 = mean(out2),
            pdiff = p1 - p2,
            p1_test = list(prop.test(sum(out1), length(out1))),  # create tests for p1, p2 and diff and save the outputs as list
            p2_test = list(prop.test(sum(out2), length(out2))),
            pdiff_test = list(prop.test(c(sum(out1),sum(out2)), c(length(out1),length(out2)))),
            p1_low = map_dbl(p1_test, ~.$conf.int[1]),     # extract low and high confidence intervals based on the corresponding test
            p1_high = map_dbl(p1_test, ~.$conf.int[2]),
            p2_low = map_dbl(p2_test, ~.$conf.int[1]),
            p2_high = map_dbl(p2_test, ~.$conf.int[2]),
            pdiff_low = map_dbl(pdiff_test, ~.$conf.int[1]),
            pdiff_high = map_dbl(pdiff_test, ~.$conf.int[2])) %>%
  select(-matches("test"))                                         # remove test columns


# # A tibble: 3 x 10
#   Species       p1    p2 pdiff p1_low p1_high  p2_low p2_high pdiff_low pdiff_high
#   <fct>      <dbl> <dbl> <dbl>  <dbl>   <dbl>   <dbl>   <dbl>     <dbl>      <dbl>
# 1 setosa      1     0.4   0.6  0.911    1     0.267     0.548   0.444        0.756
# 2 versicolor  0.52  0.02  0.5  0.376    0.661 0.00104   0.120   0.336        0.664
# 3 virginica   0.14  0.02  0.12 0.0628   0.274 0.00104   0.120  -0.00371      0.244
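
To see what the map_dbl calls are extracting, it may help to run prop.test by hand for a single species. The following is a minimal base-R sketch, not part of the original answer: the variable names (x, k1, k2, manual_ci) are made up for illustration, and the manual formula is my reading of the continuity-corrected normal-approximation interval that prop.test reports for a difference of two proportions at the default 95% level.

```r
# Base R only; the iris data set ships with R.
x  <- iris$Sepal.Length[iris$Species == "versicolor"]
n  <- length(x)
k1 <- sum(x < 6)   # count behind p1
k2 <- sum(x < 5)   # count behind p2

# One-sample test: conf.int is a length-2 vector (lower, upper) for p1.
p1_test <- prop.test(k1, n)
p1_ci   <- p1_test$conf.int

# Two-sample test: conf.int is the interval for the difference p1 - p2.
pdiff_test <- prop.test(c(k1, k2), c(n, n))
pdiff_ci   <- pdiff_test$conf.int

# Assumed formula for the difference interval: normal approximation
# plus a continuity correction of 0.5 * (1/n1 + 1/n2).
p1 <- k1 / n
p2 <- k2 / n
se <- sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
half_width <- qnorm(0.975) * se + 0.5 * (1 / n + 1 / n)
manual_ci  <- c(p1 - p2 - half_width, p1 - p2 + half_width)

print(p1_ci)
print(pdiff_ci)
print(manual_ci)
```

The printed intervals should line up with the versicolor row of the table above. Note that the two-sample form of prop.test treats the two counts as coming from independent samples, whereas here out1 and out2 are computed on the same flowers, so the pdiff interval is probably best read as an approximation.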
