Ratio of two histograms from the same variable

Question (votes: 0, answers: 1)

Consider this minimal example data frame:

df <- data.frame(lab1 = c(rep("no", 10), rep("yes", 20)),
                 var1 = c(3,6,3,3,3,4,5,6,3,6,2,3,4,3,2,3,9,9,8,7,6,7,8,9,9,8,7,6,5,1)
                )

From this, we can easily plot a histogram, for example:

library(ggplot2)

p <- ggplot(df, aes(x = var1, fill = lab1)) +
  geom_histogram(position = 'dodge', bins = 20)

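What I would now like to do is add a line plot on top of it that shows, for each bin (using the same bin size as the histogram), the percentage of "no" counts, i.e. 100 * no / (yes + no). This percentage should then be displayed on a secondary axis.

Is there a way to do this?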

r ggplot2
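1 Answer (1 vote)

library(dplyr)
library(ggplot2)

# Percentage of "no" rows at each value of var1
df_sum <- df %>%
  group_by(var1) %>%
  summarize(no_pct = 100 * sum(lab1 == "no") / n())

# Divide the percentage by 10 so it fits on the count axis;
# the secondary axis multiplies by 10 to restore the percent scale
p <- ggplot(df,
            aes(x = var1, fill = lab1)) +
  geom_histogram(position = 'dodge', bins = 20) +
  geom_line(data = df_sum, aes(var1, no_pct / 10), inherit.aes = FALSE) +
  scale_y_continuous(sec.axis = ~ . * 10)
p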
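
To see which values the line is drawn from, the per-value percentages can be checked without any plotting. The following is a minimal base R sketch that rebuilds the question's df and cross-tabulates it; the names counts and no_pct are only illustrative and not part of the original answer.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  lab1 = c(rep("no", 10), rep("yes", 20)),
  var1 = c(3,6,3,3,3,4,5,6,3,6,2,3,4,3,2,3,9,9,8,7,6,7,8,9,9,8,7,6,5,1)
)

# Rows are the distinct var1 values, columns are "no" and "yes"
counts <- table(df$var1, df$lab1)

# 100 * no / (yes + no) for each var1 value
no_pct <- 100 * counts[, "no"] / rowSums(counts)
round(no_pct, 1)
```

For var1 == 3 there are 5 "no" rows and 3 "yes" rows, so the line passes through 100 * 5 / 8 = 62.5 on the secondary axis at that value.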


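Edit: added an alternative binning approach

You might consider doing the binning upstream of ggplot, which makes it easier to compute summary statistics for the bins: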

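library(dplyr)
library(ggplot2)
binwidth <- 1

# Assign each row to the lower edge of its bin and count rows per bin and label
df_bin <- df %>%
  count(var1 = floor(var1/binwidth)*binwidth, lab1)
# Percentage of "no" rows within each bin
df_sum <- df_bin %>%
  group_by(var1) %>%
  summarize(no_pct = 100 * sum(n * (lab1 == "no")) / sum(n))

# Bars come from the pre-binned counts; the line is rescaled by 10
# to fit the count axis and restored on the secondary axis
ggplot() +
  geom_col(data = df_bin, aes(var1, n, fill = lab1),
           position = position_dodge(preserve = "single")) +
  geom_line(data = df_sum, aes(var1, no_pct / 10)) +
  scale_y_continuous(sec.axis = ~ . * 10)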
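
The factor of 10 above is hard-coded to suit this particular data set. As a hedged variation, the factor could be derived from the data so that 100 % lines up with the tallest bar. In the sketch below, scale_factor and the axis names are assumptions introduced for illustration, and sec_axis() is spelled out explicitly instead of passing a bare formula.

```r
library(dplyr)
library(ggplot2)

# Rebuild the example data frame from the question
df <- data.frame(
  lab1 = c(rep("no", 10), rep("yes", 20)),
  var1 = c(3,6,3,3,3,4,5,6,3,6,2,3,4,3,2,3,9,9,8,7,6,7,8,9,9,8,7,6,5,1)
)
binwidth <- 1

# Count rows per bin and label, then compute the "no" percentage per bin
df_bin <- df %>%
  count(var1 = floor(var1 / binwidth) * binwidth, lab1)
df_sum <- df_bin %>%
  group_by(var1) %>%
  summarize(no_pct = 100 * sum(n * (lab1 == "no")) / sum(n))

# Illustrative choice: map 100 % onto the height of the tallest bar
scale_factor <- 100 / max(df_bin$n)

p <- ggplot() +
  geom_col(data = df_bin, aes(var1, n, fill = lab1),
           position = position_dodge(preserve = "single")) +
  geom_line(data = df_sum, aes(var1, no_pct / scale_factor)) +
  scale_y_continuous(
    name = "count",
    sec.axis = sec_axis(~ . * scale_factor, name = "% no")
  )
p
```

With this data the tallest bar has a count of 5, so scale_factor comes out as 20. Changing binwidth changes the counts, and the factor adjusts with them.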