Consider this minimal example data frame:
df <- data.frame(
  lab1 = c(rep("no", 10), rep("yes", 20)),
  var1 = c(3,6,3,3,3,4,5,6,3,6,2,3,4,3,2,3,9,9,8,7,6,7,8,9,9,8,7,6,5,1)
)
From this, we can easily plot a histogram, for example:
library(ggplot2)
p <- ggplot(df,
            aes(x = var1, fill = lab1)) +
  geom_histogram(position = 'dodge', bins = 20)
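What I want to do now is add a line plot on top of it that shows, for each bin (using the same bin size as the histogram), the percentage of "no" counts (100 * no / (yes + no)). That percentage should then be displayed on a secondary axis.
Is there a way to do this?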
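library(dplyr)
library(ggplot2)
# Percentage of "no" among all rows sharing each var1 value
df_sum <- df %>%
  group_by(var1) %>%
  summarize(no_pct = 100 * sum(lab1 == "no") / n())
p <- ggplot(df,
            aes(x = var1, fill = lab1)) +
  geom_histogram(position = 'dodge', bins = 20) +
  # Divide by 10 so the 0-100 percentage fits on the count axis
  geom_line(data = df_sum, aes(var1, no_pct / 10), inherit.aes = FALSE) +
  # The secondary axis multiplies by 10 to undo that scaling
  scale_y_continuous(sec.axis = sec_axis(~ . * 10, name = "% no"))
p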
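As a quick sanity check on the percentages, the same numbers can be reproduced in base R without dplyr. This is only a sketch; the name no_pct_check is made up here and is not part of the answer's code. It takes the mean of the logical vector lab1 == "no" within each var1 value and multiplies by 100.

```r
# Same example data as in the question
df <- data.frame(
  lab1 = c(rep("no", 10), rep("yes", 20)),
  var1 = c(3,6,3,3,3,4,5,6,3,6,2,3,4,3,2,3,9,9,8,7,6,7,8,9,9,8,7,6,5,1)
)

# Share of "no" rows per var1 value, as a percentage.
# no_pct_check is a hypothetical name used only for this illustration.
no_pct_check <- 100 * tapply(df$lab1 == "no", df$var1, mean)

# var1 == 3 has 5 "no" and 3 "yes" rows, so 100 * 5 / 8 = 62.5
no_pct_check[["3"]]
```

Values of var1 that only ever appear with "yes" (1, 2, 7, 8, 9 in this data) come out as 0.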
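Edit: added an alternative that does the binning up front
You might consider doing the binning upstream of ggplot, which makes it easier to compute summary statistics per bin: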
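library(dplyr)
library(ggplot2)
binwidth <- 1
# Count rows per bin and label; each bin is labelled by its lower edge
df_bin <- df %>%
  count(var1 = floor(var1 / binwidth) * binwidth, lab1)
# Percentage of "no" within each bin
df_sum <- df_bin %>%
  group_by(var1) %>%
  summarize(no_pct = 100 * sum(n * (lab1 == "no")) / sum(n))
ggplot() +
  geom_col(data = df_bin, aes(var1, n, fill = lab1),
           position = position_dodge(preserve = "single")) +
  # Divide by 10 to fit the count axis; the secondary axis multiplies back
  geom_line(data = df_sum, aes(var1, no_pct / 10)) +
  scale_y_continuous(sec.axis = sec_axis(~ . * 10, name = "% no"))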
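The factor of 10 above is hard-coded for this particular data. One possible variation, sketched below under the assumption that you want 100% to line up with the tallest bar, is to derive the factor from the binned counts. The names scale_factor and the axis labels are made up for this illustration.

```r
library(dplyr)
library(ggplot2)

df <- data.frame(
  lab1 = c(rep("no", 10), rep("yes", 20)),
  var1 = c(3,6,3,3,3,4,5,6,3,6,2,3,4,3,2,3,9,9,8,7,6,7,8,9,9,8,7,6,5,1)
)

binwidth <- 1
df_bin <- df %>%
  count(var1 = floor(var1 / binwidth) * binwidth, lab1)
df_sum <- df_bin %>%
  group_by(var1) %>%
  summarize(no_pct = 100 * sum(n * (lab1 == "no")) / sum(n))

# Hypothetical helper value: how many percentage points one count unit
# represents, chosen so that 100% sits at the height of the tallest bar.
scale_factor <- 100 / max(df_bin$n)

p <- ggplot() +
  geom_col(data = df_bin, aes(var1, n, fill = lab1),
           position = position_dodge(preserve = "single")) +
  # Map percentages onto the count axis ...
  geom_line(data = df_sum, aes(var1, no_pct / scale_factor)) +
  # ... and apply the inverse transform for the secondary axis labels
  scale_y_continuous(name = "count",
                     sec.axis = sec_axis(~ . * scale_factor, name = "% no"))
p
```

The important part is that the division inside geom_line() and the multiplication inside sec_axis() use the same factor; otherwise the line and the secondary axis labels will disagree.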