How to vary boxplot width when using custom quantiles?


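I want to set the boxplot whiskers to the median +/- 1.96 * standard deviation of the data (often described as the 95% range of the data). I do this by using aggregate to compute the boxplot statistics and passing them in as the minimum, lower quartile, median, and so on. How can I set the box width so that it is proportional to the square root of the number of observations (as ggplot does with varwidth = TRUE)? Everything I have tried so far (setting weight, width) changes the width of all categories equally. Thank you.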

rm(list = ls())
library(ggplot2)

set.seed(1)

residuals <- runif(n=1000, min=-3, max=3)
category <- c('A','A','A','B','B','C','D','E','E','F')
df1 <- data.frame(category,residuals)


boxplot_stats <- aggregate(residuals ~ category, df1, function(x) {
  median_val = median(x)
  z_score = 1.96
  min_quantile = median_val - z_score * sd(x)
  lower_quantile = quantile(x, c(0.25))
  upper_quantile = quantile(x, c(0.75))
  max_quantile = median_val + z_score * sd(x)
  n_obs_sqrt = sqrt(length(x))
  c(min_quantile, lower_quantile, median_val, upper_quantile, max_quantile, n_obs_sqrt)
})

custom_boxplot <- ggplot(boxplot_stats, aes(x=category))+
  geom_boxplot(aes(ymin = residuals[, 1], lower = residuals[, 2], middle = residuals[, 3], upper = residuals[, 4], ymax = residuals[, 5]), stat = "identity", color = "black",fill="lightblue") +
  labs(title="boxplot",x="Category",y="Residuals") + 
  theme_bw()
print(custom_boxplot)
r ggplot2 plot width boxplot
1 Answer

I think you are on the right track. First, count the number of observations in each category:

library(dplyr)  # provides %>%, group_by(), summarise() and n()
df1 %>% group_by(category) %>% summarise(n = n())

Then pass them to the width argument:

custom_boxplot <- ggplot(boxplot_stats, aes(x=category))+
  geom_boxplot(aes(ymin = residuals[, 1], lower = residuals[, 2], middle = residuals[, 3], 
                   upper = residuals[, 4], ymax = residuals[, 5]), 
               stat = "identity", color = "black",fill="lightblue",
               # counts for categories A-F; dividing by the largest keeps every
               # width at or below 1 x-axis unit so neighbouring boxes do not overlap
               width = sqrt(c(300,200,100,100,200,100)) / sqrt(300)) +
  labs(title="boxplot",x="Category",y="Residuals") + 
  theme_bw()
print(custom_boxplot)

I suspect there is a more programmatic way to do this than hard-coding the counts, but hopefully this gives you a start.
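One possible way to avoid the hard-coded counts is to keep the group size next to the other summary statistics and derive the widths from it. The following is only a sketch under a few assumptions: the names box_df, box_width and max_width are invented for illustration, the 0.9 cap is an arbitrary choice, and the widths are matched to the boxes by row order, so the summary rows must stay in the same order as the categories on the x axis.

```r
library(ggplot2)

set.seed(1)
residuals <- runif(n = 1000, min = -3, max = 3)
category <- c('A','A','A','B','B','C','D','E','E','F')
df1 <- data.frame(category, residuals)

# One summary row per category: custom whiskers, quartiles, median and group size.
stats_list <- lapply(split(df1$residuals, df1$category), function(x) {
  data.frame(
    ymin   = median(x) - 1.96 * sd(x),
    lower  = quantile(x, 0.25, names = FALSE),
    middle = median(x),
    upper  = quantile(x, 0.75, names = FALSE),
    ymax   = median(x) + 1.96 * sd(x),
    n      = length(x)
  )
})
box_df <- do.call(rbind, stats_list)
box_df$category <- names(stats_list)

# Width proportional to sqrt(n), rescaled so the largest group gets max_width.
# max_width = 0.9 is an assumption chosen to leave a gap between neighbouring boxes.
max_width <- 0.9
box_df$box_width <- max_width * sqrt(box_df$n) / max(sqrt(box_df$n))

scaled_boxplot <- ggplot(box_df, aes(x = category)) +
  geom_boxplot(aes(ymin = ymin, lower = lower, middle = middle,
                   upper = upper, ymax = ymax),
               stat = "identity", color = "black", fill = "lightblue",
               width = box_df$box_width) +  # one width per row, in row order
  labs(title = "boxplot", x = "Category", y = "Residuals") +
  theme_bw()
print(scaled_boxplot)
```

Because the widths are computed from box_df itself, they follow the data if the group sizes change. Only the ratios between the widths carry meaning; the overall scale is set by max_width.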
