Add the number of observations per group and subgroup to a ggplot2 boxplot

Question · Votes: 0 · Answers: 2

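This may look like a duplicate of this question, but I actually want to extend the original question.

I want to annotate a boxplot in ggplot with the number of observations in each group and subgroup. Following the example from the original post, here is my minimal example: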

require(ggplot2)

give.n <- function(x){
  return(c(y = median(x)*1.05, label = length(x))) 
  # experiment with the multiplier to find the perfect position
}

ggplot(mtcars, aes(factor(cyl), mpg, fill = factor(gear))) +
  geom_boxplot() +
  stat_summary(fun.data = give.n, geom = "text")

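My problem is that the sample counts are all lined up at the center of each group, instead of being drawn on the boxplot they belong to (as shown in the figure below):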

r ggplot2 boxplot
2 Answers
3
votes

Is this what you want?

require(ggplot2)

give.n <- function(x){
  return(c(y = median(x)*1.05, label = length(x))) 
  # experiment with the multiplier to find the perfect position
}

ggplot(mtcars, aes(factor(cyl), mpg, fill = factor(gear))) +
  geom_boxplot() +
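  # fun.y is dropped here: it is deprecated in recent ggplot2 versions and
  # is not needed when fun.data is supplied.
  stat_summary(fun.data = give.n, geom = "text",
               position = position_dodge(width = 0.75))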
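
The dodge width of 0.75 seems to work because it matches the default box width that geom_boxplot uses, so the labels are shifted by the same amount as the boxes. If you would rather see the numbers before they are drawn, here is a minimal sketch of an alternative that precomputes the counts with base R and draws them with geom_text. The name label_data and the 1.05 multiplier are only illustrative assumptions, not part of the original answer.

```r
library(ggplot2)

# Count and median of mpg for every cyl/gear combination (base R only).
label_data <- aggregate(
  mpg ~ cyl + gear, data = mtcars,
  FUN = function(x) c(n = length(x), med = median(x))
)

# aggregate() stores the two statistics in a matrix column;
# flatten it into ordinary columns and give them plain names.
label_data <- do.call(data.frame, label_data)
names(label_data) <- c("cyl", "gear", "n", "med")

# Draw the boxplot, then place each count slightly above its median,
# dodged by the same width as in the answer above.
p <- ggplot(mtcars, aes(factor(cyl), mpg, fill = factor(gear))) +
  geom_boxplot() +
  geom_text(
    data = label_data,
    aes(y = med * 1.05, label = n, group = factor(gear)),
    position = position_dodge(width = 0.75)
  )

print(p)
```

Printing label_data first is a quick way to confirm that each label belongs to the subgroup you expect.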


0
votes

If anyone else has trouble positioning the text in a suitable place, here is my modification of @MLavoie's answer:

require(ggplot2)

give.n <- function(x){
  
  # Calculate the third quantile (q3) and the distance between the median and
  # q3:
  q3 <- quantile( x, probs = c(0.75), names = FALSE )
  distance_between_median_and_q3 <- ( q3 - median(x))
  
  # If the distance between the median and 3rd quartile are large enough, place
  # text halfway between the median and 3rd quartile:
  if( distance_between_median_and_q3 > 0.8 ){
    return( c( 
      y = median(x) + (q3 - median(x))/2
      , label = length(x) )) 
  } else{
    # If the distance is too small, either:
    
    # 1) place text above the upper whisker *as long as* IQR > 0,
    if(IQR(x) > 0 ){
      # Largest value that is not an outlier, i.e. within q3 + 1.5 * IQR:
      upper_whisker <- max( x[ x <= (q3 + 1.5 * IQR(x)) ])
      
      return( c( 
        y = upper_whisker * 1.03
        , label = length(x) )) 
    } else{
      # or 
      # 2) place text above median
      return( c( 
        y = median(x) * 1.03
        , label = length(x) )) 
    }
  }
}

ggplot(mtcars, aes(factor(cyl), mpg, fill = factor(gear))) +
  geom_boxplot() +
  stat_summary( fun.data = give.n
                , geom = "text"
                # , fun.y = median
                , position = position_dodge( width = 0.75 ) 
  )

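Note that you may need to experiment with some of the values or code in the give.n function to make it work for your data. But as you can see, give.n can be made very flexible.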
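
As one example of that flexibility, here is a minimal sketch of a variant that returns a one-row data frame instead of a named vector. That way the label can be a string such as "n = 12" without y being coerced to character. The name give.n.label and the label format are hypothetical choices for illustration.

```r
library(ggplot2)

# Hypothetical variant of give.n: returning a data frame keeps y numeric
# even though the label is a character string.
give.n.label <- function(x) {
  data.frame(
    y = median(x) * 1.05,               # slightly above the median
    label = paste0("n = ", length(x))   # text label with the group size
  )
}

p <- ggplot(mtcars, aes(factor(cyl), mpg, fill = factor(gear))) +
  geom_boxplot() +
  stat_summary(
    fun.data = give.n.label,
    geom = "text",
    position = position_dodge(width = 0.75)
  )

print(p)
```

With c(y = ..., label = ...) a string label would turn the whole vector into character, which is why this sketch uses data.frame().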
