Stacked bar chart with varying bar widths in ggplot

Question (votes: 3, answers: 1)

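I am trying to build a stacked bar chart with varying bar widths, so that the width shows the mean amount of the allocations and the height shows the number of allocations.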

Here is my reproducible data:

procedure = c("method1","method2", "method3", "method4","method1","method2", "method3", "method4","method1","method2", "method3","method4")
sector =c("construction","construction","construction","construction","delivery","delivery","delivery","delivery","service","service","service","service") 
number = c(100,20,10,80,75,80,50,20,20,25,10,4)
amount_mean = c(1,1.2,0.2,0.5,1.3,0.8,1.5,1,0.8,0.6,0.2,0.9) 

data0 = data.frame(procedure, sector, number, amount_mean)

When I use geom_bar and include width in aes, I get the following message:

position_stack requires non-overlapping x intervals

Furthermore, the bars are no longer stacked.

library(ggplot2)

bar <- ggplot(data = data0, aes(x = sector, y = number, fill = procedure, width = amount_mean)) +
  geom_bar(stat = "identity")

I also looked at the mekko package, but it seems that it only works for bar charts.

This is what I would like to end up with (not based on the data above):

Any idea how I can solve my problem?

Tags: r, ggplot2

1 answer (3 votes)

I tried the same thing with geom_col() and ran into the same problem: with position = "stack" it seems we cannot assign a width parameter without the bars becoming unstacked.

But it turns out the solution is quite simple: we can use geom_rect() to build such a plot "by hand".
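To illustrate the idea before applying it to the real data, here is a minimal sketch of geom_rect() drawing two stacked rectangles of different widths. The data frame `rects` and its values are made up purely for illustration; only the four aesthetics xmin, xmax, ymin and ymax are what geom_rect() actually needs.

```r
library(ggplot2)

# Hypothetical toy data: two rectangles stacked on top of each other,
# both centered on x = 1, the lower one wider than the upper one.
rects <- data.frame(
  xmin = c(0.5, 0.8),
  xmax = c(1.5, 1.2),
  ymin = c(0, 10),
  ymax = c(10, 15),
  part = c("a", "b")
)

# Each row becomes one rectangle; fill distinguishes the two parts.
p <- ggplot(rects, aes(xmin = xmin, xmax = xmax,
                       ymin = ymin, ymax = ymax, fill = part)) +
  geom_rect()
p
```

Because every rectangle carries its own bounds, no position adjustment is involved, which is presumably why the stacking problem does not arise here.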

Here is your data:

df = data.frame(
  procedure   = rep(paste0("method", 1:4), times = 3),
  sector      = rep(c("construction", "delivery", "service"), each = 4),
  amount      = c(100, 20, 10, 80, 75, 80, 50, 20, 20, 25, 10, 4),
  amount_mean = c(1, 1.2, 0.2, 0.5, 1.3, 0.8, 1.5, 1, 0.8, 0.6, 0.2, 0.9)
)

First, I transformed your dataset:

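library(dplyr)

df <- df %>%
  mutate(amount_mean = amount_mean / max(amount_mean),  # scale widths into (0, 1]
         # factor() first: as.numeric() on a character column would return NA
         sector_num = as.numeric(factor(sector))) %>%
  arrange(desc(amount_mean)) %>%                         # widest bars first (bottom of the stack)
  group_by(sector) %>%
  mutate(
    xmin = sector_num - amount_mean / 2,
    xmax = sector_num + amount_mean / 2,
    ymin = cumsum(lag(amount, default = 0)),
    ymax = cumsum(amount)) %>%
  ungroup()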

What I am doing here:

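  1. I scaled amount_mean down so that 0 < amount_mean <= 1 (this plots better, and we have no second scale to show the true values of amount_mean anyway);
  2. I also encoded the sector variable as a number (needed for plotting, see below);
  3. I arranged the dataset by decreasing amount_mean (heavy means go at the bottom, light means at the top);
  4. Grouping by sector, I calculated xmin and xmax to represent amount_mean, and ymin and ymax to represent amount. These are a bit tricky. ymax is the obvious one: you just take the cumulative sum of amount, starting from the first row. You also need a cumulative sum to calculate ymin, but starting from 0. So the first rectangle is drawn with ymin = 0, the second with ymin equal to the ymax of the previous rectangle, and so on. All of this is done separately within each sector group.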
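The cumulative sums in step 4 can be traced by hand for a single sector. The sketch below uses the four "construction" rows from the data above and plain base R; it computes ymin as ymax - amount, which gives the same values as cumsum(lag(amount, default = 0)) in the dplyr version.

```r
# Values for the "construction" sector, taken from the data above.
amount      <- c(100, 20, 10, 80)
amount_mean <- c(1, 1.2, 0.2, 0.5)

# Widest bar first, so that it ends up at the bottom of the stack.
ord    <- order(amount_mean, decreasing = TRUE)
amount <- amount[ord]      # 20, 100, 80, 10

ymax <- cumsum(amount)     # 20, 120, 200, 210
ymin <- ymax - amount      # 0, 20, 120, 200

data.frame(amount, ymin, ymax)
```

Each ymin equals the ymax of the row before it, so the rectangles sit directly on top of one another with no gaps.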

Plotting the data:

library(ggplot2)

df %>%
  ggplot(aes(xmin = xmin, xmax = xmax,
             ymin = ymin, ymax = ymax,
             fill = procedure)) +
  geom_rect() +
  # one axis break per sector, labelled with the sector name
  scale_x_continuous(breaks = unique(df$sector_num), labels = unique(df$sector)) +
  #ggthemes::theme_tufte() +
  theme_bw() +
  labs(title = "Question 51136471", x = "Sector", y = "Amount") +
  theme(axis.ticks.x = element_blank())

Result:

[image: pyramid_plot]

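Another option keeps the procedure variable from being reordered, so that, say, all the "reds" are at the bottom, the "greens" above them, and so on. But it looks ugly: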

df <- df %>%
  mutate(amount_mean = amount_mean / max(amount_mean),
         # factor() first: as.numeric() on a character column would return NA
         sector_num = as.numeric(factor(sector))) %>%
  # sort by procedure so every sector stacks its methods in the same order
  arrange(procedure, desc(amount), desc(amount_mean)) %>%
  group_by(sector) %>%
  mutate(
    xmin = sector_num - amount_mean / 2,
    xmax = sector_num + amount_mean / 2,
    ymin = cumsum(lag(amount, default = 0)),
    ymax = cumsum(amount)
    ) %>%
  ungroup()
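The answer does not show the plotting call for this second variant; presumably it is the same geom_rect() call as before. The following self-contained sketch runs the whole variant end to end under that assumption. The name `fixed_order` is introduced here for illustration only, and the sort is simplified to arrange(procedure), which is enough because each procedure occurs once per sector.

```r
library(dplyr)
library(ggplot2)

df <- data.frame(
  procedure   = rep(paste0("method", 1:4), times = 3),
  sector      = rep(c("construction", "delivery", "service"), each = 4),
  amount      = c(100, 20, 10, 80, 75, 80, 50, 20, 20, 25, 10, 4),
  amount_mean = c(1, 1.2, 0.2, 0.5, 1.3, 0.8, 1.5, 1, 0.8, 0.6, 0.2, 0.9)
)

# Same stacking order (method1 at the bottom) in every sector.
fixed_order <- df %>%
  mutate(amount_mean = amount_mean / max(amount_mean),
         sector_num  = as.numeric(factor(sector))) %>%
  arrange(procedure) %>%
  group_by(sector) %>%
  mutate(xmin = sector_num - amount_mean / 2,
         xmax = sector_num + amount_mean / 2,
         ymax = cumsum(amount),
         ymin = ymax - amount) %>%
  ungroup()

p <- ggplot(fixed_order, aes(xmin = xmin, xmax = xmax,
                             ymin = ymin, ymax = ymax, fill = procedure)) +
  geom_rect() +
  scale_x_continuous(breaks = unique(fixed_order$sector_num),
                     labels = unique(fixed_order$sector)) +
  theme_bw()
p
```

Since the widths no longer decrease from bottom to top, narrow rectangles can end up underneath wide ones, which is the "ugly" look mentioned above.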

[image: pyramid_plot_ugly]
