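I am struggling to get error bars into the correct position on a stacked bar chart. As I read in an earlier post, I used ddply to compute the positions of the error bars. That changed the stacking order, so I ordered the factor. Now the error bars appear to be correct on one set of bars but not on the other. What I want is a chart that looks like the one below, only with error bars showing the standard error. I have included the dput of both the original data and the ddply data, as well as the data set itself.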
Suz2$org <- factor(Suz2$org, levels = c('fungi','bacteria'),ordered = TRUE)
library(plyr)
library(ggplot2)
plydat <- ddply(Suz2,.(org, group, time),transform,ybegin = copy - se,yend = copy + se)
colvec <-c("blue", "orange")
ggplot(plydat, aes(time, copy)) +
geom_bar(aes(fill = factor(org)), stat="identity", width = 0.7) +
scale_fill_manual(values = colvec) +
facet_wrap(~group,nrow = 1)+
geom_errorbar(aes(ymin = ybegin, ymax = yend), width = .5) +
theme(panel.background = element_rect(fill='white', colour='white'),
panel.grid = element_line(color = NA),
panel.grid.minor = element_line(color = NA),
panel.border = element_rect(fill = NA, color = "black"),
axis.text.x = element_text(size=10, colour="black", face = "bold"),
axis.title.x = element_text(vjust=0.1, face = "bold"),
axis.text.y = element_text(size=12, colour="black"),
axis.title.y = element_text(vjust=0.2, size = 12, face = "bold"))
plydat
plydat <- data.frame(
org = ordered(rep(c("fungi", "bacteria"), each = 8L), levels = c("fungi", "bacteria")),
time = factor(rep(rep(c("0W", "6W"), 2), each = 4L)),
copy = c(
97800000, 15500000, 40200000, 10400000, 55100000, 14300000, 1.6e+07, 8640000,
2.98e+08, 77900000, 2.33e+08, 2.2e+08, 3.37e+08, 88400000, 3.24e+08, 1.89e+08
),
group = factor(rep(c("Notill D0", "Notill D707", "Native D0", "Native D707"), 4)),
se = c(
11100000, 2810000, 7110000, 2910000, 1.7e+07, 1500000, 1930000, 2980000,
43900000, 20100000, 56400000, 41200000, 75700000, 22500000, 57500000,
28100000
),
ybegin = c(
86700000, 12690000, 33090000, 7490000, 38100000, 12800000, 14070000, 5660000,
254100000, 57800000, 176600000, 178800000, 261300000, 65900000, 266500000,
160900000
),
yend = c(
108900000, 18310000, 47310000, 13310000, 72100000, 15800000, 17930000,
11620000, 341900000, 9.8e+07, 289400000, 261200000, 412700000, 110900000,
381500000, 217100000
)
)
Suz2
Suz2 <- data.frame(
org = ordered(rep(c("fungi", "bacteria"), each = 8L), levels = c("fungi", "bacteria")),
time = factor(rep(rep(c("0W", "6W"), 2), each = 4L)),
copy = c(
97800000, 15500000, 40200000, 10400000, 55100000, 14300000, 1.6e+07, 8640000,
2.98e+08, 77900000, 2.33e+08, 2.2e+08, 3.37e+08, 88400000, 3.24e+08, 1.89e+08
),
group = factor(rep(c("Notill D0", "Notill D707", "Native D0", "Native D707"), 4)),
se = c(
11100000, 2810000, 7110000, 2910000, 1.7e+07, 1500000, 1930000, 2980000,
43900000, 20100000, 56400000, 41200000, 75700000, 22500000, 57500000,
28100000
)
)
Suz2
        org time     copy       group       se
1     fungi   0W 9.78e+07   Notill D0 11100000
2     fungi   0W 1.55e+07 Notill D707  2810000
3     fungi   0W 4.02e+07   Native D0  7110000
4     fungi   0W 1.04e+07 Native D707  2910000
5     fungi   6W 5.51e+07   Notill D0 17000000
6     fungi   6W 1.43e+07 Notill D707  1500000
7     fungi   6W 1.60e+07   Native D0  1930000
8     fungi   6W 8.64e+06 Native D707  2980000
9  bacteria   0W 2.98e+08   Notill D0 43900000
10 bacteria   0W 7.79e+07 Notill D707 20100000
11 bacteria   0W 2.33e+08   Native D0 56400000
12 bacteria   0W 2.20e+08 Native D707 41200000
13 bacteria   6W 3.37e+08   Notill D0 75700000
14 bacteria   6W 8.84e+07 Notill D707 22500000
15 bacteria   6W 3.24e+08   Native D0 57500000
16 bacteria   6W 1.89e+08 Native D707 28100000
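The values of `ybegin` and `yend` (the range of the error bars) are too low for the `bacteria` data. Because the `bacteria` bars sit on top of the `fungi` bars, the height of the `fungi` bars (`plydat$copy[plydat$org == "fungi"]`) has to be added to the error bar values of the `bacteria` data.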
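# Shift the bacteria error bars up by the height of the fungi bar below them.
# This relies on the fungi and bacteria rows being in the same group/time order.
plydat[plydat$org == "bacteria", ] <- transform(
  plydat[plydat$org == "bacteria", ],
  ybegin = ybegin + plydat[plydat$org == "fungi", "copy"],
  yend = yend + plydat[plydat$org == "fungi", "copy"])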
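The row-wise fix above depends on the `fungi` and `bacteria` rows lining up in the same order. A minimal sketch of a variant that does not depend on row order is to take a running total of `copy` within each bar and centre the error bars on that total. The column name `ytop` is made up for this illustration, and the code assumes that `fungi` is drawn at the bottom of each stack. Which level ends up at the bottom can depend on the ggplot2 version and the factor order, so it is worth checking against the actual plot.

```r
# Sample rows taken from the question's data (one facet, both time points).
dat <- data.frame(
  org   = factor(rep(c("fungi", "bacteria"), each = 2),
                 levels = c("fungi", "bacteria")),
  time  = rep(c("0W", "6W"), 2),
  group = "Notill D0",
  copy  = c(97800000, 55100000, 2.98e+08, 3.37e+08),
  se    = c(11100000, 17000000, 43900000, 75700000)
)

# Assumption: fungi sits at the bottom of each stack, so sort fungi first
# within every group/time combination before accumulating.
dat <- dat[order(dat$group, dat$time, dat$org), ]

# Running total of copy within each bar = top edge of each stacked segment
# (ytop is a hypothetical helper column).
dat$ytop <- ave(dat$copy, dat$group, dat$time, FUN = cumsum)

# Centre the error bars on the top edge of the segment they belong to.
dat$ybegin <- dat$ytop - dat$se
dat$yend   <- dat$ytop + dat$se
```

For the `fungi` rows `ytop` equals `copy`, so their error bars are unchanged; only the `bacteria` rows are shifted upwards. The resulting `ybegin` and `yend` columns can be passed to `geom_errorbar` as in the question.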
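Personally, I am not very fond of stacked bar charts, especially when the number of stacked bars is large (which is not the case here). The main problem is that every stack except the lowest one lacks a common baseline. In your case, the orange `bacteria` class is hard to compare because its bars do not start from the same base (y value, `copy`).
I would suggest using a plot called a dot plot: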
library(ggplot2)
theme_set(theme_bw())
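# Nothing is stacked here, so use plydat with ybegin = copy - se and
# yend = copy + se (without the stacking offset).
ggplot(plydat, aes(time, copy, color = org)) +
geom_point() + facet_wrap(~group, ncol = 1) +
geom_errorbar(aes(ymin = ybegin, ymax = yend), width = 0) + coord_flip()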
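Note that the `copy` values here are not additive as they are in the stacked bar chart. Because they share the same base `copy` value (0), you can easily compare the different `bacteria` values. In addition, I swapped the x and y axes to make the `copy` values easier to compare (just remove `coord_flip` to see how well the comparison of `copy` works without it).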
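The only real drawback is that there is no easy way to judge the sum of `fungi` and `bacteria`. Depending on what the chart is meant to show (the story of the chart), this may or may not be a problem. To address it, you could add one more category to `org`, called `both`, that is the sum of the two categories. Of course, the error for this summed category needs careful interpretation.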
Combining the answers above, I think I will go with something like this.
plydat <- ddply(Suz2,.(org),transform,ybegin = copy - se,yend = copy + se)
colvec <-c("blue", "orange")
ggplot(plydat, aes(time, copy, color = factor(org))) +
geom_point(size = 3.5) + facet_wrap(~group, ncol = 4) +
scale_color_manual(values = colvec) +
geom_errorbar(aes(ymin = ybegin, ymax = yend), width = 0.08,
color = "black", size = 0.1) +
theme(panel.background = element_rect(fill='white', colour='white'),
panel.grid = element_line(color = NA),
panel.grid.minor = element_line(color = NA),
panel.border = element_rect(fill = NA, color = "black"),
strip.background = element_blank(),
axis.text.x = element_text(size=10, colour="black", face = "bold"),
axis.title.x = element_text(vjust=0.1, face = "bold"),
axis.text.y = element_text(size=12, colour="black"),
axis.title.y = element_text(vjust=0.2, size = 12, face = "bold"))