Calculating standard deviation for a stacked bar chart


I want to calculate the standard deviation and standard error so that I can display error bars on a stacked bar chart.

 Management    Habitat   Intensity     Var2   
   A           Urban        High        6   
   A          Farmland      High        9   
   A          Farmland      Medium     10 
   B          Forest        Medium     17 
   B          Peatland      Medium     23     
   C          Peatland      Low        22    
   C          Urban         Low        10     

My code for the stacked bar chart is:

 ggplot(df, aes(fill=Habitat, y= Var1, x=Intensity)) + 
  geom_bar(position="stack", stat="identity")+
  labs(y = "Area of habitat (hectares)")+
  theme(legend.title = element_text())

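I tried using the ddply function to calculate the standard deviation and standard error of Var2 by Intensity, to give an overall error for each bar, and then set the ymin and ymax limits, but I get an error: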

Error: Aesthetics must be either length 1 or the same as the data (96): ymax and ymin

EB<-ddply(Mean_PFB, c("Intensity"), summarise,
      N    = length(Var2),
      mean = mean(Var2),
      sd   = sd(Var2),
      se   = sd / sqrt(N))
r standard-deviation standard-error
1 Answer

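Is this your complete dataset? If so, a standard deviation or standard error cannot be calculated, because the data are not properly replicated. See below: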

library(tidyverse)
#> Warning: package 'tidyr' was built under R version 3.6.2
#> Warning: package 'dplyr' was built under R version 3.6.2

df <- read.table(text = "Management    Habitat   Intensity     Var2   
           A          Urban         High        6   
           A          Farmland      High        9   
           A          Farmland      Medium     10 
           B          Forest        Medium     17 
           B          Peatland      Medium     23     
           C          Peatland      Low        22    
           C          Urban         Low        10", header=T)

# Mean and mean +/- 2 SD per group (mean_sdl() defaults to mult = 2)
df %>% 
  group_by(Habitat) %>% 
  summarise(new = list(mean_sdl(Var2))) %>% 
  unnest(new)
#> # A tibble: 4 x 4
#>   Habitat      y   ymin  ymax
#>   <fct>    <dbl>  <dbl> <dbl>
#> 1 Farmland   9.5   8.09  10.9
#> 2 Forest    17   NaN    NaN  
#> 3 Peatland  22.5  21.1   23.9
#> 4 Urban      8     2.34  13.7

df %>% 
  group_by(Management) %>% 
  summarise(new = list(mean_sdl(Var2))) %>% 
  unnest(new)
#> # A tibble: 3 x 4
#>   Management     y   ymin  ymax
#>   <fct>      <dbl>  <dbl> <dbl>
#> 1 A           8.33  4.17   12.5
#> 2 B          20    11.5    28.5
#> 3 C          16    -0.971  33.0

df %>% 
  group_by(Intensity) %>% 
  summarise(new = list(mean_sdl(Var2))) %>% 
  unnest(new)
#> # A tibble: 3 x 4
#>   Intensity     y   ymin  ymax
#>   <fct>     <dbl>  <dbl> <dbl>
#> 1 High        7.5  3.26   11.7
#> 2 Low        16   -0.971  33.0
#> 3 Medium     16.7  3.65   29.7

# The same calculation grouped by both Intensity and Habitat
# returns NaN for ymin and ymax, because each combination has only one observation
df %>% 
  group_by(Intensity, Habitat) %>% 
  summarise(new = list(mean_sdl(Var2))) %>% 
  unnest(new)
#> # A tibble: 7 x 5
#> # Groups:   Intensity [3]
#>   Intensity Habitat      y  ymin  ymax
#>   <fct>     <fct>    <dbl> <dbl> <dbl>
#> 1 High      Farmland     9   NaN   NaN
#> 2 High      Urban        6   NaN   NaN
#> 3 Low       Peatland    22   NaN   NaN
#> 4 Low       Urban       10   NaN   NaN
#> 5 Medium    Farmland    10   NaN   NaN
#> 6 Medium    Forest      17   NaN   NaN
#> 7 Medium    Peatland    23   NaN   NaN

The same applies to the standard error; just use mean_se instead of mean_sdl.
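As a cross-check, the standard error can also be computed by hand, mirroring the formula in the question's ddply call (se = sd / sqrt(N)). The following is only a sketch using the seven rows posted in the question; the object name `eb` is made up for illustration.

```r
library(dplyr)

# The seven rows posted in the question
df <- read.table(text = "Management Habitat Intensity Var2
A Urban    High   6
A Farmland High   9
A Farmland Medium 10
B Forest   Medium 17
B Peatland Medium 23
C Peatland Low    22
C Urban    Low    10", header = TRUE)

# One summary row per Intensity level: N, mean, SD and SE = SD / sqrt(N)
eb <- df %>%
  group_by(Intensity) %>%
  summarise(
    N    = n(),
    mean = mean(Var2),
    sd   = sd(Var2),
    se   = sd / sqrt(N)
  )

eb

# mean_se() from ggplot2 reports the mean and mean +/- 1 SE for one group
ggplot2::mean_se(df$Var2[df$Intensity == "High"])
```

With only two or three values per Intensity level, these standard errors are very imprecise, and any group with a single observation would give NA.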

Created on 2020-04-27 by the reprex package (v0.3.0)
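Regarding the error in the question: it most likely arises because the ymin and ymax values come from a summary with one row per Intensity, while the plot data has 96 rows, and ggplot2 needs each aesthetic to have length 1 or one value per row of the layer's data. A possible way around it, sketched here on the seven posted rows, is to give geom_errorbar its own summary data frame. Centring the bars on the stack total (`total`) and using the SE of the individual values is an assumption made purely to show the mechanics; whether that is the right error measure depends on the study design. The names `eb`, `total` and `p` are hypothetical.

```r
library(dplyr)
library(ggplot2)

# The seven rows posted in the question
df <- read.table(text = "Management Habitat Intensity Var2
A Urban    High   6
A Farmland High   9
A Farmland Medium 10
B Forest   Medium 17
B Peatland Medium 23
C Peatland Low    22
C Urban    Low    10", header = TRUE)

# Summary with one row per bar: stack height and SE of the values in it
# (assumption: error bars are centred on the stack total)
eb <- df %>%
  group_by(Intensity) %>%
  summarise(
    total = sum(Var2),
    se    = sd(Var2) / sqrt(n())
  )

p <- ggplot(df, aes(x = Intensity, y = Var2, fill = Habitat)) +
  geom_col(position = "stack") +
  # The error bar layer uses its own data and aesthetics, so the lengths match
  geom_errorbar(
    data = eb,
    aes(x = Intensity, ymin = total - se, ymax = total + se),
    inherit.aes = FALSE,
    width = 0.2
  ) +
  labs(y = "Area of habitat (hectares)")

p
```

Setting inherit.aes = FALSE stops the error bar layer from looking for Var2 and Habitat, which do not exist in the summary data frame.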
