I am having trouble accessing specific classes in a column. My data frame is as follows.
library(ggplot2)
library(dplyr)
dat <- data.frame(
time = factor(c("Breakfast","Breakfast","Breakfast","Breakfast","Breakfast","Lunch","Lunch","Lunch","Lunch","Lunch","Lunch","Dinner","Dinner","Dinner","Dinner","Dinner","Dinner","Dinner"), levels=c("Breakfast","Lunch","Dinner")),
class = c("a","a","b","b","c","a","b","b","c","c","c","a","a","b","b","b","c","c"))
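In the time column, I am only interested in Breakfast and Dinner, for the classes a, b and c. So from this data frame, I would like to see just that subset as a table, which would look like this: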
          a b c
Breakfast 2 2 1
Dinner    2 3 2
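So for each of the classes a, b and c, I want to draw two bars. For example, for class a, one bar represents its share of Breakfast relative to the other classes, 2/(2+2+1), and the other bar represents its share of Dinner relative to the other classes, 2/(2+3+2), with the two bars in different colors. I want to do the same for class b and class c.
Any help would be greatly appreciated.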
We can subset the rows, drop the unused levels with droplevels, and then tabulate with table:
table(droplevels(subset(dat, time %in% c("Breakfast", "Dinner"))))
#            class
# time        a b c
#   Breakfast 2 2 1
#   Dinner    2 3 2
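The shares described in the question are the row-wise proportions of this table. A minimal sketch, assuming the same dat as in the question (rebuilt here with rep() so the snippet runs on its own); the names tab and props are only illustrative:

```r
# Rebuild the example data from the question
dat <- data.frame(
  time = factor(rep(c("Breakfast", "Lunch", "Dinner"), times = c(5, 6, 7)),
                levels = c("Breakfast", "Lunch", "Dinner")),
  class = c("a","a","b","b","c", "a","b","b","c","c","c",
            "a","a","b","b","b","c","c"))

# Counts for Breakfast and Dinner only (the unused Lunch level is dropped)
tab <- table(droplevels(subset(dat, time %in% c("Breakfast", "Dinner"))))

# margin = 1 divides each count by its row total,
# e.g. 2 / (2 + 2 + 1) for class a at Breakfast
props <- prop.table(tab, margin = 1)
round(props, 3)
```

Each row of props sums to 1, so the values can be compared across Breakfast and Dinner even though the two meals have different totals.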
If we need a barplot:
barplot(prop.table(table(droplevels(subset(dat, time %in%
c("Breakfast", "Dinner")))), 1), beside = TRUE)
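Since the question also asks for the two bars in different colors, here is a hedged variant of the same base R call with explicit colors and a legend. The color names, axis labels and the variable names props and bar_mids are arbitrary choices for illustration, not part of the original answer:

```r
# Rebuild the example data from the question
dat <- data.frame(
  time = factor(rep(c("Breakfast", "Lunch", "Dinner"), times = c(5, 6, 7)),
                levels = c("Breakfast", "Lunch", "Dinner")),
  class = c("a","a","b","b","c", "a","b","b","c","c","c",
            "a","a","b","b","b","c","c"))

# Row-wise proportions for Breakfast and Dinner
props <- prop.table(
  table(droplevels(subset(dat, time %in% c("Breakfast", "Dinner")))), 1)

# beside = TRUE draws one group per class (column), one bar per time (row);
# col is recycled within each group, so each time gets its own color
bar_mids <- barplot(props, beside = TRUE,
                    col = c("steelblue", "tomato"),
                    legend.text = rownames(props),
                    xlab = "class", ylab = "proportion within time")
```

barplot returns the bar midpoints (here a 2 x 3 matrix), which can be passed to text() if value labels are wanted on top of the bars.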
Or with ggplot:
library(dplyr)
library(ggplot2)
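dat %>%
  # keep only the two meals of interest and drop the unused Lunch level
  filter(time %in% c("Breakfast", "Dinner")) %>%
  droplevels() %>%
  # one row per time/class combination with its count n
  count(time, class) %>%
  # proportion of each class within its time
  group_by(time) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup() %>%
  ggplot(aes(x = class, y = prop, fill = time, label = scales::percent(prop))) +
  # side-by-side bars, one color per time
  geom_col(position = 'dodge') +
  geom_text(position = position_dodge(width = 0.9), vjust = 0.5, size = 3) +
  scale_y_continuous(labels = scales::percent)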