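I'm working with a species dataset in presence/absence format, where samples were taken several times a day over a few days.
Here is a dummy version of the data, along with my attempt so far: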
dummy = structure(list(Sample = c("A1", "A1", "A1", "A2", "A2", "A2",
"B1", "B1", "B1", "B2", "B2", "B2"), Species = c("snuffles1",
"snuffles2", "snuffles3", "snuffles1", "snuffles2", "snuffles3",
"snuffles1", "snuffles2", "snuffles3", "snuffles1", "snuffles2",
"snuffles3"), Presence = c(1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1
), Day = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B",
"B", "B")), row.names = c(NA, -12L), class = c("tbl_df", "tbl",
"data.frame"))
ggplot(dummy[which(dummy$Presence>0),], aes(x = Day, y = Species, color = Species)) +
geom_point(alpha=0.5) +
geom_count(aes(size = sum(dummy$Presence)))
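I want to plot the data in ggplot so that the size of each point depends on the total number of observations within that group (i.e. if snuffles1 is observed 2 times on day A, the point should be size 2, whereas if snuffles1 is found once on day B, the point should be size 1). I hope that makes sense? The question "counting presence/absence based on group" is similar, but not quite what I need.
My guess is that I need some kind of function that counts the observations of each species, depending on which variable I'm considering, but I can't work out how to do that.
Thanks for any suggestions.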
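Do the counting by group as a separate step, then use geom_point.
I added breaks in scale_size so that the legend only shows the sizes that actually occur.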
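library(tidyverse)
# Sum presences per Day/Species and drop combinations that were never observed
count_dum <- dummy %>%
  group_by(Day, Species) %>%
  summarise(count = sum(Presence), .groups = "drop") %>%
  filter(count > 0)
# Point size follows the per-group count; breaks keep only sizes that occur
ggplot(count_dum, aes(x = Day, y = Species, color = Species)) +
  geom_point(aes(size = count), alpha = 0.5) +
  scale_size_continuous(breaks = sort(unique(count_dum$count)))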
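As a possible alternative, here is a minimal sketch that lets geom_count() do the counting. It assumes Presence only ever takes the values 0 or 1, so that after filtering to presences every remaining row is exactly one observation. The names `present`, `counts` and `p` are illustrative, and the data frame is rebuilt here only to keep the example self-contained.

```r
library(ggplot2)

# Same values as the dummy data in the question, rebuilt as a plain data frame
dummy <- data.frame(
  Sample   = rep(c("A1", "A2", "B1", "B2"), each = 3),
  Species  = rep(c("snuffles1", "snuffles2", "snuffles3"), times = 4),
  Presence = c(1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1),
  Day      = rep(c("A", "B"), each = 6)
)

# Keep only the rows where the species was actually observed
present <- dummy[dummy$Presence > 0, ]

# Tabulate observations per Day/Species to check what the plot will show
counts <- as.data.frame(table(Day = present$Day, Species = present$Species))

# geom_count() sizes each point by the number of rows at that x/y position
p <- ggplot(present, aes(x = Day, y = Species, color = Species)) +
  geom_count(alpha = 0.5) +
  scale_size_continuous(breaks = sort(unique(counts$Freq[counts$Freq > 0])))
```

Printing `p` draws the plot. If Presence could hold values other than 0 and 1 (for example abundances), this shortcut would no longer apply, and summing per group before plotting would be the safer route.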