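I have a dataset that records, week by week, the germination of seeds of different species (Species) in different soil types (Soil_origin). For each species I set up four replicates, in other words: for each species x soil type combination I have four pots, each containing ten seeds. I want to calculate the mean germination rate per week for each species x soil type combination (so a mean with n = 4).
I don't know how to include my dataset in this script...
I hope you can help me solve this!
Thank you for your time.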
### My dataset is too large, therefore here I only included the first 10 rows...
> dput(head(Data_bakken_R, 10))
structure(list(Soil_origin = c("P_c", "P_c", "P_c", "P_c", "P_c",
"P_c", "P_c", "P_c", "P_c", "P_c"), Species = c("C_flava", "C_flava",
"C_flava", "C_flava", "C_flacca", "C_flacca", "C_flacca", "C_flacca",
"C_panicea", "C_panicea"), Pot = c(1, 2, 3, 4, 1, 2, 3, 4, 1,
2), Date = structure(c(1685664000, 1685664000, 1685664000, 1685664000,
1685664000, 1685664000, 1685664000, 1685664000, 1685664000, 1685664000
), tzone = "UTC", class = c("POSIXct", "POSIXt")), Day_since_cooling = c(6,
6, 6, 6, 6, 6, 6, 6, 6, 6), Total_start = c(10, 10, 10, 10, 10,
10, 10, 10, 10, 10), Germinated = c(0, 0, 0, 0, 0, 0, 0, 0, 0,
0)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))
### Install packages (only needed once), then load them
install.packages(c("tidyverse", "zoo"))
library(tidyverse)  # attaches ggplot2 and dplyr
library(zoo)
### Plot per soil type
Data_bakken_R %>%
  mutate(Proportion_germination = Germinated / Total_start) %>%
  ggplot(aes(x = Day_since_cooling, y = Proportion_germination, color = Soil_origin)) +
  geom_line() +
  facet_grid(rows = vars(Species)) +
  labs(x = "Time (days)", y = "Proportion of germinated seeds") +
  theme_classic() +
  ylim(0.0, 1.0)
You could use group_by() together with week() (from lubridate):
Data_bakken_R %>%
  # week() comes from lubridate; calling it with the namespace avoids a missing library() call
  mutate(weeknum = lubridate::week(Date)) %>%
  group_by(Soil_origin, Species, Day_since_cooling, weeknum) %>%
  # pool the four pots: total germinated seeds over total seeds sown
  summarise(Proportion_germination = sum(Germinated) / sum(Total_start), .groups = "drop") %>%
  ggplot(aes(x = Day_since_cooling, y = Proportion_germination, color = Soil_origin)) +
  geom_line() +
  facet_grid(rows = vars(Species)) +
  labs(x = "Time (days)", y = "Proportion of germinated seeds") +
  theme_classic() +
  ylim(0.0, 1.0)
However, I don't know how to handle Day_since_cooling, because you didn't mention it in your question. That is why I also added this variable to the grouping.
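If what you want is literally the mean of the four per-pot proportions (n = 4), something along these lines might work. This is only a sketch on invented toy data: the `toy` data frame, the soil label `P_d`, and the germination counts are made up for illustration, and I am assuming each measurement date corresponds to one value of Day_since_cooling. The standard error column is an optional extra that the question did not ask for.

```r
library(dplyr)

# Hypothetical toy data: 1 species x 2 soils x 4 pots x 2 measurement days.
# expand.grid() varies the first column (Pot) fastest.
toy <- expand.grid(
  Pot = 1:4,
  Soil_origin = c("P_c", "P_d"),   # "P_d" is an invented soil label
  Day_since_cooling = c(6, 13),
  stringsAsFactors = FALSE
)
toy$Species <- "C_flava"
toy$Total_start <- 10
# Invented counts, in row order: (P_c, day 6), (P_d, day 6), (P_c, day 13), (P_d, day 13)
toy$Germinated <- c(0, 1, 0, 1,  1, 1, 2, 0,  4, 5, 3, 4,  6, 5, 7, 6)

pot_means <- toy %>%
  # proportion germinated in each individual pot
  mutate(Proportion_germination = Germinated / Total_start) %>%
  group_by(Species, Soil_origin, Day_since_cooling) %>%
  summarise(
    n_pots = n(),                                          # should be 4 per group
    mean_germination = mean(Proportion_germination),       # mean over the pots
    se_germination = sd(Proportion_germination) / sqrt(n()),
    .groups = "drop"
  )

print(pot_means)
```

Because every pot starts with the same number of seeds, this mean is identical to sum(Germinated) / sum(Total_start); the two would only differ if Total_start varied between pots. Checking that n_pots is 4 in every row is a quick way to spot missing or duplicated pots.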