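I'm making some ggplot plots from data I've collected on UV mutagenesis in bacteria. I'm trying to fit a curve to the second plot (time vs. survival), and I think it doesn't like the fact that, because plates with <30 / >300 colonies aren't counted, only 3 data points are left. As a result, the plotted curve bends toward the final data point, when in reality survival would continue trending down toward 0%.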
install.packages("tidyverse")  # only needed once per R installation
library(tidyverse)
data <- read.csv("uv_mutagenesis.csv")
# Keep only countable plates (30 to 300 colonies)
filtered_data <- data %>%
  filter(colonies_per_plate >= 30, colonies_per_plate <= 300)
# Mean, SD and SEM per time point and dilution
mean_data <- filtered_data %>%
  group_by(time, dilution) %>%
  summarize(mean_colonies = mean(colonies_per_plate),
            sd_colonies = sd(colonies_per_plate),
            sem_colonies = sd(colonies_per_plate) / sqrt(n()))
# CFU/ml
plated_volume_ml <- 1
mean_data$CFU_per_ml <- mean_data$mean_colonies / (mean_data$dilution * plated_volume_ml)
# Plot CFU/ml vs time
ggplot(mean_data, aes(x = time, y = CFU_per_ml)) +
  geom_point() +
  geom_errorbar(aes(ymin = CFU_per_ml - sem_colonies / (dilution * plated_volume_ml),
                    ymax = CFU_per_ml + sem_colonies / (dilution * plated_volume_ml)),
                width = 0.05) +
  labs(x = "Time",
       y = "CFU/ml") +
  theme_bw()  # a theme is added as a layer, not passed inside aes()
# Calculate percentage spore survival
unirradiated_data <- mean_data %>%
  filter(time == 0) %>%
  select(dilution, CFU_per_ml)
percentage_survival_data <- mean_data %>%
  left_join(unirradiated_data, by = "dilution") %>%
  mutate(percentage_survival = (CFU_per_ml.x / CFU_per_ml.y) * 100) %>%
  select(-CFU_per_ml.y) %>%
  mutate(percentage_survival = pmin(percentage_survival, 100))
#Plot % survival vs time
ggplot(percentage_survival_data, aes(x = time.x, y = percentage_survival)) +
  geom_point() +
  geom_smooth() +
  labs(x = "Time",
       y = "Percentage Survival") +
  theme_bw()  # a theme is added as a layer, not passed inside aes()
Any ideas on how to solve this?
structure(list(time = c(0, 0, 0, 0.5, 0.5, 0.5, 1, 1, 1, 2, 2,
2, 4, 4, 4, 8, 8, 8, 16, 16, 16), colonies_per_plate = c(141L,
75L, 120L, 126L, 102L, 113L, 101L, 102L, 121L, 10L, 14L, 12L,
12L, 21L, 14L, 3L, 4L, 4L, 1L, 1L, 1L), dilution = c(1e-09, 1e-09,
1e-09, 1e-09, 1e-09, 1e-09, 1e-09, 1e-09, 1e-09, 1e-09, 1e-09,
1e-09, 1e-08, 1e-08, 1e-08, 1e-08, 1e-08, 1e-08, 1e-06, 1e-06,
1e-06), plated_volume_ml = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA,
-21L))
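The point about only three data points surviving the filter can be checked directly against the data above. This is a minimal base-R sketch, assuming the structure is stored as `df1` (the name the answer below uses); `countable` is a name made up for illustration.

```r
# Data copied from the dput() output above (plated volume is 1 ml throughout).
df1 <- data.frame(
  time = rep(c(0, 0.5, 1, 2, 4, 8, 16), each = 3),
  colonies_per_plate = c(141, 75, 120, 126, 102, 113, 101, 102, 121,
                         10, 14, 12, 12, 21, 14, 3, 4, 4, 1, 1, 1),
  dilution = rep(c(1e-09, 1e-08, 1e-06), times = c(12, 6, 3))
)

# Apply the same 30-300 countable-plate rule as the filter() call.
countable <- subset(df1, colonies_per_plate >= 30 & colonies_per_plate <= 300)

# Time points that still have at least one countable plate.
unique(countable$time)  # 0, 0.5 and 1
```

Every plate from time 2 onward has fewer than 30 colonies, so the smoother only ever sees the first three time points.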
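I'll post an "old-school" R answer using the lattice package. By not collapsing the data into means, we can see the individual points in all their random glory. The change in dilution needs to be accounted for by dividing the counts by the dilution value: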
library(lattice)
# make a replicate identifier (1, 2, 3 within each time point)
df1$rep <- 1:3  # will be recycled down the 21 rows
png()
xyplot(colonies_per_plate / dilution ~ time, groups = rep,
       data = df1,
       scales = list(y = list(log = "e")))
dev.off()
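Advantages of this approach: it does less violence to the data by not averaging within replicates, it uses the count and dilution information in a biologically appropriate way, and the straight-line result on the log plot nicely illustrates the exponential relationship.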
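If a fitted curve is still wanted, the straight line on the log scale suggests one way to get it: regress log(CFU/ml) on time using every plate. The sketch below assumes single-exponential decay, which is my reading of the plot rather than something established here, and it keeps the low-count plates that the 30-300 rule would discard, so treat the estimate as rough. The names `cfu_per_ml`, `fit`, `decay_rate` and `fitted_survival` are made up for illustration.

```r
# Data from the question's dput() output, under the name df1 used above.
df1 <- data.frame(
  time = rep(c(0, 0.5, 1, 2, 4, 8, 16), each = 3),
  colonies_per_plate = c(141, 75, 120, 126, 102, 113, 101, 102, 121,
                         10, 14, 12, 12, 21, 14, 3, 4, 4, 1, 1, 1),
  dilution = rep(c(1e-09, 1e-08, 1e-06), times = c(12, 6, 3))
)

# Dilution-corrected count for each plate (plated volume is 1 ml throughout).
df1$cfu_per_ml <- df1$colonies_per_plate / df1$dilution

# Assumption: exponential decay, so log(CFU/ml) is linear in time.
fit <- lm(log(cfu_per_ml) ~ time, data = df1)

# Decay rate per unit of time (the negated slope of the fitted line).
decay_rate <- -coef(fit)[["time"]]

# Fitted percentage survival, relative to the fitted value at time 0.
times <- sort(unique(df1$time))
fitted_survival <- 100 * exp(-decay_rate * times)
data.frame(time = times, fitted_survival = fitted_survival)
```

Because the curve is an exponential in time, it keeps falling toward 0% instead of bending back toward the last point, which is the behaviour the question was after.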