Struggling to plot a curve in ggplot

Question

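I am making some ggplot graphs from data I collected on UV mutagenesis of bacteria. I am trying to fit a curve to the second graph (time vs. survival), and I think it struggles because there are only 3 data points: plates with <30 or >300 colonies are not counted. As a result, the fitted curve kinks upward toward the final data point, whereas in reality survival would keep trending downward toward 0%.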

# install.packages("tidyverse")  # run once if tidyverse is not installed
library(tidyverse)

data <- read.csv("uv_mutagenesis.csv")

# Keep only countable plates (30 to 300 colonies)
filtered_data <- data %>%
  filter(colonies_per_plate >= 30, colonies_per_plate <= 300)

# Mean, SD and SEM of colony counts per time point and dilution
mean_data <- filtered_data %>%
  group_by(time, dilution) %>%
  summarize(mean_colonies = mean(colonies_per_plate),
            sd_colonies = sd(colonies_per_plate),
            sem_colonies = sd(colonies_per_plate) / sqrt(n()))

# CFU/ml
plated_volume_ml <- 1
mean_data$CFU_per_ml <- mean_data$mean_colonies / (mean_data$dilution * plated_volume_ml)

# Plot CFU/ml vs time (error bars: SEM of the counts, scaled to CFU/ml)
ggplot(mean_data, aes(x = time, y = CFU_per_ml)) +
  geom_point() +
  geom_errorbar(aes(ymin = CFU_per_ml - sem_colonies / (dilution * plated_volume_ml),
                    ymax = CFU_per_ml + sem_colonies / (dilution * plated_volume_ml)),
                width = 0.05) +
  labs(x = "Time",
       y = "CFU/ml") +
  theme_bw()  # a theme is added as a layer, not passed inside aes()


# Calculate percentage survival relative to the unirradiated (time 0) plates
# ungroup() first: mean_data is still grouped by time, so select() would
# otherwise keep `time` and the join would produce time.x / time.y
unirradiated_data <- mean_data %>%
  ungroup() %>%
  filter(time == 0) %>%
  select(dilution, CFU_per_ml)

percentage_survival_data <- mean_data %>%
  left_join(unirradiated_data, by = "dilution") %>%
  mutate(percentage_survival = (CFU_per_ml.x / CFU_per_ml.y) * 100) %>%
  select(-CFU_per_ml.y) %>%
  mutate(percentage_survival = pmin(percentage_survival, 100))  # cap at 100%

# Plot % survival vs time
ggplot(percentage_survival_data, aes(x = time, y = percentage_survival)) +
  geom_point() +
  geom_smooth() +  # default smoother; this is the curve that kinks upward
  labs(x = "Time",
       y = "Percentage Survival") +
  theme_bw()

It should look more like the curve here:

Any ideas on how to solve this would be appreciated.

structure(list(time = c(0, 0, 0, 0.5, 0.5, 0.5, 1, 1, 1, 2, 2, 
2, 4, 4, 4, 8, 8, 8, 16, 16, 16), colonies_per_plate = c(141L, 
75L, 120L, 126L, 102L, 113L, 101L, 102L, 121L, 10L, 14L, 12L, 
12L, 21L, 14L, 3L, 4L, 4L, 1L, 1L, 1L), dilution = c(1e-09, 1e-09, 
1e-09, 1e-09, 1e-09, 1e-09, 1e-09, 1e-09, 1e-09, 1e-09, 1e-09, 
1e-09, 1e-08, 1e-08, 1e-08, 1e-08, 1e-08, 1e-08, 1e-06, 1e-06, 
1e-06), plated_volume_ml = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA, 
-21L))
r ggplot2 bioinformatics
1 Answer

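I'm going to post an "old-school" R answer using the lattice package. By not collapsing the data into means, we can see the individual points in all their random glory. The change in dilution needs to be accounted for by dividing the counts by the dilution value: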

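library(lattice)
# df1 is the data frame built from the dput() output in the question
# Make a replicate identifier (three plates per time point)
df1$rep <- rep_len(1:3, nrow(df1))
png()
# Counts divided by dilution give CFU per ml; the y axis is on a natural-log scale
print(xyplot(colonies_per_plate / dilution ~ time, groups = rep,
             data = df1,
             scales = list(y = list(log = "e"))))
dev.off()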

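Advantages of this approach: it does less violence to the data by not taking means within replicates, it uses the count and dilution information in a biologically appropriate way, and the straight-line result on a log plot nicely illustrates the exponential relationship.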
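If the points really do fall on a straight line on a log axis, one way to put a number on it is a linear fit to the log-transformed CFU/ml. The following is a minimal sketch, not part of the original answer. It assumes `df1` holds the question's data (rebuilt here from the `dput()` output), a plated volume of 1 ml, and that every plate is used, including those outside the 30 to 300 countable range, as in the lattice plot. The column `cfu_per_ml` and the variables `fit`, `slope`, `times_out` and `predicted_survival` are names chosen for illustration.

```r
# Rebuild the question's data (same values as the dput() output)
df1 <- data.frame(
  time = rep(c(0, 0.5, 1, 2, 4, 8, 16), each = 3),
  colonies_per_plate = c(141, 75, 120, 126, 102, 113, 101, 102, 121,
                         10, 14, 12, 12, 21, 14, 3, 4, 4, 1, 1, 1),
  dilution = rep(c(1e-09, 1e-08, 1e-06), times = c(12, 6, 3))
)

# CFU per ml for each individual plate (assumed plated volume: 1 ml)
df1$cfu_per_ml <- df1$colonies_per_plate / (df1$dilution * 1)

# Straight-line fit on the log10 scale: log10(CFU/ml) = a + b * time
fit <- lm(log10(cfu_per_ml) ~ time, data = df1)
slope <- coef(fit)[["time"]]   # change in log10(CFU/ml) per unit of time

# Fitted survival relative to the fitted value at time 0, in percent
times_out <- c(0, 1, 2, 4, 8, 16)
predicted_survival <- 100 * 10^(slope * times_out)
print(data.frame(time = times_out, predicted_survival = predicted_survival))

# Optional ggplot2 version of the same idea (only runs if ggplot2 is installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df1, ggplot2::aes(x = time, y = cfu_per_ml)) +
    ggplot2::geom_point() +
    ggplot2::geom_smooth(method = "lm", formula = y ~ x) +  # fitted on the log10 scale
    ggplot2::scale_y_log10() +
    ggplot2::labs(x = "Time", y = "CFU/ml (log scale)")
  print(p)
}
```

Because the fit is done on the log scale, the predicted survival can only decay toward 0% and never kinks back upward, which is the behaviour the question was after. Whether low-count plates should be included in such a fit is a judgment call that depends on the lab protocol.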
