I am trying to calculate means for four groups. My data frame looks similar to the following:
Sex <- c("F", "F", "M", "M", "F")
Phenotype <- c("Control", "Experimental", "Experimental", "Control", "Control")
MOp_Amygdala <- c("10", "15", "2", "6", "8")
MOp_Thalamus <- c("19", "12", "4", "4", "6")
MOp_Cerebellum <- c("34", "45", "67", "78", "99")
MOq_Cortex <- c("2", "5", "6", "17", "2")
MOq_Striatum <- c("100", "101", "102", "106", "200")
df <- data.frame(Sex, Phenotype, MOp_Amygdala, MOp_Thalamus, MOp_Cerebellum, MOq_Cortex, MOq_Striatum)
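I want to find the means of the amygdala, thalamus, and cerebellum columns for my four groups: M-Control, M-Experimental, F-Control, and F-Experimental.
Here is what I have tried so far: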
Q1 <- data %>%
  group_by(Sex, Phenotype) %>%
  select(starts_with("MOp")) %>%
  rowwise() %>%
  mutate(Group_Means = mean(c(MOp_Amygdala, MOp_Thalamus, MOp_Cerebellum))) #redundant
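The main problem with my output is that group_by does not seem to work. I end up with 5 observations, one per sample, instead of 4 observations (M-Control, M-Experimental, F-Control, and F-Experimental).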
You can compute the mean for each group by summarising the data frame.
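As a rough illustration of why the attempt in the question keeps five rows: mutate() always returns one row per input row, and rowwise() makes the following step work one row at a time, so neither can collapse the data to one row per group. summarise() is the verb that does that. The sketch below rebuilds the five-row example from the question under a hypothetical name, df_small, with the phenotype labels quoted. It also assumes the measurements are stored as strings, as in the question, so they are converted with as.numeric() first, because mean() on character values returns NA with a warning.

```r
library(dplyr)

# Hypothetical copy of the five-row example from the question
# (measurements kept as character strings, as posted)
df_small <- data.frame(
  Sex = c("F", "F", "M", "M", "F"),
  Phenotype = c("Control", "Experimental", "Experimental", "Control", "Control"),
  MOp_Amygdala = c("10", "15", "2", "6", "8"),
  MOp_Thalamus = c("19", "12", "4", "4", "6"),
  MOp_Cerebellum = c("34", "45", "67", "78", "99")
)

# mutate() adds a column but keeps every input row, grouped or not
kept <- df_small %>%
  group_by(Sex, Phenotype) %>%
  mutate(n_in_group = n()) %>%
  ungroup()
nrow(kept)       # 5 rows, one per sample

# summarise() collapses each group to a single row
collapsed <- df_small %>%
  mutate(across(starts_with("MOp"), as.numeric)) %>%  # strings to numbers
  group_by(Sex, Phenotype) %>%
  summarise(across(starts_with("MOp"), mean), .groups = "drop")
nrow(collapsed)  # 4 rows, one per Sex/Phenotype combination
```

The group_by() plus summarise() form used here should also work on dplyr versions that predate the .by argument.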
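I modified your input data, making the measurements numeric and adding randomly sampled rows so that each group has several observations: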
Sex <- rep(c("F", "F", "M", "M", "F"), 5)
Phenotype <- rep(c('Control', 'Experimental', 'Experimental', 'Control', 'Control'), 5)
MOp_Amygdala <- c(10, 15, 2, 6, 8, sample(seq(1,20,1), 20, replace = TRUE))
MOp_Thalamus <- c(19, 12, 4, 4, 6, sample(seq(1,20,1), 20, replace = TRUE))
MOp_Cerebellum <- c(34, 45, 67, 78, 99, sample(seq(20,100,1), 20, replace = TRUE))
MOq_Cortex <- c(2, 5, 6, 17, 2, sample(seq(1,20,1), 20, replace = TRUE))
MOq_Striatum <- c(100, 101, 102, 106, 200, sample(seq(100,200,1), 20, replace = TRUE))
df <- data.frame(Sex, Phenotype, MOp_Amygdala, MOp_Thalamus, MOp_Cerebellum, MOq_Cortex, MOq_Striatum)
library(tidyverse)
glimpse(df)
#> Rows: 25
#> Columns: 7
#> $ Sex <chr> "F", "F", "M", "M", "F", "F", "F", "M", "M", "F", "F", …
#> $ Phenotype <chr> "Control", "Experimental", "Experimental", "Control", "…
#> $ MOp_Amygdala <dbl> 10, 15, 2, 6, 8, 16, 6, 14, 3, 2, 16, 20, 15, 15, 2, 8,…
#> $ MOp_Thalamus <dbl> 19, 12, 4, 4, 6, 14, 9, 12, 2, 9, 17, 17, 4, 7, 16, 9, …
#> $ MOp_Cerebellum <dbl> 34, 45, 67, 78, 99, 73, 21, 94, 30, 75, 54, 80, 48, 27,…
#> $ MOq_Cortex <dbl> 2, 5, 6, 17, 2, 6, 8, 5, 10, 4, 7, 14, 8, 1, 12, 11, 12…
#> $ MOq_Striatum <dbl> 100, 101, 102, 106, 200, 192, 193, 162, 121, 198, 109, …
Here is one way to compute the per-group mean of those three columns:
df %>%
summarise(across(starts_with('MOp'), mean),
.by = c(Sex, Phenotype))
#>   Sex    Phenotype MOp_Amygdala MOp_Thalamus MOp_Cerebellum
#> 1   F      Control          7.9         12.3           61.5
#> 2   F Experimental         14.6         13.2           52.6
#> 3   M Experimental         12.4         10.8           73.6
#> 4   M      Control          7.2          5.4           50.4
Created on 2023-07-24 with reprex v2.0.2
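The Group_Means column in the question pools the three MOp regions into one number, so it is possible that a single mean per group was wanted rather than one mean per region. If so, one option is to reshape the MOp columns to long format and summarise once. This is only a sketch under that assumption: df_num, Region, Value, and Group_Mean are names made up for the illustration, and the data are the five rows from the question entered as numbers. Note that the .by argument needs dplyr 1.1.0 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical numeric version of the five rows from the question
df_num <- data.frame(
  Sex = c("F", "F", "M", "M", "F"),
  Phenotype = c("Control", "Experimental", "Experimental", "Control", "Control"),
  MOp_Amygdala = c(10, 15, 2, 6, 8),
  MOp_Thalamus = c(19, 12, 4, 4, 6),
  MOp_Cerebellum = c(34, 45, 67, 78, 99)
)

# Stack the three MOp columns into one Value column, then average
# every value that falls into the same Sex/Phenotype group
group_means <- df_num %>%
  pivot_longer(starts_with("MOp"), names_to = "Region", values_to = "Value") %>%
  summarise(Group_Mean = mean(Value), .by = c(Sex, Phenotype))

group_means
```

Because every sample contributes all three regions here, this pooled mean is the same as averaging the three per-region group means. That would no longer hold if some measurements were missing.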