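I want to compute the standard deviation by group in R (in this case, rows with the same genus and species). However, some of my groups consist of only n = 1 observation, so no standard deviation can be computed for them.
Here is a random dataset similar to my real one: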
x = structure(list(V1 = structure(c(1L, 2L, 2L, 3L, 3L), .Label = c("Genus1",
"Genus2", "Genus3"), class = "factor"), V2 = structure(c(1L,
2L, 2L, 3L, 3L), .Label = c("Species1", "Species2", "Species3"
), class = "factor"), V3 = c(6.32, 8.43, 8.31, 9.29, 8.96)), class = "data.frame", row.names = c(NA, -5L))
The desired output would be something like this. I'm not sure of the best way to handle groups with n = 1.
V1      V2        V3
Genus1  Species1
Genus2  Species2  0.084852814
Genus2  Species2
Genus3  Species3  0.233345238
Genus3  Species3
Base R
aggregate(x$V3, x[,c("V1","V2")], sd)
#       V1       V2          x
# 1 Genus1 Species1         NA
# 2 Genus2 Species2 0.08485281
# 3 Genus3 Species3 0.23334524
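If the per-row layout from the question is what is wanted (the value on the first row of each group, nothing on the rest), one possible base R sketch uses ave() to attach the group standard deviation to every row and then blanks the repeated rows. The column name sd_group is only an illustrative choice, and NA stands in for the empty cells. Note that sd() of a single value is NA, so n = 1 groups need no special handling.

```r
# Rebuild the example data from the question
x <- data.frame(
  V1 = factor(c("Genus1", "Genus2", "Genus2", "Genus3", "Genus3")),
  V2 = factor(c("Species1", "Species2", "Species2", "Species3", "Species3")),
  V3 = c(6.32, 8.43, 8.31, 9.29, 8.96)
)

# ave() applies FUN within each V1/V2 group and returns one value per row;
# for a group with a single observation, sd() returns NA
x$sd_group <- ave(x$V3, x$V1, x$V2, FUN = sd)

# Keep the value only on the first row of each group (illustrative layout choice)
x$sd_group[duplicated(x[, c("V1", "V2")])] <- NA

print(x)
```

Whether the repeated rows should be NA or carry the group value depends on what the table is used for downstream; dropping the duplicated() line keeps the value on every row.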
data.table approach:
library(data.table)
as.data.table(x)[, .(sigma = sd(V3)), by = .(V1, V2)]
#        V1       V2      sigma
# 1: Genus1 Species1         NA
# 2: Genus2 Species2 0.08485281
# 3: Genus3 Species3 0.23334524
Using dplyr, we group by 'V1' and 'V2' and apply sd to 'V3' to get the standard deviation of 'V3' within each group:
library(dplyr)
x %>%
  group_by(V1, V2) %>%
  summarise(V3 = sd(V3))
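To make the NA for n = 1 groups easier to interpret, the group size can be reported alongside the standard deviation. The following is a minimal sketch, assuming dplyr 1.0.0 or later (for the .groups argument); the column names n_obs and sd_V3 are illustrative and not part of the original answer.

```r
library(dplyr)

# Rebuild the example data from the question
x <- data.frame(
  V1 = factor(c("Genus1", "Genus2", "Genus2", "Genus3", "Genus3")),
  V2 = factor(c("Species1", "Species2", "Species2", "Species3", "Species3")),
  V3 = c(6.32, 8.43, 8.31, 9.29, 8.96)
)

res <- x %>%
  group_by(V1, V2) %>%
  summarise(
    n_obs = n(),      # number of rows in the group
    sd_V3 = sd(V3),   # NA when the group has a single row
    .groups = "drop"  # return an ungrouped result
  )

print(res)
```

Rows with n_obs equal to 1 can then be filtered out or flagged, depending on how single-observation groups should be treated.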