Suppose I have a data frame with the following structure:
fac1 fac2 fac3 val
1 Apple Fresh Red 2
2 Apple Old Red 3
3 Apple Hazard Red 1
4 Banana Fresh Yellow 4
5 Banana Old Yellow 5
6 Banana Hazard Yellow 1
7 Berry Fresh Purple 1
8 Berry Old Purple 1
9 Berry Hazard Purple 3
I want to sum val over the rows where fac2 equals Fresh or Old, separately for each level of the factor fac1, and end up with a data frame like this:
fac1 fac3 sum.freshold
1 Apple Red 5
2 Banana Yellow 9
3 Berry Purple 2
Also, I would like to specify the factor levels by their character labels rather than by their integer codes.
Here is the example:
mydf <- data.frame(fac1 = c(rep("Apple", 3), rep("Banana", 3), rep("Berry", 3)),
                   fac2 = rep(c("Fresh", "Old", "Hazard"), 3),
                   fac3 = c(rep("Red", 3), rep("Yellow", 3), rep("Purple", 3)),
                   val = c(2, 3, 1, 4, 5, 1, 1, 1, 3),
                   stringsAsFactors = TRUE)
One of my attempts does not work, and it would not produce a data.frame anyway:
tapply(mydf$val, mydf$fac1, function(x) {x[mydf$fac2 == "Fresh"] + x[mydf$fac2 == "Old"]})
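As far as I can tell, the attempt above fails because inside tapply the argument x holds only the val entries of one fac1 group (length 3), while mydf$fac2 == "Fresh" is still a logical vector over all 9 rows, so the index no longer lines up with x and NA values appear. A minimal sketch of one way around it, assuming it is acceptable to filter the rows first; the names sub, sums and out are only illustrative, and fac3 is not carried along here:

```r
# Rebuild the example data so the snippet runs on its own.
mydf <- data.frame(fac1 = rep(c("Apple", "Banana", "Berry"), each = 3),
                   fac2 = rep(c("Fresh", "Old", "Hazard"), 3),
                   fac3 = rep(c("Red", "Yellow", "Purple"), each = 3),
                   val = c(2, 3, 1, 4, 5, 1, 1, 1, 3),
                   stringsAsFactors = TRUE)

# Filter by the character labels of fac2 BEFORE grouping, so the values
# and the grouping factor passed to tapply have the same length.
sub <- mydf[mydf$fac2 %in% c("Fresh", "Old"), ]

# One sum per level of fac1; the result is a named 1-d array.
sums <- tapply(sub$val, sub$fac1, sum)

# Convert the named array into a data frame.
out <- data.frame(fac1 = names(sums), sum.freshold = as.vector(sums))
print(out)
```

Comparing against "Fresh" and "Old" with %in% works on the labels of the factor, so no integer level codes are needed.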
An approach using aggregate:
aggregate(val ~ fac1 + fac3, mydf[mydf$fac2 %in% c("Fresh", "Old"),], sum)
fac1 fac3 val
1 Berry Purple 2
2 Apple Red 5
3 Banana Yellow 9
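The aggregate result has the right sums, but the column is still called val and the rows come out ordered by fac3. A small sketch of how the output could be brought into the requested shape, assuming the column name sum.freshold and ordering by fac1 are what is wanted; keep and res are illustrative names:

```r
# Rebuild the example data so the snippet runs on its own.
mydf <- data.frame(fac1 = rep(c("Apple", "Banana", "Berry"), each = 3),
                   fac2 = rep(c("Fresh", "Old", "Hazard"), 3),
                   fac3 = rep(c("Red", "Yellow", "Purple"), each = 3),
                   val = c(2, 3, 1, 4, 5, 1, 1, 1, 3),
                   stringsAsFactors = TRUE)

# Select rows by the character labels of fac2, then sum val per fac1/fac3 pair.
keep <- mydf$fac2 %in% c("Fresh", "Old")
res <- aggregate(val ~ fac1 + fac3, data = mydf[keep, ], FUN = sum)

# Rename the summed column and sort by fac1 to match the desired layout.
names(res)[names(res) == "val"] <- "sum.freshold"
res <- res[order(res$fac1), ]
rownames(res) <- NULL
print(res)
```

Because fac3 is constant within each fac1 group in this data, adding it to the formula only carries the column along and does not create extra groups.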