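My data frame has two categorical variables, one of which sits at a lower level of the hierarchy than the other. Using dplyr, I want to add a column that holds the sum of a numeric value over all rows belonging to the same higher-level category (here, Country).
Thanks in advance to anyone who can help! This is the data frame I am starting with: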
transportation <- data.frame(
Country = c("A", "A", "A", "B", "B", "B"),
Mode = c("Car", "Train", "Plane", "Car", "Train", "Plane"),
Energy = c(10000, 9000, 20000, 200000, 160000, 450000)
)
This is the data frame I would like to end up with:
country_sum <- data.frame(
Country = c("A", "A", "A", "B", "B", "B"),
Mode = c("Car", "Train", "Plane", "Car", "Train", "Plane"),
Energy = c(10000, 9000, 20000, 200000, 160000, 450000),
country_sum = c(39000, 39000, 39000, 810000, 810000, 810000)
)
With dplyr >= 1.1.0, you can use the .by argument of mutate:
library(dplyr)
transportation %>%
  mutate(country_sum = sum(Energy), .by = Country)
First group by Country, then use mutate with sum:
library(dplyr)
transportation %>%
  group_by(Country) %>%
  mutate(country_sum = sum(Energy))
  Country Mode  Energy country_sum
  <chr>   <chr>  <dbl>       <dbl>
1 A       Car    10000       39000
2 A       Train   9000       39000
3 A       Plane  20000       39000
4 B       Car   200000      810000
5 B       Train 160000      810000
6 B       Plane 450000      810000
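One thing worth noting about the group_by approach: the result stays grouped by Country, so later verbs keep operating per group. A minimal sketch, assuming you want a plain ungrouped result afterwards, adds ungroup() at the end of the pipeline (the name result is only for illustration):

```r
library(dplyr)

transportation <- data.frame(
  Country = c("A", "A", "A", "B", "B", "B"),
  Mode = c("Car", "Train", "Plane", "Car", "Train", "Plane"),
  Energy = c(10000, 9000, 20000, 200000, 160000, 450000)
)

# `result` is an illustrative name, not part of the original answer
result <- transportation %>%
  group_by(Country) %>%                      # one group per country
  mutate(country_sum = sum(Energy)) %>%      # group total repeated on every row
  ungroup()                                  # drop the grouping again

# FALSE means later operations no longer run per Country
is_grouped_df(result)
```

If you use the .by argument instead, the grouping applies only to that single mutate call, so no ungroup() is needed.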
You can also use ave from base R:
dplyr::mutate(transportation, c_sum=ave(Energy, Country, FUN=sum))
#   Country  Mode Energy  c_sum
# 1       A   Car  10000  39000
# 2       A Train   9000  39000
# 3       A Plane  20000  39000
# 4       B   Car 200000 810000
# 5       B Train 160000 810000
# 6       B Plane 450000 810000
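Since ave is a base R function, the same idea should also work without loading dplyr at all. A small sketch, assuming you are happy to add the column by assignment rather than in a pipeline:

```r
transportation <- data.frame(
  Country = c("A", "A", "A", "B", "B", "B"),
  Mode = c("Car", "Train", "Plane", "Car", "Train", "Plane"),
  Energy = c(10000, 9000, 20000, 200000, 160000, 450000)
)

# ave() applies FUN within each level of the grouping variable and
# returns a vector of the same length as its first argument.
# FUN must be passed by name because it comes after `...`.
transportation$country_sum <- ave(
  transportation$Energy,
  transportation$Country,
  FUN = sum
)

transportation
```

Because ave returns one value per input row, the group total is repeated on every row of its group, which matches the layout requested in the question.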