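For each year, I want to create two new columns, temp_count and rh_count, that count how many times each category occurs in the temp_catog and humidity_catog columns, respectively. The question "How to count how many values per level in a given factor?" answers this when grouping by one variable, but I would like to use group_by(year, humidity_catog, temp_catog). Here is a screenshot of my data.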
I can use the following code to create a column, humidity_count, that counts how many times each category occurs in the humidity_catog column.
# count rows per year and humidity category (one output row per group)
df <- df %>%
  group_by(year, humidity_catog) %>%
  summarize(humidity_count = n())
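This is the output.
However, I would like to create another column, temp_count, in the same data frame that counts how many times each category occurs in the temp_catog column. How can I do this? Here is a reproducible example of my data, created with the dput function.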
df <- structure(
list(
year = structure(
c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L),
.Label = c(
"2006",
"2007",
"2012",
"2013",
"2014",
"2014_c",
"2015_a",
"2015_b",
"2016",
"2017",
"2020"
),
class = "factor"
),
min_rh = c(47.9, 49, 44.7, 40.2, 50, 52.3, 51.5, 82.8, 73.8,
47.1),
min_temp = c(12.4, 14.3, 15.1, 16.1, 12.7, 16.1, 14.4,
15.1, 11.8, 9.5),
temp_catog = structure(
c(2L, 2L, 3L, 3L,
2L, 3L, 2L, 3L, 2L, 2L),
.Label = c("T1(<=8)", "T2(>8, <=15)",
"T3(>15)"),
class = "factor"
),
humidity_catog = structure(
c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L),
.Label = c("RH1(<=65)",
"RH2(>65)"),
class = "factor"
)
),
class = c("grouped_df",
"tbl_df", "tbl", "data.frame"),
row.names = c(NA,-10L),
groups = structure(
list(
year = structure(
1L,
.Label = c(
"2006",
"2007",
"2012",
"2013",
"2014",
"2014_c",
"2015_a",
"2015_b",
"2016",
"2017",
"2020"
),
class = "factor"
),
.rows = structure(
list(1:10),
ptype = integer(0),
class = c("vctrs_list_of",
"vctrs_vctr", "list")
)
),
class = c("tbl_df", "tbl", "data.frame"),
row.names = c(NA,-1L),
.drop = TRUE
)
)
Note: I don't want the number of unique values. I just want to count how many times each category was recorded.
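I'm not sure how the OP would merge the two summarised results, but we can call mutate instead of summarise and supply the grouping variables to the .by argument, one call per count. Unlike summarise, mutate keeps every row and simply adds the group size as a new column.
Note: the toy data frame is grouped by year, so I ungroup it first.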
library(dplyr) # the .by argument requires dplyr >= 1.1.0
df %>%
ungroup() %>%
mutate(rh_count = n(), .by = c(year, humidity_catog)) %>%
mutate(temp_count = n(), .by = c(year, temp_catog))
# A tibble: 10 × 7
   year  min_rh min_temp temp_catog   humidity_catog rh_count temp_count
   <fct>  <dbl>    <dbl> <fct>        <fct>             <int>      <int>
 1 2006    47.9     12.4 T2(>8, <=15) RH1(<=65)             8          6
 2 2006    49       14.3 T2(>8, <=15) RH1(<=65)             8          6
 3 2006    44.7     15.1 T3(>15)      RH1(<=65)             8          4
 4 2006    40.2     16.1 T3(>15)      RH1(<=65)             8          4
 5 2006    50       12.7 T2(>8, <=15) RH1(<=65)             8          6
 6 2006    52.3     16.1 T3(>15)      RH1(<=65)             8          4
 7 2006    51.5     14.4 T2(>8, <=15) RH1(<=65)             8          6
 8 2006    82.8     15.1 T3(>15)      RH2(>65)              2          4
 9 2006    73.8     11.8 T2(>8, <=15) RH2(>65)              2          6
10 2006    47.1      9.5 T2(>8, <=15) RH1(<=65)             8          6
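If an older dplyr without the .by argument is installed, the same per-row counts can probably be obtained with add_count(), which appends a group-size column without collapsing rows. The following is only a minimal sketch: the toy data frame and its shortened category labels (T2, T3, RH1, RH2) are stand-ins I made up to mirror the question's data, not the OP's actual object.

```r
library(dplyr)

# Hypothetical stand-in for the question's data: same category pattern,
# shortened labels, and no pre-existing grouping.
toy <- data.frame(
  year = factor(rep("2006", 10)),
  temp_catog = factor(c("T2", "T2", "T3", "T3", "T2", "T3", "T2", "T3", "T2", "T2")),
  humidity_catog = factor(c("RH1", "RH1", "RH1", "RH1", "RH1", "RH1", "RH1", "RH2", "RH2", "RH1"))
)

# add_count() keeps every row and adds the size of each
# (year, category) combination as a new column.
res <- toy %>%
  add_count(year, humidity_catog, name = "rh_count") %>%
  add_count(year, temp_catog, name = "temp_count")

print(res)
```

With the real data, call ungroup() first, as in the answer above, so that the existing grouping by year does not interfere.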