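I am working with water quality data that contains the sampling location, the sampling date, and the result for each sample. An advisory is triggered whenever a sample result is greater than 104. Each advisory is counted once, but it lasts for however many days pass until a new sample shows a result at or below 104. The length of an advisory is counted from the first day that triggered it up to, but not including, the date of the result at or below 104. I am looking for a way to calculate the total number of advisories for each location and the average advisory length (in days). Thanks!!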
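library(dplyr)  # `.by` in summarize() needs dplyr >= 1.1.0

# Count the samples above 104 at each site. Every exceeding sample is counted,
# so consecutive exceedances are not merged into a single advisory.
df %>%
  summarize(advisory_period = sum(RESULT > 104), .by = SITE_ID) %>%
  bind_rows(
    # Append one extra row holding the mean of the per-site counts
    df %>%
      summarize(advisory_period = sum(RESULT > 104), .by = SITE_ID) %>%
      summarize(SITE_ID = 'average_period',
                advisory_period = mean(advisory_period)
      ))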
#> # A tibble: 4 × 2
#>   SITE_ID        advisory_period
#>   <chr>                    <dbl>
#> 1 ARA001                       4
#> 2 BRA001                       3
#> 3 CRA001                       5
#> 4 average_period               4
# Data (run this first so that df exists before the code above)
df <- structure(list(SITE_ID = c("ARA001", "ARA001", "ARA001", "ARA001",
"ARA001", "ARA001", "ARA001", "BRA001", "BRA001", "BRA001", "BRA001",
"BRA001", "BRA001", "BRA001", "CRA001", "CRA001", "CRA001", "CRA001",
"CRA001", "CRA001", "CRA001", "CRA001", "CRA001"), RESULT = c(13,
107, 10, 115, 120, 110, 80, 5, 5, 112, 105, 5, 118, 5, 154, 180,
2000, 1543, 103, 100, 80, 112, 5), DATE = c("1/3/2023", "1/17/2023",
"1/30/2023", "2/13/2023", "2/27/2023", "3/6/2023", "3/7/2023",
"1/3/2023", "1/18/2023", "1/25/2023", "2/7/2023", "2/17/2023",
"3/28/2023", "3/7/2023", "1/2/2023", "1/5/2023", "1/6/2023",
"1/7/2023", "1/9/2023", "2/6/2023", "2/17/2023", "2/21/2023",
"3/6/2023")), row.names = c(NA, -23L), spec = structure(list(
cols = list(`SITE ID` = structure(list(), class = c("collector_character",
"collector")), RESULT = structure(list(), class = c("collector_double",
"collector")), DATE = structure(list(), class = c("collector_character",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = ","), class = "col_spec"), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"))
Created on 2024-03-21 with reprex v2.0.2
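The question describes an advisory as a run of consecutive samples above 104 that ends at the first sample at or below 104. The code below is a minimal sketch of one way to read that rule, not the code posted above. It rebuilds the same sample data so that it runs on its own, and it rests on several assumptions: dates are in month/day/year format, samples are sorted by date within each site before runs are detected, and an advisory that is still open at the last sample of a site is counted but left out of the average length. The names `advisories`, `advisory_id`, `length_days`, `n_advisories`, and `mean_length_days` are made up for illustration.

```r
library(dplyr)

# Same values as the sample data in this thread, rebuilt compactly
df <- data.frame(
  SITE_ID = rep(c("ARA001", "BRA001", "CRA001"), times = c(7, 7, 9)),
  RESULT = c(13, 107, 10, 115, 120, 110, 80,
             5, 5, 112, 105, 5, 118, 5,
             154, 180, 2000, 1543, 103, 100, 80, 112, 5),
  DATE = c("1/3/2023", "1/17/2023", "1/30/2023", "2/13/2023", "2/27/2023",
           "3/6/2023", "3/7/2023",
           "1/3/2023", "1/18/2023", "1/25/2023", "2/7/2023", "2/17/2023",
           "3/28/2023", "3/7/2023",
           "1/2/2023", "1/5/2023", "1/6/2023", "1/7/2023", "1/9/2023",
           "2/6/2023", "2/17/2023", "2/21/2023", "3/6/2023")
)

advisories <- df %>%
  mutate(DATE = as.Date(DATE, format = "%m/%d/%Y")) %>%  # assumed m/d/Y
  arrange(SITE_ID, DATE) %>%                             # assumed: order by date
  group_by(SITE_ID) %>%
  mutate(
    above = RESULT > 104,
    # An advisory starts when a sample is above 104 and the previous one was not
    start = above & !lag(above, default = FALSE),
    advisory_id = cumsum(start),
    # Date of the following sample at the same site (NA for the last sample)
    next_date = lead(DATE)
  ) %>%
  filter(above) %>%
  group_by(SITE_ID, advisory_id) %>%
  summarize(
    start_date = first(DATE),
    # The sample after the last exceedance closes the advisory;
    # NA means the advisory is still open at the end of the data
    end_date = last(next_date),
    .groups = "drop"
  ) %>%
  mutate(length_days = as.numeric(end_date - start_date))

summary_by_site <- advisories %>%
  group_by(SITE_ID) %>%
  summarize(
    n_advisories = n(),
    # Assumption: open advisories (NA length) are excluded from the mean
    mean_length_days = mean(length_days, na.rm = TRUE),
    .groups = "drop"
  )

print(summary_by_site)
```

With this data, each site ends up with two advisories, and the average lengths are 17.5 days for ARA001, 23 days for BRA001 (its second advisory starts on 3/28/2023 and never closes, so only the first contributes), and 10 days for CRA001. If open advisories should instead run to some cutoff date, `end_date` could be filled with that date before computing `length_days`.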