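I need the sums of the x values on either side of each 1 in the y column. The windows before and after each occurrence of 1 in y are offsets 1:4 and 5:8. However, when a window falls outside the limits of df, 0 is returned. That is a problem, because 0 can also be a valid result. The full data has multiple groups (id), but hopefully this reprex is enough to help build a scalable solution.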
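I have never used zoo before, so I can't figure out how to combine na.rm = TRUE with returning NA when the window lies entirely outside the range of df. The solution doesn't have to use zoo, but I would prefer an answer based on dplyr verbs if possible.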
Data and packages:
library(dplyr)
library(zoo)
set.seed(1)
df <- data.frame(id = rep("A", 40),
                 x = sample(0:3, 40, replace = TRUE),
                 y = 0)
df[c(2, 8, 9, 30, 33, 39), "y"] <- 1
What I have tried:
w <- 4
df %>%
  mutate(bf2 = ifelse(y == 1, rollapply(lag(x, 5), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
         bf1 = ifelse(y == 1, rollapply(lag(x, 1), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
         af1 = ifelse(y == 1, rollapply(lead(x, 1), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA),
         af2 = ifelse(y == 1, rollapply(lead(x, 5), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA)) %>%
  filter(y == 1)
id x y bf2 bf1 af1 af2
1 A 3 1 0 0 3 6
2 A 2 1 5 3 6 1
3 A 1 1 5 5 5 2
4 A 1 1 2 1 5 7
5 A 0 1 1 3 8 4
6 A 1 1 5 7 1 0
Desired output:
id x y bf2 bf1 af1 af2
1 A 3 1 NA 0 3 6
2 A 2 1 5 3 6 1
3 A 1 1 5 5 5 2
4 A 1 1 2 1 5 7
5 A 0 1 1 3 8 4
6 A 1 1 5 7 1 NA
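You can always add a check to your condition, so that a window lying entirely outside the data frame gives NA instead of 0: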
# `w` is the window width (4) defined in the question
df %>%
  mutate(
    # Same rolling sums as before; the extra row_number() test returns NA
    # whenever the whole window would fall outside the data frame
    bf2 = ifelse(y & row_number() > 5, rollapply(lag(x, 5), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
    bf1 = ifelse(y & row_number() > 1, rollapply(lag(x, 1), width = w, sum, fill = NA, align = "right", partial = TRUE, na.rm = TRUE), NA),
    af1 = ifelse(y & row_number() + 1 <= n(), rollapply(lead(x, 1), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA),
    af2 = ifelse(y & row_number() + 5 <= n(), rollapply(lead(x, 5), width = w, sum, fill = NA, align = "left", partial = TRUE, na.rm = TRUE), NA)
  ) %>%
  filter(y == 1)
Output:
id x y bf2 bf1 af1 af2
1 A 3 1 NA 0 3 6
2 A 2 1 5 3 6 1
3 A 1 1 5 5 5 2
4 A 1 1 2 1 5 7
5 A 0 1 1 3 8 4
6 A 1 1 5 7 1 NA
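
Since the solution does not have to use zoo, the same rule (sum whatever part of the window exists, but return NA when none of it does) can also be written directly with index arithmetic. The following is only a minimal base R sketch, not part of the answer above: window_sum is a hypothetical helper name, and the x and y vectors are made-up toy data chosen so the sums are easy to check by hand.

```r
# Hypothetical helper (not from the original post): sum x over the positions
# i + offsets, dropping positions that fall outside 1..length(x).
# Returns NA when the whole window lies outside the vector.
window_sum <- function(x, i, offsets) {
  idx <- i + offsets
  idx <- idx[idx >= 1 & idx <= length(x)]
  if (length(idx) == 0) return(NA_real_)
  sum(x[idx], na.rm = TRUE)
}

# Made-up toy data (assumption): 10 rows with a 1 in rows 2 and 9
x <- c(3, 0, 2, 1, 0, 3, 1, 2, 0, 1)
y <- c(0, 1, 0, 0, 0, 0, 0, 0, 1, 0)
hits <- which(y == 1)

res <- data.frame(
  row = hits,
  bf2 = vapply(hits, function(i) window_sum(x, i, -8:-5), numeric(1)),  # rows i-8 .. i-5
  bf1 = vapply(hits, function(i) window_sum(x, i, -4:-1), numeric(1)),  # rows i-4 .. i-1
  af1 = vapply(hits, function(i) window_sum(x, i, 1:4), numeric(1)),    # rows i+1 .. i+4
  af2 = vapply(hits, function(i) window_sum(x, i, 5:8), numeric(1))     # rows i+5 .. i+8
)
print(res)
```

For row 2 the bf2 window is entirely before the first row, so it is NA, while bf1 only partly overlaps the data and is the sum of what exists. For data with several id groups, the dplyr version above should presumably be preceded by group_by(id), so that row_number(), n(), lag() and lead() are all evaluated within each group.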