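I took this example (Calculating moving average) and successfully incorporated it into my code. I need to compute a rolling mean and a rolling median (which I have already done), but my dataset is very large and I need to add an auxiliary variable to filter by. The example below computes a rolling mean over a 10-day dataset. What if there were 10 days of data at several different locations, and the rolling mean had to be computed separately for each location?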
library(tidyverse)
library(zoo)
some_data = tibble(day = 1:10)
# roll_mean   = centered moving average (rollmean defaults to align = "center")
# roll_median = trailing moving median (align = "right")
some_data = some_data %>%
  mutate(roll_mean = rollmean(day, k = 3, fill = NA)) %>%
  mutate(roll_median = rollmedian(day, k = 3, fill = NA, align = "right"))
some_data
You can group by location:
library(tidyverse)
library(zoo)
# Two locations, five days each
some_data <- rbind(tibble(day = 1:5, location = rep("A", 5)),
                   tibble(day = 1:5, location = rep("B", 5)))
# Each column is named after the alignment it uses
some_data <- some_data %>%
  group_by(location) %>%
  mutate(roll_mean_left    = rollmean(day, k = 3, fill = NA, align = "left"),
         roll_mean_center  = rollmean(day, k = 3, fill = NA, align = "center"),
         roll_median_right = rollmedian(day, k = 3, fill = NA, align = "right"))
some_data
The rolling functions restart within each location, so no window spans two locations. Note how the window shifts with the align argument:
     day location roll_mean_left roll_mean_center roll_median_right
   <int> <chr>             <dbl>            <dbl>             <dbl>
 1     1 A                     2               NA                NA
 2     2 A                     3                2                NA
 3     3 A                     4                3                 2
 4     4 A                    NA                4                 3
 5     5 A                    NA               NA                 4
 6     1 B                     2               NA                NA
 7     2 B                     3                2                NA
 8     3 B                     4                3                 2
 9     4 B                    NA                4                 3
10     5 B                    NA               NA                 4
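The grouped approach above assumes the rows are already in day order within each location. A minimal sketch of the same idea on unsorted data follows; the names `readings`, `value`, `mean_3d` and `mean_3d_ungrouped` are hypothetical and chosen only for illustration. It sorts first, computes a trailing 3-day mean per location, and then computes the same mean without grouping so the two can be compared.

```r
library(dplyr)
library(zoo)

# Hypothetical data: two locations, five days each, rows deliberately shuffled
readings <- tibble(
  location = rep(c("A", "B"), each = 5),
  day      = rep(1:5, times = 2),
  value    = c(10, 20, 30, 40, 50, 100, 200, 300, 400, 500)
)[c(7, 2, 10, 4, 1, 6, 3, 9, 5, 8), ]

rolled <- readings %>%
  arrange(location, day) %>%            # rolling windows follow row order, so sort first
  group_by(location) %>%
  mutate(mean_3d = rollmean(value, k = 3, fill = NA, align = "right")) %>%
  ungroup() %>%
  # Same calculation without grouping, kept only for comparison:
  # the first windows of location B also include the last rows of location A
  mutate(mean_3d_ungrouped = rollmean(value, k = 3, fill = NA, align = "right"))

rolled
```

In `mean_3d` the first two rows of each location are NA because a trailing window of three days is not yet full, whereas `mean_3d_ungrouped` has values there for location B that mix in data from location A. Calling `ungroup()` afterwards is a design choice so that later steps do not silently keep operating per location.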