R data frame: creating a weighted average by group


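I am working with an R data frame with the columns

GROUP_COL | TIME | VALUE

TIME is ordered, VALUE is numeric, and GROUP_COL is the categorical variable I want to group the data by. My goal is to: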

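  • First, group by the GROUP_COL variable.
  • Then, sort by TIME.
  • Then, compute a weighted average of the values within each group, using the formula value = 0.1 * previous_value + 0.9 * value for each row. If there is no previous value, leave the value unchanged.
  • Store this weighted value in a separate column, WEIGHTED.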

What I have tried so far: using dplyr, I created a vector of previous values with lag():

weighted_avg_with_previous <- function(.data, lag_weight=0.1) {
  # get previous values
  lag_val <- lag(.data$VALUE, n = 1L, default = 0, order_by = .data$TIME)

  # give each value a weight 0.9 for current value and 0.1 for previous value
  weighted = (1 -lag_weight) * .data$VALUE + lag_weight * lag_val
  return (weighted)
}

data <- data %>% 
  group_by(SALES_RESPONSIBILITY, PRODUCT_AREA, CURRENCY, FORECAST_TYPE) %>% 
  arrange(HORIZON, .by_group=TRUE) %>% 
  mutate(WEIGHTED_VALUE = weighted_avg_with_previous(0.1))

However, the mutate statement raises an error. How can I make my weighted_avg_with_previous function run on each individual group?

Example:

    GROUP | TIME| VALUE | WEIGHTED VALUE
    _____________________________________
     A    |  1  |   1   |     1
     A    |  2  |   2   |     1.9
     A    |  3  |   3   |     2.9
     A    |  4  |   4   |     3.9
     B    |  1  |   3   |     3
     B    |  2  |   7   |     6.6
     B    |  3  |   -4  |     -2.9
     ...

Best, Julia

r group-by dplyr weighted-average
1 answer (3 votes)
library(tidyverse)    
df <- structure(list(GROUP = c("A", "A", "A", "A", "B", "B", "B"),
    TIME = c(1L, 2L, 3L, 4L, 1L, 2L, 3L), VALUE = c(1L, 2L, 3L,
    4L, 3L, 7L, -4L)), row.names = c(NA, -7L),  class = c("tbl_df",
"tbl", "data.frame"))


df %>%
  group_by(GROUP) %>%
  mutate(previous.value = lag(VALUE)) %>%
  mutate(weighted.value = ifelse(is.na(previous.value), VALUE, 0.1 * previous.value + 0.9 * VALUE)) %>%
  select(-previous.value)

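The first mutate() statement creates a new variable holding the lagged VALUE. The second creates weighted.value, which equals 0.1*previous.value + 0.9*VALUE, or VALUE itself when previous.value is NA (the first row of each group). Note that lag() follows the current row order, so the data must already be sorted by TIME within each group, as it is in this example.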

Output:

# A tibble: 7 x 4
# Groups:   GROUP [2]
  GROUP  TIME VALUE weighted.value
  <chr> <int> <int>          <dbl>
1 A         1     1            1
2 A         2     2            1.9
3 A         3     3            2.9
4 A         4     4            3.9
5 B         1     3            3
6 B         2     7            6.6
7 B         3    -4           -2.9
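
To address the original question about running a custom function on each group, one possible sketch is shown below. It assumes the helper is rewritten to take plain vectors rather than a data frame: in the question's attempt, weighted_avg_with_previous(0.1) passes 0.1 as .data, so .data$VALUE fails, and default = 0 would turn the first row into 0.9 * VALUE instead of leaving it unchanged. The function signature and the WEIGHTED column name are illustrative choices, not part of the answer above.

```r
library(dplyr)

# Hypothetical helper: works on a plain numeric vector, so mutate()
# can call it once per group on a grouped data frame.
weighted_avg_with_previous <- function(value, lag_weight = 0.1) {
  # Previous value within the current group; NA on the group's first row
  prev <- dplyr::lag(value)
  # Keep the value unchanged where there is no previous value
  ifelse(is.na(prev), value, lag_weight * prev + (1 - lag_weight) * value)
}

# Same example data as in the question
df <- tibble(
  GROUP = c("A", "A", "A", "A", "B", "B", "B"),
  TIME  = c(1L, 2L, 3L, 4L, 1L, 2L, 3L),
  VALUE = c(1, 2, 3, 4, 3, 7, -4)
)

result <- df %>%
  group_by(GROUP) %>%
  arrange(TIME, .by_group = TRUE) %>%   # sort by TIME within each group
  mutate(WEIGHTED = weighted_avg_with_previous(VALUE, lag_weight = 0.1)) %>%
  ungroup()

print(result)
```

Because mutate() evaluates the call separately for each group, lag() never reaches across a group boundary, and the explicit arrange() means the result does not depend on the input row order.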