Computing a row-wise weighted sum over a set of columns


Say I have the following data frame:

> library(tidyverse)
> dd <- tibble(a = rep(1,10), b = rep(1,10), c = rep(1,10))
> dd
# A tibble: 10 × 3
       a     b     c
   <dbl> <dbl> <dbl>
 1     1     1     1
 2     1     1     1
 3     1     1     1
 4     1     1     1
 5     1     1     1
 6     1     1     1
 7     1     1     1
 8     1     1     1
 9     1     1     1
10     1     1     1

and a weight vector:

> weight <- c(1, 5, 10)
> weight
[1]  1  5 10

When I want to compute the row-wise weighted sum over all columns of the data frame, I do this:

> dd %>% mutate(m = rowSums(map2_dfc(dd, weight,`*`)))
# A tibble: 10 × 4
       a     b     c     m
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     1    16
 2     1     1     1    16
 3     1     1     1    16
 4     1     1     1    16
 5     1     1     1    16
 6     1     1     1    16
 7     1     1     1    16
 8     1     1     1    16
 9     1     1     1    16
10     1     1     1    16

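But I don't know how to compute the row-wise weighted sum for only a subset of the columns. I tried the code below, but the result is a mess: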

> dd %>% rowwise() %>% mutate(m = rowwise(map2_dfc(c_across(b:c), weight[2:3],`*`)))
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
# A tibble: 10 × 4
# Rowwise: 
       a     b     c m$...1 $...2
   <dbl> <dbl> <dbl>  <dbl> <dbl>
 1     1     1     1      5    10
 2     1     1     1      5    10
 3     1     1     1      5    10
 4     1     1     1      5    10
 5     1     1     1      5    10
 6     1     1     1      5    10
 7     1     1     1      5    10
 8     1     1     1      5    10
 9     1     1     1      5    10
10     1     1     1      5    10

Can someone give me a hint on how to solve this? Many thanks.

r dplyr tidyverse purrr
1 Answer

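We can turn 'weight' into a named vector, loop with across() over columns 'b' to 'c', subset the 'weight' values by the current column name (cur_column()), multiply, and then take the rowSums: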

library(dplyr)
names(weight) <- names(dd)
dd %>% 
   mutate(m = rowSums(across(b:c,  ~ .x * weight[cur_column()])))

Output:

# A tibble: 10 × 4
       a     b     c     m
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     1    15
 2     1     1     1    15
 3     1     1     1    15
 4     1     1     1    15
 5     1     1     1    15
 6     1     1     1    15
 7     1     1     1    15
 8     1     1     1    15
 9     1     1     1    15
10     1     1     1    15
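
To see why the named vector matters: inside across(), cur_column() returns the name of the column currently being processed, so weight[cur_column()] looks the weight up by name rather than by position. The following is only a minimal base R sketch of that same name-based lookup, not part of the answer's code; it uses a plain data.frame with the same values as dd, and the variable names cols and weighted are introduced here purely for illustration.

```r
# Same values as the question's dd, as a plain data.frame (no packages needed)
dd <- data.frame(a = rep(1, 10), b = rep(1, 10), c = rep(1, 10))
weight <- c(a = 1, b = 5, c = 10)   # named weights, one per column

cols <- c("b", "c")                 # hypothetical: the subset of columns to use

# Pair each selected column with the weight that shares its name and multiply
weighted <- Map(function(col, w) col * w, dd[cols], weight[cols])

# Add the weighted columns together element-wise, giving one sum per row
m <- Reduce(`+`, weighted)
m
```

Because the lookup is by name, reordering the columns or picking a non-contiguous subset does not require recomputing positional indices such as weight[2:3].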

Or, if we want to use rowwise() (not recommended, because it is slower):

dd %>% 
  rowwise() %>%
  mutate(m = sum(c_across(b:c) * weight[2:3])) %>%
  ungroup()

Or with base R:

dd$m <-  rowSums(dd[2:3] * weight[2:3][col(dd[2:3])])
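
Since a row-wise weighted sum is just the dot product of each row with the weights, the base R version can arguably also be written as a matrix product. This is an alternative sketch rather than the method given above; it assumes all selected columns are numeric, and idx is a name introduced here for illustration.

```r
# Same values as the question's dd, as a plain data.frame
dd <- data.frame(a = rep(1, 10), b = rep(1, 10), c = rep(1, 10))
weight <- c(1, 5, 10)

idx <- 2:3   # hypothetical: positions of the columns to include (b and c)

# (10 x 2 matrix) %*% (length-2 vector) gives a 10 x 1 matrix of row sums;
# drop() turns that one-column matrix into a plain numeric vector
dd$m <- drop(as.matrix(dd[idx]) %*% weight[idx])
dd$m
```

With every entry equal to 1, each row gives 1 * 5 + 1 * 10 = 15, matching the output shown earlier.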