Variable column names in rowwise dplyr

Question

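There are plenty of similar questions on SO, but they usually involve applying a simple `sum` or `mean`, or the case can be greatly simplified for some other reason. Here, I would like to copy the values of certain columns into a list column, under keys named after those columns. Here is a minimal example: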

Setup

library(dplyr)
library(purrr)
library(rlang)
library(tibble)
library(magrittr)  # provides inset2(), which dplyr does not re-export

keys <- c('a', 'b')

tbl <-
    tibble(
        x = list(
            list(f = 'foo', g = 'bar'),
            list(f = 'bar', g = 'foo')
        ),
        a = list(
            list(i = 1, j = 2),
            list(i = 4, j = 7)
        ),
        b = list(
            list(i = 5, j = 3),
            list(i = 2, j = 9)
        )
    )

This works and does what I want: in the first element of `x`, under the keys `a` and `b`, are the first values of columns `a` and `b`, respectively.

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(x %>% inset2('a', a) %>% inset2('b', b))
    )

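However, suppose we have 12 keys; that means 12 `inset2` calls. It would be better to process them together or loop over them. I tried to do this with `purrr::reduce`, but I could not find a way to access the source columns inside `reduce`.

Iterating over the keys and trying to use them as character indices on `.data`: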

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(
            reduce(
                keys,
                function(lst, key) {
                    inset2(lst, key, .data[[key]])
                },
                .init = x
            )
        )
    )

Error: object 'key' not found
7: quos(..., .ignore_empty = "all")
6: dplyr_quosures(...)
5: force(dots)
4: mutate_cols(.data, dplyr_quosures(...), by)
3: mutate.data.frame(., x = list(reduce(keys, function(lst, key) {
       inset2(lst, key, .data[[key]])
   }, .init = x)))
2: mutate(., x = list(reduce(keys, function(lst, key) {
       inset2(lst, key, .data[[key]])
   }, .init = x)))
1: tbl %>% rowwise %>% mutate(x = list(reduce(keys, function(lst,
       key) {
       inset2(lst, key, .data[[key]])
   }, .init = x)))

The error above occurs only inside `.data`; `key` itself is, as expected, available as a string within the function. I also tried converting `key` to a symbol:

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(
            reduce(
                keys,
                function(lst, key) {
                    inset2(lst, key, !!sym(key))
                },
                .init = x
            )
        )
    )

Error: object 'key' not found
9: is_symbol(x)
8: sym(key)
7: quos(..., .ignore_empty = "all")
6: dplyr_quosures(...)
5: force(dots)
4: mutate_cols(.data, dplyr_quosures(...), by)
3: mutate.data.frame(., x = list(reduce(keys, function(lst, key) {
       inset2(lst, key, !!sym(key))
   }, .init = x)))
2: mutate(., x = list(reduce(keys, function(lst, key) {
       inset2(lst, key, !!sym(key))
   }, .init = x)))
1: tbl %>% rowwise %>% mutate(x = list(reduce(keys, function(lst,
       key) {
       inset2(lst, key, !!sym(key))
   }, .init = x)))
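Both tracebacks end inside `quos()`, that is, while `mutate()` is still capturing its arguments and before the anonymous function has been called. The sketch below reproduces this with rlang alone. It assumes no variable named `key` exists when it starts, and the string "capture failed" is just a placeholder chosen for this illustration. The `.data[[key]]` attempt fails at the same step, which suggests its subscript is resolved equally early, although that is an inference from the traceback rather than something shown here.

```r
library(rlang)

# No `key` exists yet, so unquoting fails while the quosure is being
# captured, before the inner function is ever called.
res <- tryCatch(
  quo(function(key) !!sym(key)),
  error = function(e) "capture failed"
)
res

# With an outer `key` defined, `!!` is resolved at capture time against
# that outer value; the function argument `key` plays no part.
key <- "b"
q <- quo(function(key) !!sym(key))
quo_get_expr(q)
```

In the second call the captured function body is already the fixed symbol `b`, so the value of `key` passed in by `reduce` could never influence it.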

This version runs without error, but assigns the actual symbols (`a`, `b`, ...) to the list, rather than the values from the row:

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(
            reduce(
                keys,
                function(lst, key) {
                    inset2(lst, key, sym(key))
                },
                .init = x
            )
        )
    )

tbl2$x[[1]]$a
a  # this `a` is a symbol
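The result above follows from what `sym()` returns: a name, not the thing it names. A minimal sketch of the difference, using a hypothetical `row` list that stands in for one row of `tbl`:

```r
library(rlang)

# Hypothetical stand-in for the first row's `a` and `b` columns.
row <- list(a = list(i = 1, j = 2), b = list(i = 5, j = 3))

s <- sym("a")
is_symbol(s)                      # the object is only a name

# The value appears only once the symbol is evaluated against some data.
val <- eval_tidy(s, data = row)
str(val)
```

Inside the `reduce` callback nothing performs that evaluation step, so `inset2` stores the bare symbol.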

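Then I tried resolving the keys first and passing the values to the function, although I am not sure what `val` contains below. It runs without error, but all values in `x` end up `NULL`. I think this means that `!!!syms(keys)` returns `NULL`, so `reduce2` performs zero iterations and returns `NULL`.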

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(
            reduce2(
                keys,
                !!!syms(keys),
                function(lst, key, val) {
                    inset2(lst, key, val)
                },
                .init = x
            )
        )
    )

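Finally, I went back to the idea of using `keys` as a character vector and relying on `.data`. Besides, doing the whole operation at once is probably more efficient than moving elements key by key. So I tried to extract all the elements and merge them in with `utils::modifyList`: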

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(modifyList(x, .data[keys]))
    )

Error in `mutate()`:
ℹ In argument: `x = list(modifyList(x, .data[keys]))`.
ℹ In row 1.
Caused by error in `.data[keys]`:
! `[` is not supported by the `.data` pronoun, use `[[` or $ instead.
Run `rlang::last_trace()` to see where the error occurred.
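As the message says, `.data` supports only `[[` and `$`, so it cannot hand back several columns at once. One way to sidestep data masking entirely, offered here as a sketch and not as the solution referred to below, is to select the columns from the tibble by name and walk the rows with `purrr::pmap`. The setup is repeated so the block runs on its own.

```r
library(purrr)
library(tibble)

keys <- c("a", "b")

tbl <- tibble(
  x = list(list(f = "foo", g = "bar"), list(f = "bar", g = "foo")),
  a = list(list(i = 1, j = 2), list(i = 4, j = 7)),
  b = list(list(i = 5, j = 3), list(i = 2, j = 9))
)

# Plain `[` on the tibble accepts a character vector of column names.
# pmap() then calls the function once per row, passing each selected
# column's element as a named argument; `...` collects the key columns.
tbl2 <- tbl
tbl2$x <- pmap(
  tbl[c("x", keys)],
  function(x, ...) modifyList(x, list(...))
)

str(tbl2$x[[1]])
```

Note that `modifyList` merges recursively when a key already exists in `x` as a list, and drops entries whose new value is `NULL`; for the data in this example neither case arises.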

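At this point I found a working solution, which I am posting as an answer. But I think this is an interesting case, and I wonder whether someone can come up with a simple solution that I missed (everything above seems too complicated and ugly to me).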

Answer 1

Here is a possible option using `purrr::reduce` and `purrr::map2`:

library(dplyr, warn.conflicts = FALSE)
library(purrr)
library(tibble)
library(magrittr)

tbl2 <- tbl %>%
  rowwise() %>%
  mutate(
    x = list(x %>% inset2("a", a) %>% inset2("b", b))
  ) |>
  ungroup()

tbl3 <- reduce(
  keys,
  \(x, y) {
    x[["x"]] <- map2(x[["x"]], x[[y]], ~ c(.x, list(.y)) |>
      set_names(c(names(.x), y)))
    x
  },
  .init = tbl
)

identical(tbl2, tbl3)
#> [1] TRUE
