`group_by` changes the column indices used in `across` in dplyr

Question

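This may be intended behavior, but I had not noticed it before. Suppose you have several columns that you want to change at once using dplyr's `across`, and you want to refer to those columns by index. If you group the data frame by a variable, those indices shift.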

Here is what I mean.

Suppose we have this data frame:

library(dplyr)

df = data.frame(
  group = c("a", "a", "b"),
  val1 = 1:3,
  val2 = 2:4,
  val3 = 3:5
) 

df_grouped = df %>% 
  group_by(group)

If we then want to change columns 2 through 4 (val1 through val3), we can do this:

df %>% 
  mutate(across(2:4, ~"changed"))

The result is:

  group    val1    val2    val3
1     a changed changed changed
2     a changed changed changed
3     b changed changed changed

However, when I run the same call on the grouped data frame (`df_grouped`), I get:

Error in `mutate()`:
ℹ In argument: `across(2:4, ~"changed")`.
Caused by error in `across()`:
! Can't subset columns past the end.
ℹ Location 4 doesn't exist.
ℹ There are only 3 columns.

So I have to do this instead:

df_grouped %>% 
  mutate(across(1:3, ~"changed"))

As far as I can tell, it simply drops the grouping column. Is there any way to prevent this?

Answer 1

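What you are seeing is a consequence of using integers as column indices. Although that approach works on ungrouped data, it is not mentioned in `?group_by`: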

     ...: In ‘group_by()’, variables or computations to group by.
          Computations are always done on the ungrouped data frame. To
          perform computations on the grouped data, you need to use a
          separate ‘mutate()’ step before the ‘group_by()’.
          Computations are not allowed in ‘nest_by()’. In ‘ungroup()’,
          variables to remove from the grouping.

nor in the vignettes. The `group_by` documentation does everything by naming variables or computations, never by position.

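The reason is that grouped operations are assumed to act on the mutable columns of the frame, and in `df_grouped` the `group` column is immutable because it defines the current group. As a demonstration, we can check the apparent number of columns in the data like this: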

df |>
  reframe(nc = ncol(pick(everything())))
#   nc
# 1  4
df_grouped |>
  reframe(nc = ncol(pick(everything())))
# # A tibble: 2 × 2
#   group    nc
#   <chr> <int>
# 1 a         3
# 2 b         3

So your use of `2:4` runs past the number of columns that are visible inside the grouped frame.
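To make the shift concrete, here is a small sketch that lists which names column selection can see once the frame is grouped. It assumes dplyr 1.1.0 or later (for `pick()` and `reframe()`); `visible` is only an illustrative name, and the data frame is rebuilt so the block runs on its own.

```r
library(dplyr)

df <- data.frame(
  group = c("a", "a", "b"),
  val1 = 1:3,
  val2 = 2:4,
  val3 = 3:5
)
df_grouped <- group_by(df, group)

# All column names of the grouped frame, grouping column included
names(df_grouped)

# Names left over once the grouping variables are set aside;
# these are the columns that positions inside across() count over
visible <- setdiff(names(df_grouped), group_vars(df_grouped))
visible

# The same names as reported by pick() within each group
reframe(df_grouped, col = names(pick(everything())))
```

Inside the grouped frame, position 1 therefore refers to `val1`, which is why `1:3` works and `2:4` does not.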

Workarounds:

Use variable names, as in:

mutate(df_grouped, across(val1:val3, ~"changed"))

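Call `ungroup` before you mutate. This works here because your operation is fairly benign, but be aware that you lose the by-group logic of whatever you are really doing: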
```
ungroup(df_grouped) |>
  mutate(across(2:4, ~"changed"))
```
A bit of a hack, but if you really do not want to hard-code the column names, you can do this:
```
df_grouped %>%
  mutate(across(all_of(names(.)[2:4]), ~"changed"))
```
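The `.` in the hack above is the magrittr placeholder, which the native `|>` pipe does not provide. A minimal sketch of the same idea without it is to resolve the positions to names before the pipeline starts. Here `cols` and `result` are illustrative names, and the data frame is rebuilt so the block runs on its own.

```r
library(dplyr)

df <- data.frame(
  group = c("a", "a", "b"),
  val1 = 1:3,
  val2 = 2:4,
  val3 = 3:5
)
df_grouped <- group_by(df, group)

# Resolve positions against the full frame, where `group` is column 1
cols <- names(df_grouped)[2:4]

# Select by name, so the grouping no longer affects which columns are hit
result <- df_grouped |>
  mutate(across(all_of(cols), ~ "changed"))
result
```

The frame stays grouped, so any later by-group step still behaves as expected.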
