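I'm curious how to pivot a long data frame into wide format using only the {collapse} package. I like the package's performance, but I sometimes find it hard to use for more intermediate data manipulation (for example, what tidyr::pivot_wider() does).
Here is an example: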
tbl <- tibble::tibble(
  user_id = rep(1:3, each = 5),
  a = rep(paste0("item", 1:5), times = 3),
  b = sample(rnorm(1000), 15)
)
# A tibble: 15 x 3
   user_id a            b
     <int> <chr>    <dbl>
 1       1 item1  0.474
 2       1 item2  0.658
 3       1 item3 -0.609
 4       1 item4 -0.710
 5       1 item5 -0.936
 6       2 item1 -1.06
 7       2 item2 -0.307
 8       2 item3 -1.69
 9       2 item4  0.669
10       2 item5  0.776
11       3 item1 -0.00244
12       3 item2  1.33
13       3 item3 -0.724
14       3 item4 -0.646
15       3 item5  1.69
I'd like to use {collapse} to turn it into this:
tbl |> tidyr::pivot_wider(names_from="a", values_from="b")
# A tibble: 3 x 6
  user_id  item1  item2  item3 item4   item5
    <int>  <dbl>  <dbl>  <dbl> <dbl>   <dbl>
1       1 -0.597  0.672 -0.396 1.44  -0.419
2       2  1.56   0.488 -0.980 0.648 -0.0903
3       3 -0.885 -0.675  0.376 1.02  -0.180
{collapse} 2.0+ gained a very powerful pivot() function:
library(collapse)
#> collapse 2.0.3, see ?`collapse-package` or ?`collapse-documentation`
tbl <- tibble::tibble(
  user_id = rep(1:3, each = 5),
  a = rep(paste0("item", 1:5), times = 3),
  b = sample(rnorm(1000), 15)
)
pivot(tbl, "user_id", "b", "a", how = "w")
#> # A tibble: 3 × 6
#> user_id item1 item2 item3 item4 item5
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 0.429 -0.0629 -2.53 1.38 0.283
#> 2 2 -1.03 -0.0957 1.13 -0.521 0.162
#> 3 3 -0.390 1.97 -1.21 1.49 1.40
# Or leave out the values argument: columns not used as ids or names are pivoted (here only b)
pivot(tbl, "user_id", names = "a", how = "w")
#> # A tibble: 3 × 6
#> user_id item1 item2 item3 item4 item5
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 0.429 -0.0629 -2.53 1.38 0.283
#> 2 2 -1.03 -0.0957 1.13 -0.521 0.162
#> 3 3 -0.390 1.97 -1.21 1.49 1.40
Created on 2023-10-24 with reprex v2.0.2
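Because the example data are drawn with sample(), the outputs shown in the question and in the answer come from different draws and cannot be compared by eye. One way to check that pivot() reproduces the tidyr result is to run both on the same seeded data. This is a minimal sketch, assuming collapse >= 2.0, tibble, and tidyr are installed; the set.seed(1) call and the names wide_collapse, wide_tidyr, and same are illustrative additions, not part of the original post:

```r
library(collapse)

# Fix the seed so both reshapes see identical data (assumption: any seed works)
set.seed(1)
tbl <- tibble::tibble(
  user_id = rep(1:3, each = 5),
  a = rep(paste0("item", 1:5), times = 3),
  b = sample(rnorm(1000), 15)
)

# Wide reshape with collapse, spelling out the argument names
wide_collapse <- pivot(tbl, ids = "user_id", values = "b", names = "a", how = "wider")

# Reference result from tidyr
wide_tidyr <- tidyr::pivot_wider(tbl, names_from = "a", values_from = "b")

# Compare column names and values, ignoring class and other attributes
same <- isTRUE(all.equal(
  as.data.frame(wide_collapse),
  as.data.frame(wide_tidyr),
  check.attributes = FALSE
))
print(same)
```

Spelling out ids, values, and names makes the call easier to read than the positional form, at the cost of a little extra typing.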