Difficulty splitting a tibble with group_split


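I have a function that calculates values, but before running it I need to split the data frame. The dput() output of the data is as follows: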

  structure(list(ID = c("35124-54739-2024-02-24", "35124-54739-2024-02-24", 
  "35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24", 
  "35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24", 
  "35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24", 
  "35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24", 
  "35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24", 
  "35124-54739-2024-02-24"), Book = c("bet365", "bet365", "bet365", 
  "bet365", "bet365", "bet365", "bet365", "bet365", "bet365", "bet365", 
  "bet365", "bet365", "bet365", "bet365", "bet365", "bet365", "bet365", 
  "bet365"), Home = c("Kentucky", "Kentucky", "Kentucky", "Kentucky", 
  "Kentucky", "Kentucky", "Kentucky", "Kentucky", "Kentucky", "Kentucky", 
  "Kentucky", "Kentucky", "Kentucky", "Kentucky", "Kentucky", "Kentucky", 
  "Kentucky", "Kentucky"), Away = c("Alabama", "Alabama", "Alabama", 
  "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", 
  "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", 
  "Alabama", "Alabama", "Alabama"), Team = c("Alabama", "Alabama", 
  "Kentucky", "Kentucky", "Alabama", "Alabama", "Kentucky", "Kentucky", 
  "Alabama", "Kentucky", "Alabama", "Kentucky", "Alabama", "Kentucky", 
  "Alabama", "Kentucky", "Alabama", "Kentucky"), Price = c(110L, 
  130L, -175L, -150L, 100L, 140L, -190L, -140L, -105L, -130L, -110L, 
  -110L, -130L, -105L, -135L, 100L, -140L, 105L), Points = c(1, 
  -1, 1, -1, 1.5, -1.5, 1.5, -1.5, 2, -2, 2.5, -2.5, 3, -3, 3.5, 
  -3.5, 4, -4)), row.names = c(NA, -18L), class = c("tbl_df", "tbl", 
  "data.frame"))

This is the command I am using to attempt the split:

  ncaab_spread_all %>% 
    group_split(ID, Book, abs(Points))

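Ideally, I would like the tibble split into pieces of the following form, where the Team column has one row per team and the Points column holds the matching points value for that team.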

  # A tibble: 2 × 8
  ID                     Book   Home     Away    Team     Price Points `abs(Points)`
  <chr>                  <chr>  <chr>    <chr>   <chr>    <int>  <dbl>         <dbl>
  1 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama   -105      2             2
  2 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky  -130     -2             2

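The problem I am running into is that some teams have more than one Points value with the same absolute value in the same game, as shown below. As you can see, Alabama has both 1 and -1, and Kentucky has both 1 and -1. As a result, all four rows land in one group and the split is not accurate.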

  # A tibble: 4 × 8
  ID                     Book   Home     Away    Team     Price Points `abs(Points)`
  <chr>                  <chr>  <chr>    <chr>   <chr>    <int>  <dbl>         <dbl>
  1 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama    110      1             1
  2 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama    130     -1             1
  3 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky  -175      1             1
  4 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky  -150     -1             1

How can I do the split so that the tibble above is broken up as follows?

  # A tibble: 2 × 8
  ID                     Book   Home     Away    Team     Price Points `abs(Points)`
  <chr>                  <chr>  <chr>    <chr>   <chr>    <int>  <dbl>         <dbl>
  1 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama    110      1             1
  2 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky  -150     -1             1

  # A tibble: 2 × 8
  ID                     Book   Home     Away    Team     Price Points `abs(Points)`
  <chr>                  <chr>  <chr>    <chr>   <chr>    <int>  <dbl>         <dbl>
  1 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama    130     -1             1
  2 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky  -175      1             1

I added abs(Points) while troubleshooting, and it seems to give the best result, except for the case described above.

If there is a better way that does not use group_split, I am completely open to a different approach.

r tidyverse
1 Answer

An alternative approach has been worked out.
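
The details of that alternative are not preserved here. As one possible sketch (not necessarily what the answerer had in mind), the split could use a key that is the same for both sides of a market instead of abs(Points). The code below assumes that the two rows of a pair always carry opposite Points values, and it expresses every line from the home team's perspective in a helper column. The column name home_line and the object name pairs are made up for this illustration, and the data is a six-row subset of the question's tibble.

```r
library(dplyr)

# Six-row subset of the question's data (rows with |Points| of 1 and 2)
ncaab_spread_all <- tibble(
  ID     = "35124-54739-2024-02-24",
  Book   = "bet365",
  Home   = "Kentucky",
  Away   = "Alabama",
  Team   = c("Alabama", "Alabama", "Kentucky", "Kentucky", "Alabama", "Kentucky"),
  Price  = c(110L, 130L, -175L, -150L, -105L, -130L),
  Points = c(1, -1, 1, -1, 2, -2)
)

pairs <- ncaab_spread_all %>%
  # home_line (hypothetical name): the spread seen from the home team's side.
  # Home rows keep their Points; away rows are sign-flipped, so both rows of
  # one market (e.g. Alabama +1 / Kentucky -1) end up with the same key.
  mutate(home_line = if_else(Team == Home, Points, -Points)) %>%
  group_split(ID, Book, home_line)

pairs
```

With this key, Alabama +1 is grouped with Kentucky -1, and Alabama -1 with Kentucky +1, so the four-row group from the question becomes two two-row tibbles. If the function applied afterwards does not expect the extra column, home_line can be dropped from each piece with select(-home_line).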
