I have a function that calculates a value, but before running it I need to split the df. The dput output is below:
structure(list(ID = c("35124-54739-2024-02-24", "35124-54739-2024-02-24",
"35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24",
"35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24",
"35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24",
"35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24",
"35124-54739-2024-02-24", "35124-54739-2024-02-24", "35124-54739-2024-02-24",
"35124-54739-2024-02-24"), Book = c("bet365", "bet365", "bet365",
"bet365", "bet365", "bet365", "bet365", "bet365", "bet365", "bet365",
"bet365", "bet365", "bet365", "bet365", "bet365", "bet365", "bet365",
"bet365"), Home = c("Kentucky", "Kentucky", "Kentucky", "Kentucky",
"Kentucky", "Kentucky", "Kentucky", "Kentucky", "Kentucky", "Kentucky",
"Kentucky", "Kentucky", "Kentucky", "Kentucky", "Kentucky", "Kentucky",
"Kentucky", "Kentucky"), Away = c("Alabama", "Alabama", "Alabama",
"Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Alabama",
"Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Alabama",
"Alabama", "Alabama", "Alabama"), Team = c("Alabama", "Alabama",
"Kentucky", "Kentucky", "Alabama", "Alabama", "Kentucky", "Kentucky",
"Alabama", "Kentucky", "Alabama", "Kentucky", "Alabama", "Kentucky",
"Alabama", "Kentucky", "Alabama", "Kentucky"), Price = c(110L,
130L, -175L, -150L, 100L, 140L, -190L, -140L, -105L, -130L, -110L,
-110L, -130L, -105L, -135L, 100L, -140L, 105L), Points = c(1,
-1, 1, -1, 1.5, -1.5, 1.5, -1.5, 2, -2, 2.5, -2.5, 3, -3, 3.5,
-3.5, 4, -4)), row.names = c(NA, -18L), class = c("tbl_df", "tbl",
"data.frame"))
This is the command I am using to attempt the split:
ncaab_spread_all %>%
group_split(ID, Book, abs(Points))
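Ideally, I would like the tibble split into groups of the following form, where the Team column contains one row per team and the Points column contains the matching points value for each team.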
# A tibble: 2 × 8
ID Book Home Away Team Price Points `abs(Points)`
<chr> <chr> <chr> <chr> <chr> <int> <dbl> <dbl>
1 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama -105 2 2
2 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky -130 -2 2
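The problem I am running into is that some teams have more than one Points value with the same absolute value in the same game, as shown below. As you can see, Alabama has both 1 and -1, and Kentucky has both 1 and -1. As a result, all four rows land in one group and the split is inaccurate.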
# A tibble: 4 × 8
ID Book Home Away Team Price Points `abs(Points)`
<chr> <chr> <chr> <chr> <chr> <int> <dbl> <dbl>
1 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama 110 1 1
2 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama 130 -1 1
3 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky -175 1 1
4 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky -150 -1 1
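To see which groups are affected, one option is to count the rows per grouping key and flag any key with more than two rows. This is only a sketch on a small hand-copied subset of the data above; the column names `abs_points` and `n_rows` are made up for illustration.

```r
library(dplyr)

# Illustrative subset of the question's data: the +/-1 lines and the +/-2 line
ncaab_spread_all <- tibble(
  ID = "35124-54739-2024-02-24",
  Book = "bet365",
  Home = "Kentucky",
  Away = "Alabama",
  Team = c("Alabama", "Alabama", "Kentucky", "Kentucky", "Alabama", "Kentucky"),
  Price = c(110L, 130L, -175L, -150L, -105L, -130L),
  Points = c(1, -1, 1, -1, 2, -2)
)

# Count rows per (ID, Book, abs(Points)); a clean pair has exactly two rows
group_sizes <- ncaab_spread_all %>%
  count(ID, Book, abs_points = abs(Points), name = "n_rows")

# Keys with more than two rows are the ones where abs(Points) merges two pairs
ambiguous <- filter(group_sizes, n_rows > 2)
```

On this subset, `ambiguous` contains only the key with `abs_points` equal to 1, which is the four-row group shown above.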
How can I do the split so that the tibble above is split as follows?
# A tibble: 2 × 8
ID Book Home Away Team Price Points `abs(Points)`
<chr> <chr> <chr> <chr> <chr> <int> <dbl> <dbl>
1 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama 110 1 1
2 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky -150 -1 1
# A tibble: 2 × 8
ID Book Home Away Team Price Points `abs(Points)`
<chr> <chr> <chr> <chr> <chr> <int> <dbl> <dbl>
1 35124-54739-2024-02-24 bet365 Kentucky Alabama Alabama 130 -1 1
2 35124-54739-2024-02-24 bet365 Kentucky Alabama Kentucky -175 1 1
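I added abs(Points) while troubleshooting, and it seems to give the best results, except for the case described above.
If there is a better way that does not use group_split, I am completely open to a different approach.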
An alternative approach has since been worked out.
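One possible approach, sketched here under assumptions rather than as a definitive answer: express every row's spread from a single team's point of view, then split on that value instead of on abs(Points). The sketch assumes that Team always equals either Home or Away, and that the two sides of one line carry opposite-signed Points. The `Line` column is a hypothetical helper name, and the data is a small hand-copied subset of the dput output.

```r
library(dplyr)

# Illustrative subset of the question's data: the +/-1 lines and the +/-2 line
ncaab_spread_all <- tibble(
  ID = "35124-54739-2024-02-24",
  Book = "bet365",
  Home = "Kentucky",
  Away = "Alabama",
  Team = c("Alabama", "Alabama", "Kentucky", "Kentucky", "Alabama", "Kentucky"),
  Price = c(110L, 130L, -175L, -150L, -105L, -130L),
  Points = c(1, -1, 1, -1, 2, -2)
)

paired <- ncaab_spread_all %>%
  # Line (hypothetical helper): the spread from the away team's point of view.
  # Away rows keep their Points; home rows are sign-flipped, so both sides
  # of the same line share one key.
  mutate(Line = if_else(Team == Away, Points, -Points)) %>%
  group_split(ID, Book, Line)
```

With this key, Alabama at 1 is grouped with Kentucky at -1, and Alabama at -1 with Kentucky at 1, so each element of `paired` has two rows. If the extra column is unwanted downstream, it can be dropped with `select(-Line)` inside the function that processes each piece.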