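Can you help me with the following problem? I want to group my DataFrame by team and season, and then get the average number of goals scored up to each match date. I would like to use rolling,
but I don't know how to do it, because every row is different.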
DF:
Date Home Away Season Home_goals Away_goals
1.1.2019 Team 1 Team 2 2019 1 1
2.1.2019 Team 3 Team 4 2019 2 3
3.1.2019 Team 1 Team 3 2019 2 1
2.1.2020 Team 1 Team 4 2020 3 4
4.1.2019 Team 1 Team 5 2019 1 3
Expected output:
Date Home Away Season Home_goals Away_goals Mean_home_goals
1.1.2019 Team 1 Team 2 2019 1 1 1
2.1.2019 Team 3 Team 4 2019 2 3 2
3.1.2019 Team 1 Team 3 2019 2 1 1.5 ((1+2)/2)
2.1.2020 Team 1 Team 4 2020 3 4 3 (it's a new season)
4.1.2019 Team 1 Team 5 2019 1 3 1.33 ((1+2+1)/3)
Thank you.
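If you sort by date, you can group everything by Home and Season and then compute an expanding mean over each group. Unlike a fixed-size rolling window, an expanding window includes every earlier row in the group, which is what "average up to the match date" requires.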
In [327]: df.sort_values("Date").groupby(["Home", "Season"])["Home_goals"].expanding().mean()
Out[327]:
Home Season
Team 1 2019 0 1.000000
2 1.500000
4 1.333333
2020 3 3.000000
Team 3 2019 1 2.000000
Name: Home_goals, dtype: float64
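The result above is indexed by Home, Season and the original row label, so it cannot be assigned to a column directly. Below is a minimal sketch of one way to write it back into the DataFrame. It assumes the dates are in day.month.year format (the sample data does not make this clear), and it parses them first so that the sort is chronological rather than alphabetical.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Date": ["1.1.2019", "2.1.2019", "3.1.2019", "2.1.2020", "4.1.2019"],
    "Home": ["Team 1", "Team 3", "Team 1", "Team 1", "Team 1"],
    "Away": ["Team 2", "Team 4", "Team 3", "Team 4", "Team 5"],
    "Season": [2019, 2019, 2019, 2020, 2019],
    "Home_goals": [1, 2, 2, 3, 1],
    "Away_goals": [1, 3, 1, 4, 3],
})

# Assumption: day.month.year. Parsing avoids sorting the dates as strings.
df["Date"] = pd.to_datetime(df["Date"], format="%d.%m.%Y")

expanding_mean = (
    df.sort_values("Date")
      .groupby(["Home", "Season"])["Home_goals"]
      .expanding()
      .mean()
)

# Drop the two group-key levels so only the original row labels remain,
# then let pandas align the values on that index during assignment.
df["Mean_home_goals"] = expanding_mean.droplevel(["Home", "Season"])
print(df)
```

Because the assignment aligns on the index, the rows of df keep their original order even though the calculation ran on the sorted copy.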
You can also do it like this, dividing each group's cumulative sum by its running count:
groups = df.groupby(['Home','Season'])['Home_goals']
df['Mean_home_goals'] = groups.cumsum()/groups.cumcount().add(1)
Output:
Date Home Away Season Home_goals Away_goals Mean_home_goals
0 1.1.2019 Team 1 Team 2 2019 1 1 1.000000
1 2.1.2019 Team 3 Team 4 2019 2 3 2.000000
2 3.1.2019 Team 1 Team 3 2019 2 1 1.500000
3 2.1.2020 Team 1 Team 4 2020 3 4 3.000000
4 4.1.2019 Team 1 Team 5 2019 1 3 1.333333
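The cumulative-sum approach above follows the row order within each group, so it only gives "average up to the match date" if the rows are already chronological. The sketch below sorts by a parsed date first (assuming day.month.year, which the question does not confirm) and checks that the cumulative version matches an expanding mean computed with transform. The names ordered, via_cumsum and via_transform are illustrative only.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Date": ["1.1.2019", "2.1.2019", "3.1.2019", "2.1.2020", "4.1.2019"],
    "Home": ["Team 1", "Team 3", "Team 1", "Team 1", "Team 1"],
    "Away": ["Team 2", "Team 4", "Team 3", "Team 4", "Team 5"],
    "Season": [2019, 2019, 2019, 2020, 2019],
    "Home_goals": [1, 2, 2, 3, 1],
    "Away_goals": [1, 3, 1, 4, 3],
})

# Sort chronologically on a temporary parsed-date column
# (assumption: the strings are day.month.year).
ordered = df.assign(
    _when=pd.to_datetime(df["Date"], format="%d.%m.%Y")
).sort_values("_when")

groups = ordered.groupby(["Home", "Season"])["Home_goals"]

# Running total divided by the number of matches seen so far in the group
via_cumsum = groups.cumsum() / groups.cumcount().add(1)

# The same quantity expressed as an expanding mean; transform keeps the row index
via_transform = groups.transform(lambda s: s.expanding().mean())

# Both Series carry the original row labels, so assignment aligns correctly
df["Mean_home_goals"] = via_cumsum
print(df)
```

Either Series can be assigned to the column. The cumulative version avoids a Python-level lambda, which may matter on large frames.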