Mean of selected rows in a grouped DataFrame


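Could you help me with the following problem? I want to group my DataFrame by team and season, and then get the average number of goals scored up to each match date. I would like to use rolling, but I don't know how, because the set of rows to average is different for every row.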

DF:

Date      Home   Away    Season  Home_goals  Away_goals         
1.1.2019  Team 1 Team 2  2019    1           1
2.1.2019  Team 3 Team 4  2019    2           3
3.1.2019  Team 1 Team 3  2019    2           1  
2.1.2020  Team 1 Team 4  2020    3           4
4.1.2019  Team 1 Team 5  2019    1           3

Expected output:

Date      Home   Away    Season  Home_goals  Away_goals  Mean_home_goals       
1.1.2019  Team 1 Team 2  2019    1           1           1
2.1.2019  Team 3 Team 4  2019    2           3           2
3.1.2019  Team 1 Team 3  2019    2           1           1.5 ((1+2)/2)
2.1.2020  Team 1 Team 4  2020    3           4           3 (it's a new season)
4.1.2019  Team 1 Team 5  2019    1           3           1.33 ((1+2+1)/3)

Thank you.

pandas group-by apply
Answers
1
vote

If you sort by date, you can group everything by Home and Season and then compute an expanding mean over each group:

In [327]: df.sort_values("Date").groupby(["Home", "Season"])["Home_goals"].expanding().mean()
Out[327]:
Home    Season
Team 1  2019    0    1.000000
                2    1.500000
                4    1.333333
        2020    3    3.000000
Team 3  2019    1    2.000000
Name: Home_goals, dtype: float64
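
The result above is indexed by Home, Season, and the original row number, so it cannot be assigned to the DataFrame directly. The following is a minimal sketch of one way to turn it into a column, not necessarily what the answerer had in mind. It rebuilds the sample data from the question and assumes the dates are in day.month.year form; parsing them first avoids sorting the Date strings lexicographically.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Date": ["1.1.2019", "2.1.2019", "3.1.2019", "2.1.2020", "4.1.2019"],
    "Home": ["Team 1", "Team 3", "Team 1", "Team 1", "Team 1"],
    "Away": ["Team 2", "Team 4", "Team 3", "Team 4", "Team 5"],
    "Season": [2019, 2019, 2019, 2020, 2019],
    "Home_goals": [1, 2, 2, 3, 1],
    "Away_goals": [1, 3, 1, 4, 3],
})

# Assumption: dates are day.month.year. Parsing makes the sort chronological.
df["Date"] = pd.to_datetime(df["Date"], format="%d.%m.%Y")

# Expanding mean of home goals per (Home, Season), in date order
expanding_mean = (
    df.sort_values("Date")
      .groupby(["Home", "Season"])["Home_goals"]
      .expanding()
      .mean()
)

# Drop the group-key levels so only the original row index remains;
# the assignment then aligns each value with its original row.
df["Mean_home_goals"] = expanding_mean.reset_index(level=["Home", "Season"], drop=True)

print(df)
```

Because the assignment aligns on the index, the original row order of df is preserved even though the mean was computed on a date-sorted copy.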

1
vote

You can do it like this:

# Running mean = cumulative sum / number of rows seen so far in the group.
# Note: this follows the current row order, so df should already be in date order.
groups = df.groupby(['Home','Season'])['Home_goals']
df['Mean_home_goals'] = groups.cumsum()/groups.cumcount().add(1)

Output:

       Date    Home    Away  Season  Home_goals  Away_goals  Mean_home_goals
0  1.1.2019  Team 1  Team 2    2019           1           1         1.000000
1  2.1.2019  Team 3  Team 4    2019           2           3         2.000000
2  3.1.2019  Team 1  Team 3    2019           2           1         1.500000
3  2.1.2020  Team 1  Team 4    2020           3           4         3.000000
4  4.1.2019  Team 1  Team 5    2019           1           3         1.333333
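
The cumulative sum divided by the cumulative count is the same quantity as an expanding mean, provided the rows within each group are in chronological order. The sketch below checks that equivalence on the question's sample data. The `_when` helper column and the day.month.year date format are assumptions made for illustration only.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Date": ["1.1.2019", "2.1.2019", "3.1.2019", "2.1.2020", "4.1.2019"],
    "Home": ["Team 1", "Team 3", "Team 1", "Team 1", "Team 1"],
    "Away": ["Team 2", "Team 4", "Team 3", "Team 4", "Team 5"],
    "Season": [2019, 2019, 2019, 2020, 2019],
    "Home_goals": [1, 2, 2, 3, 1],
    "Away_goals": [1, 3, 1, 4, 3],
})

# Assumption: dates are day.month.year. Sort chronologically first, because
# cumsum and cumcount simply follow the row order within each group.
df = df.assign(_when=pd.to_datetime(df["Date"], format="%d.%m.%Y")).sort_values("_when")

groups = df.groupby(["Home", "Season"])["Home_goals"]

# Approach from the answer: running total divided by running count
running_mean = groups.cumsum() / groups.cumcount().add(1)

# Equivalent formulation: expanding mean applied within each group
via_transform = groups.transform(lambda s: s.expanding().mean())

# Both Series keep the original row index, so they can be compared directly
print((running_mean - via_transform).abs().max())

df["Mean_home_goals"] = running_mean
df = df.drop(columns="_when").sort_index()  # restore the original row order
print(df)
```

The cumsum/cumcount form avoids a Python-level lambda, which tends to matter on large frames; the transform form states the intent more directly.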