Calculating running win streaks in a DataFrame

Question  Votes: 0  Answers: 2

I have a set of sports match data in the following form:

import pandas as pd

winner = ['A', 'A', 'B', 'C', 'A', 'C', 'C', 'B']
loser =  ['B', 'C', 'A', 'A', 'B', 'A', 'B', 'C']
P1 =     ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B']
P2 =     ['B', 'C', 'B', 'C', 'B', 'C', 'C', 'C']
P1_win = [ 1, 1, 0, 0, 1, 0, 0, 0]

df = pd.DataFrame({'winner': winner, 'loser': loser, 'P1':P1, 'P2':P2, 'P1_win':P1_win})
df

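I want to compute the running win streaks for P1 and P2. However, when I do this, the streak does not reset when P1_win == 0.

The code I use to compute the streaks is: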

condition = df.P1_win.eq(0)
df['Reset'] = condition.groupby(df.P1_win).cumsum() #reset need to be 0. If P_win == 0, reset the line
df['P1_win_Streak'] = df.P1_win.mask(condition, 0).groupby([df.winner, df.Reset]).cumsum()

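What happens is that whenever a streak ends, a 0 is correctly written to the streak column, but the next streak then continues from its previous value instead of restarting at 1. The same thing happens in my actual dataset.

Any help resolving this would be greatly appreciated!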

python pandas jupyter-notebook data-science
2 Answers
1
vote

This may be what you want:

# Create a variable to indicate consecutive groups
groups = (df['P1_win'] != df['P1_win'].shift(1)).cumsum()

# Reset the count when encountering a zero value
df['count'] = df.groupby(groups)['P1_win'].cumsum()

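This gives a count column that grows while P1_win stays at 1 and drops back to 0 whenever P1_win is 0.

Hope this helps.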
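As a self-contained check of the approach above, here is a minimal sketch that applies it to the P1 and P1_win columns from the question. Note one assumption worth stating: the grouping looks only at consecutive values of P1_win, so it counts wins for whoever occupies the P1 column in each row, not for an individual player.

```python
import pandas as pd

# Sample data taken from the question (only the columns this approach uses)
df = pd.DataFrame({
    'P1': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B'],
    'P1_win': [1, 1, 0, 0, 1, 0, 0, 0],
})

# A new group starts whenever P1_win differs from the previous row
groups = (df['P1_win'] != df['P1_win'].shift(1)).cumsum()

# Cumulative sum inside each run: runs of 1s count up, runs of 0s stay at 0
df['count'] = df.groupby(groups)['P1_win'].cumsum()

print(df['count'].tolist())  # [1, 2, 0, 0, 1, 0, 0, 0]
```

If the P1 column switches to a different player in the middle of a run of wins, this count would carry over to the new player, so a per-player streak needs the winner and loser columns as well.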


1
vote

This seems difficult to vectorize, but the following may work for smaller (<1 million row) DataFrames (you could also convert to NumPy with df.to_numpy before iterating to make it faster):

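import pandas as pd

def calc_streaks(df: pd.DataFrame) -> pd.DataFrame:
    # Current streak of each player, updated match by match
    player_streaks = {}
    p1_streak = []
    p2_streak = []

    for _, row in df.iterrows():
        # The loser's streak resets; the winner's streak grows by one
        player_streaks[row['loser']] = 0
        player_streaks[row['winner']] = player_streaks.get(row['winner'], 0) + 1

        # Record both players' streaks after this match
        p1_streak.append(player_streaks[row['P1']])
        p2_streak.append(player_streaks[row['P2']])

    # Note: the columns are added to the input DataFrame in place
    df['P1_streak'] = p1_streak
    df['P2_streak'] = p2_streak
    return df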

Output:

  winner loser P1 P2  P1_win  P1_streak  P2_streak
0      A     B  A  B       1          1          0
1      A     C  A  C       1          2          0
2      B     A  A  B       0          0          1
3      C     A  A  C       0          0          1
4      A     B  A  B       1          1          0
5      C     A  A  C       0          0          2
6      C     B  B  C       0          0          3
7      B     C  B  C       0          1          0
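
The suggestion to convert to NumPy before iterating could look roughly like the sketch below. The function name calc_streaks_fast is hypothetical, and returning a copy instead of modifying the input is a choice made for this example rather than part of the answer above. No timing is claimed here; iterating over a plain array simply avoids building a Series for every row, which is what iterrows does.

```python
import pandas as pd

def calc_streaks_fast(df: pd.DataFrame) -> pd.DataFrame:
    """Same per-player logic as calc_streaks, looping over a NumPy array."""
    player_streaks = {}
    p1_streak, p2_streak = [], []

    # One 2-D object array; each row unpacks into the four names below
    rows = df[['winner', 'loser', 'P1', 'P2']].to_numpy()
    for winner, loser, p1, p2 in rows:
        player_streaks[loser] = 0
        player_streaks[winner] = player_streaks.get(winner, 0) + 1
        p1_streak.append(player_streaks.get(p1, 0))
        p2_streak.append(player_streaks.get(p2, 0))

    out = df.copy()  # leave the caller's DataFrame untouched
    out['P1_streak'] = p1_streak
    out['P2_streak'] = p2_streak
    return out

# Sample data from the question
df = pd.DataFrame({
    'winner': ['A', 'A', 'B', 'C', 'A', 'C', 'C', 'B'],
    'loser':  ['B', 'C', 'A', 'A', 'B', 'A', 'B', 'C'],
    'P1':     ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B'],
    'P2':     ['B', 'C', 'B', 'C', 'B', 'C', 'C', 'C'],
})
result = calc_streaks_fast(df)
print(result[['P1_streak', 'P2_streak']])
```

On the sample data this produces the same P1_streak and P2_streak columns as the table above.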