python: Vectorized def only applies to the first condition. Subsequent loops have no effect

Question (votes: 0, answers: 1)

I have a vectorized function:

    def selection_update_weights(df):
        # Define the selections for 'Win'
        selections_win = ["W & O 2.5 (both untested)", "Win (untested) & O 2.5", "Win & O 2.5 (untested)", "W & O 2.5", 
                          "W & O 1.5 (both untested)", "Win (untested) & O 1.5", "Win & O 1.5 (untested)", "W & O 1.5", 
                          "W & U 4.5 (both untested)", "Win (untested) & U 4.5", "Win & U 4.5 (untested)", "W & U 4.5", 
                          "W (untested)", "W"]
    
        # Create a boolean mask for the condition for 'Win'
        mask_win = (df['selection_match'] == "no match") & \
                   (df['selection'].isin(selections_win)) & \
                   (df['result_match'] == "no match") & \
                   (df['result'] != 'draw')
    
        # Apply the condition and update the 'Win' column
        df.loc[mask_win, 'Win'] = df.loc[mask_win, 'predicted_score_difference'] + 0.02
    
        # Define the selections for 'DNB'
        selections_DNB = ["DNB or O 2.5 (both untested)", "DNB (untested) or O 2.5", "DNB or O 2.5 (untested)",
                          "DNB or O 2.5", "DNB or O 1.5 (both untested)", "DNB (untested) or O 1.5", 
                          "DNB or O 1.5 (untested)", "DNB or O 1.5", "DNB (untested)", "DNB"]
    
        # Create a boolean mask for the condition for 'DNB'
        mask_DNB = ((df['selection_match'] == 'no match') & \
                    (df['selection'].isin(selections_DNB)) & \
                    (df['result_match'] == 'no match') & \
                    (df['result'] != 'draw'))
    
        # Apply the condition and update the 'DNB' column
        df.loc[mask_DNB, 'DNB'] = df.loc[mask_DNB, 'predicted_score_difference'] + 0.02
    
        # Define the selections for 'O 1.5'
        selections_O_1_5 = ["W & O 1.5 (both untested)", "Win (untested) & O 1.5", "Win & O 1.5 (untested)",
                            "W & O 1.5", "DNB or O 1.5 (both untested)", "DNB (untested) or O 1.5", 
                            "DNB or O 1.5 (untested)", "DNB or O 1.5", "O 1.5 (untested)", "O 1.5"]
    
        # Create a boolean mask for the condition for 'O 1.5'
        mask_O_1_5 = ((df['selection_match'] == 'no match') & \
                    (df['selection'].isin(selections_O_1_5)) & \
                    (df['total_score'] < 2))
    
        # Apply the condition and update the 'O 1.5' column
        df.loc[mask_O_1_5, 'O_1_5'] = df.loc[mask_O_1_5, 'predicted_total_score'] + 0.02
    
        # Define the selections for 'O 2.5'
        selections_O_2_5 = ["W & O 2.5 (both untested)", "Win (untested) & O 2.5", "Win & O 2.5 (untested)", 
                            "W & O 2.5", "DNB or O 2.5 (both untested)", "DNB (untested) or O 2.5",
                            "DNB or O 2.5 (untested)", "DNB or O 2.5", "O 2.5 (untested)", "O 2.5"]
    
        # Create a boolean mask for the condition for 'O 2.5'
        mask_O_2_5 = ((df['selection_match'] == 'no match') & \
                    (df['selection'].isin(selections_O_2_5)) & \
                    (df['total_score'] < 3))
    
        # Apply the condition and update the 'O 2.5' column
        df.loc[mask_O_2_5, 'O_2_5'] = df.loc[mask_O_2_5, 'predicted_total_score'] + 0.02
    
        # Define the selections for 'U 4.5'
        selections_U_4_5 = ["W & U 4.5 (both untested)", "Win (untested) & U 4.5", "Win & U 4.5 (untested)",
                            "W & U 4.5", "U 4.5 (untested)", "U 4.5"]
    
        # Create a boolean mask for the condition for 'U 4.5'
        mask_U_4_5 = ((df['selection_match'] == 'no match') & \
                    (df['selection'].isin(selections_U_4_5)) & \
                    (df['total_score'] > 4))
    
        # Apply the condition and update the 'U_4_5' column
        df.loc[mask_U_4_5, 'U_4_5'] = df.loc[mask_U_4_5, 'predicted_total_score'] - 0.02
    
        return df

I apply it like this:

    df = selection_update_weights(df)

The first run works:

        home_score  away_score  total_score  score_difference  predicted_total_score  predicted_score_difference result predicted_result result_match  Win  DNB  O_1_5  O_2_5  U_4_5                  selection selection_match
    3            2           0            2                 2              12.370528                   12.090888   home             home        match  1.1  0.7      2      3      4  W & O 2.5 (both untested)        no match
    9            2           0            2                 2              11.439416                   10.291339   home             home        match  1.1  0.7      2      3      4  W & O 2.5 (both untested)        no match
    10           2           0            2                 2              11.226599                   10.228954   home             home        match  1.1  0.7      2      3      4  W & O 2.5 (both untested)        no match
    11           1           5            6                 4              12.069979                   10.194557   away             home     no match  1.1  0.7      2      3      4  W & O 2.5 (both untested)        no match
    20           2           0            2                 2               9.808659                    9.049657   home             home        match  1.1  0.7      2      3      4  W & O 2.5 (both untested)        no match

When I run the function:

        home_score  away_score  total_score  score_difference  predicted_total_score  predicted_score_difference result predicted_result result_match       Win  DNB     O_1_5     O_2_5  U_4_5                  selection selection_match
    44           3           3            6                 0               8.748172                    8.135116   draw             home     no match  8.155116  0.7  2.000000  3.000000    4.0  W & O 2.5 (both untested)        no match
    50           1           0            1                 1               8.605350                    7.932909   home             home        match  1.100000  0.7  8.625350  8.625350    4.0  W & O 1.5 (both untested)        no match
    57           1           1            2                 0               7.510030                    7.750101   draw             home     no match  7.770101  0.7  2.000000  7.530030    4.0  W & O 1.5 (both untested)        no match
    62           0           1            1                 1               8.895045                    7.710740   away             away        match  1.100000  0.7  8.915045  8.915045    4.0  W & O 1.5 (both untested)        no match
    85           1           0            1                 1               8.099853                    7.444815   home             home        match  1.100000  0.7  8.119853  8.119853    4.0  W & O 1.5 (both untested)        no match

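But subsequent loops produce no change.

My dataframe is very large, but this snippet shows where the weights are not updated. The weights are only partially updated, and I don't know why.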

df.head():

        home_score  away_score  total_score  score_difference  predicted_total_score  predicted_score_difference result predicted_result result_match  Win  DNB  O_1_5     O_2_5  U_4_5                  selection selection_match
    44           3           3            6                 0               8.748172                    8.135116   draw             home     no match  1.1  0.7    2.0  3.000000    4.0  W & O 2.5 (both untested)        no match
    50           1           0            1                 1               8.605350                    7.932909   home             home        match  1.1  0.7    2.0  8.625350    4.0  W & O 1.5 (both untested)        no match
    57           1           1            2                 0               7.510030                    7.750101   draw             home     no match  1.1  0.7    2.0  7.530030    4.0  W & O 1.5 (both untested)        no match
    62           0           1            1                 1               8.895045                    7.710740   away             away        match  1.1  0.7    2.0  8.915045    4.0  W & O 1.5 (both untested)        no match
    85           1           0            1                 1               8.099853                    7.444815   home             home        match  1.1  0.7    2.0  8.119853    4.0  W & O 1.5 (both untested)        no match

So when I apply it:

    df = selection_update_weights(df)

Ideally I should get:

    home_score  away_score  total_score  score_difference  predicted_total_score  predicted_score_difference result predicted_result result_match       Win  DNB    O_1_5    O_2_5    U_4_5                       selection  selection_match
              3           3            6                 0               8.748172                    8.135116   draw             home     no match  8.155116  0.7       2.0         3      4.0      W & O 2.5 (both untested)        no match
              1           0            1                 1               8.605350                    7.932909   home             home        match  1.100000  0.7  8.625350  8.625350      4.0      W & O 1.5 (both untested)        no match
              1           1            2                 0               7.510030                    7.750101   draw             home     no match  7.770101  0.7       2.0  7.530030      4.0      W & O 1.5 (both untested)        no match
              0           1            1                 1               8.895045                    7.710740   away             away        match  1.100000  0.7  8.915045  8.915045      4.0      W & O 1.5 (both untested)        no match
              1           0            1                 1               8.099853                    7.444815   home             home        match  1.100000  0.7  8.119853  8.119853      4.0      W & O 1.5 (both untested)        no match

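However, that does not happen, and the original dataframe is unaffected.

Would it help to break this into separate if-else checks? The dataframe is very large, though, and row-wise computation takes 20 minutes.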

python dataframe vectorization
1 Answer
Votes: 0

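In the second loop, only the Win and O_2_5 columns are updated. Win is updated as a function of the predicted_score_difference value, and O_2_5 is updated as a function of the predicted_total_score value.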

predicted_score_difference and predicted_total_score are never changed inside selection_update_weights, so within that function they can be treated as constants. Because you call the function repeatedly and each call sets the weight columns from those constants, the columns never take any other value.
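To make that concrete, here is a minimal sketch using made-up data and a single simplified rule (update_win is a hypothetical stand-in for selection_update_weights, not the asker's full function). Calling it a second time returns exactly the same frame, because the right-hand side of the assignment reads only a column that the function never modifies.

```python
import pandas as pd

def update_win(df):
    # Hypothetical one-rule stand-in for selection_update_weights.
    mask = (df["selection_match"] == "no match") & (df["result"] != "draw")
    # The new value depends only on predicted_score_difference,
    # which this function never changes.
    df.loc[mask, "Win"] = df.loc[mask, "predicted_score_difference"] + 0.02
    return df

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "selection_match": ["no match", "no match", "match"],
    "result": ["home", "draw", "away"],
    "predicted_score_difference": [7.5, 8.0, 9.0],
    "Win": [1.1, 1.1, 1.1],
})

first = update_win(df.copy())
second = update_win(first.copy())

print(first["Win"].tolist())
print(first.equals(second))  # the second call changes nothing
```

Only the first row matches the mask; the draw row and the row whose selection_match is "match" keep their original weight on every call.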

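I'm not sure why you call selection_update_weights multiple times, but perhaps you should update Win, O_2_5 (and the remaining columns) from their own current values, or update predicted_score_difference and predicted_total_score inside selection_update_weights.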
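If the intent is for each call to move the weight a little further, one option is to adjust the column from its own current value. The sketch below assumes that is what you want, which the question does not confirm; bump_win, its step parameter, and the data are all hypothetical.

```python
import pandas as pd

def bump_win(df, step=0.02):
    # Hypothetical variant: nudge the existing weight instead of
    # overwriting it from predicted_score_difference.
    mask = (df["selection_match"] == "no match") & (df["result"] != "draw")
    df.loc[mask, "Win"] += step
    return df

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "selection_match": ["no match", "no match"],
    "result": ["home", "draw"],
    "Win": [1.1, 1.1],
})

# Each pass now changes the matching rows again.
for _ in range(3):
    df = bump_win(df)

print(df["Win"].round(2).tolist())
```

With this form, repeated calls accumulate on the rows that match the mask, while non-matching rows (here the draw) stay at their starting weight.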
