How do I replace values in the original DataFrame while looping over its rows?

Question (votes: 0, answers: 1)

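I know that looping over the rows of a df is bad practice, but I have a column of a few hundred rows, each holding a comma-separated list, and I need to modify every element of each list. I'm having a hard time handling all the extra whitespace and so on with .str.replace()/.strip(). Here is the input: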

import pandas as pd

input_19 = [{'name':'Hector', 'team_position':'forward', 'player_traits':'Finesse Shot, Speed Dribbler (CPU AI Only)'}, {'name':'Bysim', 'team_position':'forward', 'player_traits':'Long Shot Taker (CPU AI Only)'}, {'name':'Nicolas', 'team_position':'defender', 'player_traits':'Beat Offside Trap, Finesse Shot'}]

input_20 = [{'name':'Johann', 'team_position':'gk', 'player_traits':'GK Long Throw'}, {'name':'Winston', 'team_position':'defender', 'player_traits':'Dives Into Tackles (CPU AI Only)'}, {'name':'Petr', 'team_position':'forward', 'player_traits':'Flair, Long Shot Taker (CPU AI Only)'}]

df_19 = pd.DataFrame(input_19)
df_20 = pd.DataFrame(input_20)

Output:

df_19:

    name     player_traits                               team_position
0   Hector   Finesse Shot, Speed Dribbler (CPU AI Only)  forward
1   Bysim    Long Shot Taker (CPU AI Only)               forward
2   Nicolas  Beat Offside Trap , Finesse Shot            defender

df_20:

    name     player_traits                               team_position
0   Johann   GK Long Throw                               gk
1   Winston  Dives Into Tackles (CPU AI Only)            defender
2   Petr     Flair,  Long Shot Taker (CPU AI Only)       forward

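As shown above, the 'player_traits' column in both dfs needs string cleanup so that I can count how often each trait occurs. I want to make the changes in the original dfs (one per year), so that I can then create new dfs by filtering on 'team_position' and use Counter to get the total for each trait/element. Here is my code, but I'm not sure how to assign the new 'temp_list' to the right place in the original df, because .loc combined with .replace() modifies a slice of the data frame, and .replace() on dfs only accepts string arguments: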

df_list = [df_19, df_20]

for df in df_list:
    for lst,i in zip(df['player_traits'].values, range(len(df['player_traits'].values))):
        temp_list = []
        if type(lst) != float:
            lst = lst.replace('(CPU AI Only)',"")
            lst = lst.split(",")
            for x in lst:
                x = x.strip()
                temp_list.append(x)
        # df[location of original value in original df] = temp_list
        # something like:
        # df[i, 'player_traits'] = temp_list

How can I complete this code so that the modified lists replace the original df values?
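One possible way to fill in the placeholder while keeping the explicit loop is to collect the cleaned lists per row and assign the whole column back once per DataFrame. This is only a sketch under that assumption, not necessarily the approach the asker ended up using. The `cleaned_rows` name is introduced here for illustration, and the `isinstance` check stands in for the original `type(lst) != float` test for missing values.

```python
import pandas as pd

input_19 = [
    {'name': 'Hector', 'team_position': 'forward', 'player_traits': 'Finesse Shot, Speed Dribbler (CPU AI Only)'},
    {'name': 'Bysim', 'team_position': 'forward', 'player_traits': 'Long Shot Taker (CPU AI Only)'},
    {'name': 'Nicolas', 'team_position': 'defender', 'player_traits': 'Beat Offside Trap, Finesse Shot'},
]
input_20 = [
    {'name': 'Johann', 'team_position': 'gk', 'player_traits': 'GK Long Throw'},
    {'name': 'Winston', 'team_position': 'defender', 'player_traits': 'Dives Into Tackles (CPU AI Only)'},
    {'name': 'Petr', 'team_position': 'forward', 'player_traits': 'Flair, Long Shot Taker (CPU AI Only)'},
]

df_19 = pd.DataFrame(input_19)
df_20 = pd.DataFrame(input_20)

for df in [df_19, df_20]:
    cleaned_rows = []  # hypothetical helper list: one cleaned list per row
    for value in df['player_traits']:
        temp_list = []
        if isinstance(value, str):  # missing values (NaN) are floats and stay as []
            value = value.replace('(CPU AI Only)', '')
            temp_list = [x.strip() for x in value.split(',')]
        cleaned_rows.append(temp_list)
    # Column assignment changes the DataFrame object itself, so df_19 and
    # df_20 see the update; reusing the index keeps rows aligned.
    df['player_traits'] = pd.Series(cleaned_rows, index=df.index)

print(df_19)
print(df_20)
```

Assigning the column in one step avoids writing list values cell by cell, which is where indexers such as .loc tend to treat a list as several values rather than one.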

python pandas dataframe replace
1 answer (0 votes)
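Move the per-value cleanup into a function and assign the result of .apply() back to the column:

df['player_traits'] = df['player_traits'].apply(my_function)

Full example: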

import pandas as pd

def my_function(lst):
    # Missing values show up as NaN, which is a float: return an empty list
    if isinstance(lst, float):
        return []
    # Drop the "(CPU AI Only)" marker, split on commas, trim each trait
    lst = lst.replace('(CPU AI Only)', '')
    return [x.strip() for x in lst.split(',')]


input_19 = [{'name':'Hector', 'team_position':'forward', 'player_traits':'Finesse Shot, Speed Dribbler (CPU AI Only)'}, {'name':'Bysim', 'team_position':'forward', 'player_traits':'Long Shot Taker (CPU AI Only)'}, {'name':'Nicolas', 'team_position':'defender', 'player_traits':'Beat Offside Trap, Finesse Shot'}]
input_20 = [{'name':'Johann', 'team_position':'gk', 'player_traits':'GK Long Throw'}, {'name':'Winston', 'team_position':'defender', 'player_traits':'Dives Into Tackles (CPU AI Only)'}, {'name':'Petr', 'team_position':'forward', 'player_traits':'Flair, Long Shot Taker (CPU AI Only)'}]

df_19 = pd.DataFrame(input_19)
df_20 = pd.DataFrame(input_20)

df_list = [df_19, df_20]

for df in df_list:
    df['player_traits'] = df['player_traits'].apply(my_function)

print(df_19)
print(df_20)
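Once the column holds lists, the counting step described in the question (filter on 'team_position', then total each trait with Counter) might look like the sketch below. The `count_traits` helper and the inline sample frame are assumptions made for illustration. The sample mirrors what df_19 would contain after the cleanup above.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample: df_19 as it would look after the cleanup step
df_19 = pd.DataFrame({
    'name': ['Hector', 'Bysim', 'Nicolas'],
    'team_position': ['forward', 'forward', 'defender'],
    'player_traits': [
        ['Finesse Shot', 'Speed Dribbler'],
        ['Long Shot Taker'],
        ['Beat Offside Trap', 'Finesse Shot'],
    ],
})


def count_traits(df, position):
    """Count how often each trait appears among players in one position."""
    rows = df.loc[df['team_position'] == position, 'player_traits']
    # Flatten the per-player lists into a single stream of traits
    return Counter(trait for traits in rows for trait in traits)


forward_counts = count_traits(df_19, 'forward')
defender_counts = count_traits(df_19, 'defender')
print(forward_counts)
print(defender_counts)
```

Because each cell is a plain Python list, a nested generator is enough to flatten the column; no intermediate DataFrame per position is needed unless one is wanted for other reasons.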