I have a tennis dataset that looks like this:
import pandas as pd

tourney_id = ['French Open 2018','French Open 2018','Wimbledon 2018','Wimbledon 2018','Australian Open 2019','Australian Open 2019','US Open 2019','US Open 2019']
player_name = ['Novak Djokovic','Roger Federer','Andy Murray','Rafael Nadal','John Isner','Novak Djokovic','Andy Murray','Roger Federer']
match_num = [103, 103, 217, 217, 104, 104, 243, 243]
df = pd.DataFrame(list(zip(tourney_id, player_name, match_num)),
                  columns=['TournamentID', 'Name', 'MatchID'])
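I want to create a dictionary where each key is a player and each value is the list of players they have faced (their opponents). Based on my dataset, it would look like this: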
{'Novak Djokovic': ['Roger Federer','John Isner'],
'Roger Federer': ['Novak Djokovic','Andy Murray'],
'Andy Murray': ['Rafael Nadal','Roger Federer'],
'Rafael Nadal': ['Andy Murray'],
'John Isner': ['Novak Djokovic']}
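Two players have played each other when their rows share the same TournamentID and MatchID values, and that is how I want to identify opponents.
The last thing I tried was: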
df.set_index(['TournamentID','MatchID'])['Name'].to_dict()
But that isn't what I want.
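For reference, here is a small sketch of what that attempt returns on the sample data. As far as I can tell, the keys come out as (TournamentID, MatchID) tuples rather than player names, and because each key occurs twice, the second player of every match overwrites the first. The variable name `attempt` is only used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'TournamentID': ['French Open 2018', 'French Open 2018',
                     'Wimbledon 2018', 'Wimbledon 2018',
                     'Australian Open 2019', 'Australian Open 2019',
                     'US Open 2019', 'US Open 2019'],
    'Name': ['Novak Djokovic', 'Roger Federer', 'Andy Murray', 'Rafael Nadal',
             'John Isner', 'Novak Djokovic', 'Andy Murray', 'Roger Federer'],
    'MatchID': [103, 103, 217, 217, 104, 104, 243, 243],
})

# Duplicate index entries collapse into a single dictionary key,
# so only the last player listed for each match survives.
attempt = df.set_index(['TournamentID', 'MatchID'])['Name'].to_dict()
print(attempt)
```

So the result has one entry per match, not one entry per player.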
Can someone point me in the right direction?
Thanks!
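pairs = df.merge(df, on=['TournamentID', 'MatchID'])  # pair each row with every row of the same match
pairs = pairs[pairs['Name_x'] != pairs['Name_y']]  # drop each player's pairing with themselves
opponents_dict = pairs.groupby('Name_x', sort=False)['Name_y'].agg(list).to_dict()
print(opponents_dict)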
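A minimal sketch of one way to get the player-keyed dictionary the question asks for, assuming every match is identified by its (TournamentID, MatchID) pair: group the rows by that pair, then record every other player in the same group as an opponent. The name `opponents` is just illustrative, and the DataFrame is rebuilt here so the snippet runs on its own.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame({
    'TournamentID': ['French Open 2018', 'French Open 2018',
                     'Wimbledon 2018', 'Wimbledon 2018',
                     'Australian Open 2019', 'Australian Open 2019',
                     'US Open 2019', 'US Open 2019'],
    'Name': ['Novak Djokovic', 'Roger Federer', 'Andy Murray', 'Rafael Nadal',
             'John Isner', 'Novak Djokovic', 'Andy Murray', 'Roger Federer'],
    'MatchID': [103, 103, 217, 217, 104, 104, 243, 243],
})

opponents = defaultdict(list)

# sort=False keeps the matches in the order they first appear in df.
for _, names in df.groupby(['TournamentID', 'MatchID'], sort=False)['Name']:
    players = names.tolist()
    for player in players:
        # Everyone else in the same match is an opponent of `player`.
        opponents[player].extend(p for p in players if p != player)

opponents = dict(opponents)
print(opponents)
```

On the sample data this should produce the dictionary shown in the question. A plain loop is used instead of a single comprehension because each player can appear in several matches, so their opponent list has to be built up across groups.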