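I have the following data definition for football games:
import datetime
from collections import defaultdict, namedtuple
Game = namedtuple('Game', ['Date', 'Home', 'Away', 'HomeShots', 'AwayShots',
                           'HomeBT', 'AwayBT', 'HomeCrosses', 'AwayCrosses',
                           'HomeCorners', 'AwayCorners', 'HomeGoals',
                           'AwayGoals', 'HomeXG', 'AwayXG'])
Here are some examples: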
[Game(Date=datetime.date(2018, 10, 21), Home='Everton', Away='Crystal Palace', HomeShots='21', AwayShots='6', HomeBT='22', AwayBT='13', HomeCrosses='21', AwayCrosses='14', HomeCorners='10', AwayCorners='5', HomeGoals='2', AwayGoals='0', HomeXG='1.93', AwayXG='1.5'),
Game(Date=datetime.date(2019, 2, 27), Home='Man City', Away='West Ham', HomeShots='20', AwayShots='2', HomeBT='51', AwayBT='6', HomeCrosses='34', AwayCrosses='5', HomeCorners='12', AwayCorners='2', HomeGoals='1', AwayGoals='0', HomeXG='3.68', AwayXG='0.4'),
Game(Date=datetime.date(2019, 2, 9), Home='Fulham', Away='Man Utd', HomeShots='12', AwayShots='15', HomeBT='19', AwayBT='38', HomeCrosses='20', AwayCrosses='12', HomeCorners='5', AwayCorners='4', HomeGoals='0', AwayGoals='3', HomeXG='2.19', AwayXG='2.13'),
Game(Date=datetime.date(2019, 3, 9), Home='Southampton', Away='Tottenham', HomeShots='12', AwayShots='15', HomeBT='13', AwayBT='17', HomeCrosses='15', AwayCrosses='15', HomeCorners='1', AwayCorners='10', HomeGoals='2', AwayGoals='1', HomeXG='2.08', AwayXG='1.27'),
Game(Date=datetime.date(2018, 9, 22), Home='Man Utd', Away='Wolverhampton', HomeShots='16', AwayShots='11', HomeBT='17', AwayBT='17', HomeCrosses='26', AwayCrosses='13', HomeCorners='5', AwayCorners='4', HomeGoals='1', AwayGoals='1', HomeXG='0.62', AwayXG='1.12')]
I also have two nearly identical functions that calculate the home and away stats for a given team.
def calculate_home_stats(team, games):
    """
    Calculates home stats for the given team.
    """
    home_stats = defaultdict(float)
    home_stats['HomeShotsFor'] = sum(int(game.HomeShots) for game in games if game.Home == team)
    home_stats['HomeShotsAgainst'] = sum(int(game.AwayShots) for game in games if game.Home == team)
    home_stats['HomeBoxTouchesFor'] = sum(int(game.HomeBT) for game in games if game.Home == team)
    home_stats['HomeBoxTouchesAgainst'] = sum(int(game.AwayBT) for game in games if game.Home == team)
    home_stats['HomeCrossesFor'] = sum(int(game.HomeCrosses) for game in games if game.Home == team)
    home_stats['HomeCrossesAgainst'] = sum(int(game.AwayCrosses) for game in games if game.Home == team)
    home_stats['HomeCornersFor'] = sum(int(game.HomeCorners) for game in games if game.Home == team)
    home_stats['HomeCornersAgainst'] = sum(int(game.AwayCorners) for game in games if game.Home == team)
    home_stats['HomeGoalsFor'] = sum(int(game.HomeGoals) for game in games if game.Home == team)
    home_stats['HomeGoalsAgainst'] = sum(int(game.AwayGoals) for game in games if game.Home == team)
    home_stats['HomeXGoalsFor'] = sum(float(game.HomeXG) for game in games if game.Home == team)
    home_stats['HomeXGoalsAgainst'] = sum(float(game.AwayXG) for game in games if game.Home == team)
    home_stats['HomeGames'] = sum(1 for game in games if game.Home == team)
    return home_stats
def calculate_away_stats(team, games):
    """
    Calculates away stats for the given team.
    """
    away_stats = defaultdict(float)
    away_stats['AwayShotsFor'] = sum(int(game.AwayShots) for game in games if game.Away == team)
    away_stats['AwayShotsAgainst'] = sum(int(game.HomeShots) for game in games if game.Away == team)
    away_stats['AwayBoxTouchesFor'] = sum(int(game.AwayBT) for game in games if game.Away == team)
    away_stats['AwayBoxTouchesAgainst'] = sum(int(game.HomeBT) for game in games if game.Away == team)
    away_stats['AwayCrossesFor'] = sum(int(game.AwayCrosses) for game in games if game.Away == team)
    away_stats['AwayCrossesAgainst'] = sum(int(game.HomeCrosses) for game in games if game.Away == team)
    away_stats['AwayCornersFor'] = sum(int(game.AwayCorners) for game in games if game.Away == team)
    away_stats['AwayCornersAgainst'] = sum(int(game.HomeCorners) for game in games if game.Away == team)
    away_stats['AwayGoalsFor'] = sum(int(game.AwayGoals) for game in games if game.Away == team)
    away_stats['AwayGoalsAgainst'] = sum(int(game.HomeGoals) for game in games if game.Away == team)
    away_stats['AwayXGoalsFor'] = sum(float(game.AwayXG) for game in games if game.Away == team)
    away_stats['AwayXGoalsAgainst'] = sum(float(game.HomeXG) for game in games if game.Away == team)
    away_stats['AwayGames'] = sum(1 for game in games if game.Away == team)
    return away_stats
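I'd like to know whether there is a way to abstract these two functions and merge them into one, without building a wall of if/else statements to determine whether the team played at home or away and which fields should be summed.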
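With a cleaner data structure, you can write simpler code. In this case, your data already contains duplication (for example, you have both HomeShots and AwayShots).
There are many possible structures that would answer this. I'll present a solution that doesn't stray too far from your original structure.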
from collections import namedtuple
from datetime import datetime
Statistics = namedtuple('Statistics', ['shots', 'BT', 'crosses', 'corners', 'goals', 'XG'])
Game = namedtuple('Game', ['home', 'away', 'date', 'home_stats', 'away_stats'])
You can use it like this (I haven't included all the statistics here, only a few as examples):
def calculate_stats(games, away=False):
    def team_stats(game, field_name):
        # Pick the home or away block of statistics, then read one field from it.
        if away:
            stats = game.away_stats
        else:
            stats = game.home_stats
        return stats._asdict()[field_name]
    def sum_on_field(field_name):
        return sum(team_stats(g, field_name) for g in games)
    return {f: sum_on_field(f) for f in Statistics._fields}
This can then be used to get both the home and the away stats:
example_game_1 = Game(
    home='Burnley',
    away='Arsenal',
    date=datetime.now(),
    home_stats=Statistics(shots=12, BT=26, crosses=21, corners=4, goals=1, XG=1.73),
    away_stats=Statistics(shots=17, BT=26, crosses=22, corners=5, goals=3, XG=2.87),
)
example_game_2 = Game(
    home='Burnley',
    away='Arsenal',
    date=datetime.now(),
    home_stats=Statistics(shots=1, BT=1, crosses=1, corners=1, goals=1, XG=1),
    away_stats=Statistics(shots=2, BT=2, crosses=2, corners=2, goals=2, XG=2),
)
home_stats = calculate_stats([example_game_1, example_game_2])
away_stats = calculate_stats([example_game_1, example_game_2], away=True)
print(home_stats)
print(away_stats)
Which prints:
{'shots': 13, 'BT': 27, 'crosses': 22, 'corners': 5, 'goals': 2, 'XG': 2.73}
{'shots': 19, 'BT': 28, 'crosses': 24, 'corners': 7, 'goals': 5, 'XG': 4.87}
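The function above sums over every game it is given. If you also need the per-team filter and the for/against split from the question, the same structure can be extended. This is only a rough sketch: the name `calculate_team_stats`, the `_for`/`_against` key suffixes, and the second example game (placeholder numbers) are assumptions made for illustration, not part of the original code.

```python
from collections import namedtuple
from datetime import date

Statistics = namedtuple('Statistics', ['shots', 'BT', 'crosses', 'corners', 'goals', 'XG'])
Game = namedtuple('Game', ['home', 'away', 'date', 'home_stats', 'away_stats'])


def calculate_team_stats(team, games, away=False):
    """Sum one team's home (or away) stats, split into for/against.

    Hypothetical helper: the name and the key suffixes are illustrative.
    """
    # Decide once which side we are looking at; no per-field if/else needed.
    side, other = ('away', 'home') if away else ('home', 'away')

    totals = {'games': 0}
    for field in Statistics._fields:
        totals[field + '_for'] = 0
        totals[field + '_against'] = 0

    for game in games:
        if getattr(game, side) != team:
            continue  # the team did not play on this side in this game
        own = getattr(game, side + '_stats')
        opponent = getattr(game, other + '_stats')
        for field in Statistics._fields:
            totals[field + '_for'] += getattr(own, field)
            totals[field + '_against'] += getattr(opponent, field)
        totals['games'] += 1
    return totals


games = [
    Game('Burnley', 'Arsenal', date.today(),
         Statistics(shots=12, BT=26, crosses=21, corners=4, goals=1, XG=1.73),
         Statistics(shots=17, BT=26, crosses=22, corners=5, goals=3, XG=2.87)),
    # Placeholder numbers, only here to show that the filter works.
    Game('Arsenal', 'Burnley', date.today(),
         Statistics(shots=1, BT=1, crosses=1, corners=1, goals=1, XG=1),
         Statistics(shots=2, BT=2, crosses=2, corners=2, goals=2, XG=2)),
]

burnley_home = calculate_team_stats('Burnley', games)
burnley_away = calculate_team_stats('Burnley', games, away=True)
print(burnley_home['shots_for'], burnley_home['shots_against'])
print(burnley_away['shots_for'], burnley_away['shots_against'])
```

The only home/away decision is made once, when choosing `side` and `other`; everything after that is driven by `Statistics._fields`, so adding a new statistic does not require touching the function.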
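When working with this kind of data, it is usually best to use a dedicated tool such as pandas. An interactive tool such as JupyterLab can also be very handy.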
For instance, I'd suggest not using named tuples at all, but plain tuples together with dictionaries:
game=(datetime.date(2019, 5, 12), 'Burnley', 'Arsenal', '12', '17', '26', '26', '21', '22', '4', '5', '1', '3', '1.73', '2.87')
and a pair of mapping dictionaries:
numtostr={0: 'Date', 1: 'Home', 2: 'Away', 3: 'HomeShots', 4: 'AwayShots', 5: 'HomeBT', 6: 'AwayBT', 7: 'HomeCrosses', 8: 'AwayCrosses', 9: 'HomeCorners', 10: 'AwayCorners', 11: 'HomeGoals', 12: 'AwayGoals', 13: 'HomeXG', 14: 'AwayXG'}
strtonum={'Date': 0, 'Home': 1, 'Away': 2, 'HomeShots': 3, 'AwayShots': 4, 'HomeBT': 5, 'AwayBT': 6, 'HomeCrosses': 7, 'AwayCrosses': 8, 'HomeCorners': 9, 'AwayCorners': 10, 'HomeGoals': 11, 'AwayGoals': 12, 'HomeXG': 13, 'AwayXG': 14}
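Make mapping dictionaries for home_stats and away_stats as well ({0: 'HomeShotsFor', 1: 'HomeShotsAgainst', and so on} for home_stats). To explain how the mapping dictionaries work: if you want a game's HomeCrosses, you can use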
game[7]
or
game[strtonum['HomeCrosses']]
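Typing both dictionaries by hand makes it easy to drop an entry or let the two drift apart. As a small sketch, both mappings can be derived from a single ordered list of field names; `FIELDS` is a name introduced here for illustration and is not part of the original answer.

```python
import datetime

# Field names in tuple order (hypothetical constant, introduced for this sketch).
FIELDS = ['Date', 'Home', 'Away', 'HomeShots', 'AwayShots', 'HomeBT', 'AwayBT',
          'HomeCrosses', 'AwayCrosses', 'HomeCorners', 'AwayCorners',
          'HomeGoals', 'AwayGoals', 'HomeXG', 'AwayXG']

# Both directions come from the same list, so they cannot disagree.
numtostr = dict(enumerate(FIELDS))
strtonum = {name: index for index, name in enumerate(FIELDS)}

game = (datetime.date(2019, 5, 12), 'Burnley', 'Arsenal', '12', '17', '26', '26',
        '21', '22', '4', '5', '1', '3', '1.73', '2.87')

# Same element as game[7]; the values are still strings at this point.
print(game[strtonum['HomeCrosses']])
```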
Then the functions:
def calculate_home_stats(team, games):
    home_stats = [0] * 13
    for game in games:
        if game[1] == team:  # index 1 is the home team
            for index in range(12):
                # Sum everything except date, home, and away, which are the first 3 indices.
                # The values are stored as strings, so convert them before adding.
                home_stats[index] += float(game[index + 3])
            home_stats[12] += 1  # number of home games
    return home_stats
def calculate_away_stats(team, games):
    away_stats = [0] * 13
    for game in games:
        if game[2] == team:  # index 2 is the away team
            for index in range(12):
                away_stats[index] += float(game[index + 3])
            away_stats[12] += 1  # number of away games
    return away_stats
If you really want to merge the two functions into one, you can do this:
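def calculate_stats(team, games, homeaway):
    stats = [0] * 13
    team_index = {'Home': 1, 'Away': 2}[homeaway]  # which tuple position holds the team to match
    for game in games:
        if game[team_index] == team:
            for index in range(12):
                stats[index] += float(game[index + 3])  # values are stored as strings
            stats[12] += 1  # number of games counted
    return stats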
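As with my two functions above, the only thing that changes is the index used to check for home or away, rather than a pile of extra if/else statements that would each need editing.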
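To make that concrete, here is a small self-contained sketch of calling a combined function and labelling the totals by name. It assumes the string values are converted with float() before summing and that the function returns its list; `STAT_NAMES` and `labelled` are names introduced here for illustration.

```python
import datetime

# Names for tuple positions 3..14, in order (hypothetical helper constant).
STAT_NAMES = ['HomeShots', 'AwayShots', 'HomeBT', 'AwayBT', 'HomeCrosses',
              'AwayCrosses', 'HomeCorners', 'AwayCorners', 'HomeGoals',
              'AwayGoals', 'HomeXG', 'AwayXG']


def calculate_stats(team, games, homeaway):
    """Sum the 12 stat columns over the team's home or away games."""
    stats = [0] * 13
    team_index = {'Home': 1, 'Away': 2}[homeaway]
    for game in games:
        if game[team_index] == team:
            for index in range(12):
                stats[index] += float(game[index + 3])  # values are stored as strings
            stats[12] += 1  # number of games counted
    return stats


games = [
    (datetime.date(2019, 5, 12), 'Burnley', 'Arsenal', '12', '17', '26', '26',
     '21', '22', '4', '5', '1', '3', '1.73', '2.87'),
]

totals = calculate_stats('Arsenal', games, 'Away')
# Pair each total with its column name; the last slot is the game count.
labelled = dict(zip(STAT_NAMES + ['Games'], totals))
print(labelled['AwayShots'], labelled['Games'])
```

Note that the column names stay in the tuple's order, so for an away team the "for" values are the `Away...` columns and the "against" values are the `Home...` columns.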