My dataset looks like this:
[{'Date': '22-Aug-2019', 'Open': 10905.3, 'High': 10908.25, 'Low': 10718.3, 'Close': 10741.35, 'Shares Traded': 668193449, 'Turnover (Rs. Cr)': 18764.38},
 {'Date': '23-Aug-2019', 'Open': 10699.6, 'High': 10862.55, 'Low': 10637.15, 'Close': 10829.35, 'Shares Traded': 667079625, 'Turnover (Rs. Cr)': 20983.75},
 {'Date': '26-Aug-2019', 'Open': 11000.3, 'High': 11070.3, 'Low': 10756.55, 'Close': 11057.85, 'Shares Traded': 684141923, 'Turnover (Rs. Cr)': 22375.99}]
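As output, I want the average, minimum, and maximum turnover for each day of the week in this dataset. This is what I have so far: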
import datetime

# This snippet is the body of a function (not shown) that returns the result.
day_wise = {}
for share in dataset:
    day_name = datetime.datetime.strptime(share['Date'], "%d-%b-%Y").strftime('%A')
    if day_name not in day_wise.keys():
        day_wise[day_name] = {'avg': 0, 'min': 9999999999, 'max': 0}
        if share['Turnover (Rs. Cr)'] > day_wise[day_name]['max']:
            day_wise[day_name]['max'] = share['Turnover (Rs. Cr)']
        if share['Turnover (Rs. Cr)'] < day_wise[day_name]['min']:
            day_wise[day_name]['min'] = share['Turnover (Rs. Cr)']
        day_wise[day_name]['avg'] += share['Turnover (Rs. Cr)']
    else:
        if share['Turnover (Rs. Cr)'] > day_wise[day_name]['max']:
            day_wise[day_name]['max'] = share['Turnover (Rs. Cr)']
        if share['Turnover (Rs. Cr)'] < day_wise[day_name]['min']:
            day_wise[day_name]['min'] = share['Turnover (Rs. Cr)']
        day_wise[day_name]['avg'] += share['Turnover (Rs. Cr)']
# Note: 'avg' only accumulates a running total here; it is never divided by a count.
return Response(day_wise)
But I would like to optimize it: fewer lines of code and better performance.
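If you can use pandas, try this.
Load the list of dictionaries into a pandas DataFrame, derive the weekday name from Date, group by it,
aggregate the numeric column, and then convert the DataFrame back to a dictionary with to_dict.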
>>> import pandas as pd
>>> df = pd.DataFrame(dataset)  # dataset: the list of dicts from the question
>>> df
          Date     Open      High       Low     Close  Shares Traded  Turnover (Rs. Cr)
0  22-Aug-2019  10905.3  10908.25  10718.30  10741.35      668193449           18764.38
1  23-Aug-2019  10699.6  10862.55  10637.15  10829.35      667079625           20983.75
2  26-Aug-2019  11000.3  11070.30  10756.55  11057.85      684141923           22375.99
>>> df['Day'] = pd.to_datetime(df['Date'], format="%d-%b-%Y").dt.strftime("%A")
>>> df_g = df.groupby('Day')['Turnover (Rs. Cr)'].agg(['min', 'max', 'mean'])
>>> df_g
               min       max      mean
Day
Friday    20983.75  20983.75  20983.75
Monday    22375.99  22375.99  22375.99
Thursday  18764.38  18764.38  18764.38
>>> df_g.to_dict(orient='index')
{'Friday': {'max': 20983.75, 'mean': 20983.75, 'min': 20983.75},
'Monday': {'max': 22375.99, 'mean': 22375.99, 'min': 22375.99},
'Thursday': {'max': 18764.38, 'mean': 18764.38, 'min': 18764.38}}
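The steps in this answer can also be bundled into one function, which may be easier to call from a view. This is only a sketch: the function name `weekday_turnover_stats` and its `column` parameter are made up for illustration, and renaming `mean` to `avg` is an assumption about the key names the question wants.

```python
import pandas as pd


def weekday_turnover_stats(records, column="Turnover (Rs. Cr)"):
    """Return {weekday: {'avg': ..., 'min': ..., 'max': ...}} for one column.

    `records` is assumed to be a list of dicts shaped like the question's data.
    """
    df = pd.DataFrame(records)
    # Derive the weekday name (e.g. "Thursday") from the date string.
    day = pd.to_datetime(df["Date"], format="%d-%b-%Y").dt.strftime("%A")
    stats = (
        df.groupby(day)[column]
        .agg(["mean", "min", "max"])
        .rename(columns={"mean": "avg"})  # match the key used in the question
    )
    return stats.to_dict(orient="index")


# Trimmed copy of the question's rows (only the columns used here).
dataset = [
    {"Date": "22-Aug-2019", "Turnover (Rs. Cr)": 18764.38},
    {"Date": "23-Aug-2019", "Turnover (Rs. Cr)": 20983.75},
    {"Date": "26-Aug-2019", "Turnover (Rs. Cr)": 22375.99},
]

result = weekday_turnover_stats(dataset)
print(result)
```

With only one row per weekday in the sample, the average, minimum, and maximum coincide for each day, just as in the output above.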
A simple and elegant way to cut down the code is to use a DataFrame. Like this:
import pandas as pd

data = pd.DataFrame(dataset)

# Keep only the year part of the date to use as the key in the
# day_wise dict (the question's code keys on the weekday name instead):
data["Date"] = data["Date"].apply(lambda x: x.split('-')[-1])

day_wise = {}
for name, group in data.groupby('Date'):
    day_wise[name] = {
        # Running total, as in the question's code; use .mean() for a true average.
        "avg": group["Turnover (Rs. Cr)"].sum(),
        "min": group["Turnover (Rs. Cr)"].min(),
        "max": group["Turnover (Rs. Cr)"].max()
    }

>>> day_wise
{'2019': {'avg': 62124.12000000001, 'min': 18764.38, 'max': 22375.99}}
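Both answers above rely on pandas. If adding that dependency is not an option, the same per-weekday summary can be sketched with the standard library alone. The function name `turnover_by_weekday` and its `key` parameter are hypothetical, and this version computes a true mean for `avg` rather than the running total in the question's code.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def turnover_by_weekday(records, key="Turnover (Rs. Cr)"):
    """Group one numeric field by weekday name and summarise each group."""
    grouped = defaultdict(list)
    for row in records:
        # %A gives the full weekday name, e.g. "Thursday" (locale-dependent).
        day = datetime.strptime(row["Date"], "%d-%b-%Y").strftime("%A")
        grouped[day].append(row[key])
    # Compute the three statistics once per weekday, after the single pass.
    return {
        day: {"avg": mean(values), "min": min(values), "max": max(values)}
        for day, values in grouped.items()
    }


# Trimmed copy of the question's rows (only the fields used here).
dataset = [
    {"Date": "22-Aug-2019", "Turnover (Rs. Cr)": 18764.38},
    {"Date": "23-Aug-2019", "Turnover (Rs. Cr)": 20983.75},
    {"Date": "26-Aug-2019", "Turnover (Rs. Cr)": 22375.99},
]

result = turnover_by_weekday(dataset)
print(result)
```

Collecting the values per weekday first removes the duplicated if/else branches and the sentinel values for `min` and `max`, at the cost of holding each group's values in memory.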