I want to group by show, count the corresponding VideoIDs, sum the views, and sum the revenue. How can I do this?
My table looks like this:
show      views  revenue  VideoID
batman    100    10       v1
batman    200    20       v2
joker     100    10       v3
joker     300    15       v4
superman  500    30       v5
My expected output is:
show      total_views  total_revenue  video_count
batman    300          30             2
joker     400          25             2
superman  500          30             1
How can I achieve this?
This is what I have tried so far, but the output is wrong:
from collections import defaultdict
import pandas as pd

def grouping_series(df_series):
    t = defaultdict(list)
    gp = df_series.groupby('show')
    for i, k in gp:
        t['total_views'].append(k['views'].sum())
        t['total_revenue'].append(k['revenue'].sum())
        t['video_count'].append(k['VideoID'].count())
    return pd.DataFrame(t)

df = grouping_series(df_series)
We usually use agg with named aggregation for this:
s = df.groupby('show').agg(total_views=('views', 'sum'),
                           total_revenue=('revenue', 'sum'),
                           video_count=('VideoID', 'count')).reset_index()
       show  total_views  total_revenue  video_count
0    batman        300.0           30.0            2
1     joker        400.0           25.0            2
2  superman        500.0           30.0            1
Here is my suggestion:
import pandas as pd

frame = {
    "show": ["batman", "batman", "joker", "joker", "superman"],
    "views": [100, 200, 100, 300, 500],
    "revenue": [10, 20, 10, 15, 30],
    "VideoID": ["v1", "v2", "v3", "v4", "v5"],
}
df = pd.DataFrame(frame)

# Sum views and revenue per show; count distinct VideoIDs per show
aggregations = {"views": "sum", "revenue": "sum", "VideoID": "nunique"}
df.groupby(["show"]).agg(aggregations)
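Two details may be worth checking against the expected output. A dict passed to agg keeps the original column names (views, revenue, VideoID), so a rename is needed to get total_views, total_revenue, and video_count. Also, "count" and "nunique" only differ when a VideoID repeats within a show. The sketch below illustrates both, using the sample data from the question; the duplicated row and the names summary, dup, and counts are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "show": ["batman", "batman", "joker", "joker", "superman"],
    "views": [100, 200, 100, 300, 500],
    "revenue": [10, 20, 10, 15, 30],
    "VideoID": ["v1", "v2", "v3", "v4", "v5"],
})

# Dict-style agg keeps the source column names, so rename them afterwards
# and move "show" from the index back into a regular column.
summary = (
    df.groupby("show")
      .agg({"views": "sum", "revenue": "sum", "VideoID": "nunique"})
      .rename(columns={
          "views": "total_views",
          "revenue": "total_revenue",
          "VideoID": "video_count",
      })
      .reset_index()
)
print(summary)

# Hypothetical case: the first row (batman, v1) appears twice.
# "count" counts non-null rows, "nunique" counts distinct VideoIDs.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)
counts = dup.groupby("show")["VideoID"].agg(["count", "nunique"])
print(counts)
```

If every VideoID is guaranteed to appear once per show, both aggregations give the same video_count; otherwise pick the one that matches what "video count" should mean for your data.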