pandas groupby: sum and average at the same time [duplicate]

Question

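I have a DataFrame containing a list of processes and the time each one took, as follows

I would like to get the following result

I know how to use groupby to get one aggregated column, but only one at a time. This is how I solved the problem: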

import pandas as pd

# the data
ps    = ['p1','p2','p3','p4','p2','p2','p3','p6','p2','p4','p5','p6']
times = [20,10,2,3,4,5,6,3,4,5,6,7]
processes = pd.DataFrame({'ps':ps,'time':times})

# the series (grouped by the 'ps' column defined above)
dfsum   = processes.groupby('ps')['time'].sum()
dfcount = processes.groupby('ps')['time'].count()

# "building" the df result
frame = {'total_time': dfsum, 'total_nr': dfcount}
dfresult = pd.DataFrame(frame)
dfresult['average'] = dfresult['total_time'] / dfresult['total_nr']
dfresult

But how can I get the desired DataFrame without assembling it column by column? To me, this approach is not "pandonic" enough (nor Pythonic).

Answer 1

processes.groupby('ps').agg(TOTAL_TIME=('time','sum'),AVERAGE=('time','mean'),NRTIMES=('time','size'))
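For reference, here is a self-contained sketch of the call above, run against the sample data from the question. It uses named aggregation, which needs pandas 0.25 or later. The variable name `result` is introduced here only for illustration.

```python
import pandas as pd

# Sample data copied from the question
ps = ['p1', 'p2', 'p3', 'p4', 'p2', 'p2', 'p3', 'p6', 'p2', 'p4', 'p5', 'p6']
times = [20, 10, 2, 3, 4, 5, 6, 3, 4, 5, 6, 7]
processes = pd.DataFrame({'ps': ps, 'time': times})

# Named aggregation: output_column=(input_column, aggregation)
result = processes.groupby('ps').agg(
    TOTAL_TIME=('time', 'sum'),   # total time per process
    AVERAGE=('time', 'mean'),     # mean time per process
    NRTIMES=('time', 'size'),     # number of rows per process
)
print(result)
```

Each keyword becomes an output column name, so the result needs no renaming afterwards.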

Answer 2

Try groupby.agg():

processes.groupby('ps')['time'].agg(['sum','mean','count'])

Output for the sample data:

    sum   mean  count
ps                   
p1   20  20.00      1
p2   23   5.75      4
p3    8   4.00      2
p4    8   4.00      2
p5    6   6.00      1
p6   10   5.00      2
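One detail worth knowing: Answer 1 counts rows with 'size' while Answer 2 uses 'count'. They agree on the sample data, but 'count' skips missing values and 'size' does not. The sketch below shows the difference on a small made-up frame; `demo` and `counts` are hypothetical names, and the data is not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: process p1 has one missing time value
demo = pd.DataFrame({
    'ps': ['p1', 'p1', 'p2'],
    'time': [1.0, np.nan, 3.0],
})

# 'count' ignores NaN entries, 'size' counts every row in the group
counts = demo.groupby('ps')['time'].agg(['count', 'size'])
print(counts)
```

If the time column can contain NaN, pick whichever of the two matches the intended meaning of "number of runs".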
