I have the following data:
import pandas as pd

inputdata = [[1, 'long', 30.2, 'Win'], [1, 'long', -12.4, 'Loss'],
             [2, 'short', -12.3, 'Loss'], [1, 'long', 3.2, 'Win'],
             [3, 'short', 0.0, 'B/E'], [3, 'short', 23.2, 'Win'],
             [3, 'long', 3.2, 'Win'], [4, 'short', -4.2, 'Loss']]
datadf = pd.DataFrame(columns=['AssetId', 'Direction', 'PnL', 'W_L'], data=inputdata)
datadf
   AssetId Direction   PnL   W_L
0        1      long  30.2   Win
1        1      long -12.4  Loss
2        2     short -12.3  Loss
3        1      long   3.2   Win
4        3     short   0.0   B/E
5        3     short  23.2   Win
6        3      long   3.2   Win
7        4     short  -4.2  Loss
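Now I want to aggregate this further into a new DataFrame that looks like this (only a few sample rows are shown; more statistics are to be added):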
     Stat  Long  Short  Total
0  Trades     4      4      8
1     Won     3      1      4
2    Lost     1      2      3
(...)
I tried:
datadf.groupby(['Direction'])['PnL'].count()
Direction
long     4
short    4
Name: PnL, dtype: int64
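This produces the numbers I need, but I would have to fill the aggregate DataFrame field by field, which seems cumbersome, and I am not even sure how to write an exact value into a given row and column. Based on this example, is there a better way to achieve this?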
Use pivot_table:
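# Count trades per outcome (rows) and direction (columns); droplevel removes the "PnL" column level
res = pd.pivot_table(datadf.iloc[:, 1:], index=["W_L"], columns=["Direction"], aggfunc="count").droplevel(0, axis=1)
res["total"] = res.sum(axis=1)
# Add a "Trades" row holding the column totals
res.loc["Trades"] = res.sum()
print(res)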
Direction  long  short  total
W_L
B/E         NaN    1.0    1.0
Loss        1.0    2.0    3.0
Win         3.0    1.0    4.0
Trades      4.0    4.0    8.0
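The NaN in the B/E row appears because there is no long break-even trade, and that missing value is also what turns the counts into floats. If integer counts are preferred, one possible variation (a sketch, not the answerer's exact code) is to pass fill_value=0 and name the counted column with values="PnL", which also makes droplevel unnecessary. The astype(int) call is a defensive assumption in case the filled table is not already integer-typed.

```python
import pandas as pd

inputdata = [[1, 'long', 30.2, 'Win'], [1, 'long', -12.4, 'Loss'],
             [2, 'short', -12.3, 'Loss'], [1, 'long', 3.2, 'Win'],
             [3, 'short', 0.0, 'B/E'], [3, 'short', 23.2, 'Win'],
             [3, 'long', 3.2, 'Win'], [4, 'short', -4.2, 'Loss']]
datadf = pd.DataFrame(inputdata, columns=['AssetId', 'Direction', 'PnL', 'W_L'])

# Count PnL entries per (W_L, Direction); empty combinations become 0 instead of NaN
res = pd.pivot_table(datadf, values="PnL", index="W_L", columns="Direction",
                     aggfunc="count", fill_value=0).astype(int)

# Row totals first, then a "Trades" row with the column totals
res["total"] = res.sum(axis=1)
res.loc["Trades"] = res.sum()
print(res)
```

The order matters here: the "total" column has to exist before the "Trades" row is added, otherwise the bottom-right cell would be missing.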
You can use crosstab:
pd.crosstab(datadf['W_L'], datadf['Direction'], margins=True, margins_name='Total')
Output:
Direction  long  short  Total
W_L
B/E           0      1      1
Loss          1      2      3
Win           3      1      4
Total         4      4      8
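The crosstab result holds the same numbers as the table requested in the question, only with different labels and row order. A minimal sketch of one way to reshape it into the Stat / Long / Short / Total layout follows; the label mapping (Total to Trades, Win to Won, Loss to Lost) is an assumption read off the question's sample rows, and the B/E row is dropped only because the sample does not show it.

```python
import pandas as pd

inputdata = [[1, 'long', 30.2, 'Win'], [1, 'long', -12.4, 'Loss'],
             [2, 'short', -12.3, 'Loss'], [1, 'long', 3.2, 'Win'],
             [3, 'short', 0.0, 'B/E'], [3, 'short', 23.2, 'Win'],
             [3, 'long', 3.2, 'Win'], [4, 'short', -4.2, 'Loss']]
datadf = pd.DataFrame(inputdata, columns=['AssetId', 'Direction', 'PnL', 'W_L'])

# Outcome counts per direction, with row and column totals
ct = pd.crosstab(datadf['W_L'], datadf['Direction'],
                 margins=True, margins_name='Total')

# Relabel rows and columns to match the layout in the question (assumed mapping)
stats = ct.rename(index={'Total': 'Trades', 'Win': 'Won', 'Loss': 'Lost'},
                  columns={'long': 'Long', 'short': 'Short'})

# Keep only the rows shown in the question, in that order
stats = stats.reindex(['Trades', 'Won', 'Lost'])

# Turn the row labels into a regular "Stat" column
stats.index.name = 'Stat'
stats.columns.name = None
stats = stats.reset_index()
print(stats)
```

Further statistics could be added by appending more labels to the reindex list, as long as they exist as rows in the crosstab.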