(Duplicate) margins=True does not summarize a pandas pivot_table correctly


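For the code snippet and result below, margins=True gives me an average of the two values in each row, so the summary looks miscalculated. Is there an explanation?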

# Print the mean weekly_sales by department and type; fill missing values with 0s; sum all rows and cols

print(sales.pivot_table(values="weekly_sales", index="department", columns="type", fill_value=0,margins=True))

The result is:

type                A           B        All
department                                  
1           30961.725   44050.627  32052.467
2           67600.159  112958.527  71380.023
3           17160.003   30580.655  18278.391
4           44285.399   51219.654  44863.254
5           34821.011   63236.875  37189.000
...               ...         ...        ...
96          21367.043    9528.538  20337.608
97          28471.267    5828.873  26584.401
98          12875.423     217.428  11820.590
99            379.124       0.000    379.124
All         23674.667   25696.678  23843.950

I searched the whole internet and found that there is a duplicate topic on this site.

python pandas
1 answer

It depends on what you need:

#sample data
print (sales)
   weekly_sales  department type
0             4           1    A
1             7           2    A
2             4           2    B
3             3           1    B
4             1           1    A
5             2           2    A
6             3           2    B
7             5           1    B
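
To run the snippets in this answer, the sample frame printed above can be rebuilt as follows. This is just a reconstruction from the printed output; the column order and dtypes are assumed from how the frame is displayed.

```python
import pandas as pd

# Rebuild the sample frame shown above (values copied from the printed output).
sales = pd.DataFrame({
    "weekly_sales": [4, 7, 4, 3, 1, 2, 3, 5],
    "department":   [1, 2, 2, 1, 1, 2, 2, 1],
    "type":         ["A", "A", "B", "B", "A", "A", "B", "B"],
})
print(sales)
```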

Default aggregation - mean

print(sales.pivot_table(values="weekly_sales", 
                        index="department", 
                        columns="type", 
                        fill_value=0,
                        margins=True))
type          A     B    All
department                  
1           2.5  4.00  3.250
2           4.5  3.50  4.000
All         3.5  3.75  3.625
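
One likely reason the `All` column in the question is not the plain average of the A and B cells: as far as I understand, with `margins=True` pandas applies the aggregation function to the underlying rows, not to the already-aggregated cells. When the groups have different sizes, the margin is therefore a weighted mean. The sketch below uses a small, deliberately unbalanced frame (`demo` is hypothetical data, not from the question) to show the difference.

```python
import pandas as pd

# Hypothetical, deliberately unbalanced data:
# department 1 has three type-A rows but only one type-B row.
demo = pd.DataFrame({
    "weekly_sales": [10, 20, 30, 100],
    "department":   [1, 1, 1, 1],
    "type":         ["A", "A", "A", "B"],
})

table = demo.pivot_table(values="weekly_sales", index="department",
                         columns="type", fill_value=0, margins=True)

# What margins=True reports for department 1.
margin_all = table.loc[1, "All"]
# Mean over the four raw rows of department 1.
mean_of_rows = demo["weekly_sales"].mean()
# Unweighted mean of the two displayed cell means (A and B).
mean_of_cells = (table.loc[1, "A"] + table.loc[1, "B"]) / 2

print(margin_all, mean_of_rows, mean_of_cells)
```

Here the margin matches the mean over the raw rows rather than the mean of the two cells. In the balanced sample data of this answer both happen to coincide, which is why the effect is not visible there.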

Aggregation - mean (explicit aggfunc='mean', same result as the default)

print(sales.pivot_table(values="weekly_sales", 
                        index="department",
                        columns="type",
                        aggfunc='mean', 
                        fill_value=0,
                        margins=True))
type          A     B    All
department                  
1           2.5  4.00  3.250
2           4.5  3.50  4.000
All         3.5  3.75  3.625

Aggregation - sum

print(sales.pivot_table(values="weekly_sales", 
                        index="department", 
                        columns="type", 
                        aggfunc='sum', 
                        fill_value=0,
                        margins=True))
type         A   B  All
department             
1            5   8   13
2            9   7   16
All         14  15   29

Aggregation by mean, with the All row and All column filled with sums of the cell means

out = sales.pivot_table(values="weekly_sales", 
                        index="department", 
                        columns="type", 
                        aggfunc='mean', 
                        fill_value=0)
out['All'] = out.sum(axis=1)
out.loc['All'] = out.sum()
print (out)
type          A    B   All
department                
1           2.5  4.0   6.5
2           4.5  3.5   8.0
All         7.0  7.5  14.5
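
If what is wanted is instead the unweighted average of the displayed A and B cells, the same pattern should work with `mean` in place of `sum`. This is only a sketch under that assumption about the desired output; it rebuilds the sample frame so it can run on its own.

```python
import pandas as pd

# Sample frame from this answer, rebuilt so the block is self-contained.
sales = pd.DataFrame({
    "weekly_sales": [4, 7, 4, 3, 1, 2, 3, 5],
    "department":   [1, 2, 2, 1, 1, 2, 2, 1],
    "type":         ["A", "A", "B", "B", "A", "A", "B", "B"],
})

# Cell means without margins.
out = sales.pivot_table(values="weekly_sales",
                        index="department",
                        columns="type",
                        aggfunc="mean",
                        fill_value=0)

# 'All' column: unweighted mean of the A and B cells in each row.
out["All"] = out.mean(axis=1)
# 'All' row: unweighted mean of each column (including the new 'All' column).
out.loc["All"] = out.mean()
print(out)
```

Because every department/type group in the sample data has two rows, these values coincide with what `margins=True` returns; with unbalanced groups the two would differ.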