I have a pandas pivot table, and I want to change the bin ranges so that every bin counts from 0.
Hour_Num      (0, 12]                   (12, 15]                  (15, 20]                  (20, 24]
              today_qty  yesterday_qty  today_qty  yesterday_qty  today_qty  yesterday_qty  today_qty  yesterday_qty
channel_name
Ajio                 22             68          0             55          0             53          0             32
Amazon                3              6          0              3          0              3          0              0
D2C                   0              0          0              1          0              0          0              0
Flipkart             25             32          0             18          0             42          0             26
Limeroad              1              0          0              0          0              1          0              0
Meesho                3              7          0              3          0              1          0              0
Myntra               61            102          0             53          0             96          0             55
Nykaa                12              8          0             10          0             14          0             18
Snapdeal              0              0          0              0          0              0          0              1
TataCliq              3              9          0              2          0              5          0              5
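I want the bins to be (0, 12], (0, 15], (0, 20] and (0, 24]. That is, I want to show the totals from the start of the day up to 12 noon, 3 PM, 8 PM and 12 midnight.
12: 12 noon, 15: 3 PM, 20: 8 PM, 24: 12 midnight
Here is my code: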
df['Hour_Num'] = pd.cut(df.order_hour,[0,12,15,20,24])
pivot_df = df.pivot_table(index='channel_name', values=(['yesterday_qty','today_qty']), columns=['Hour_Num'], aggfunc=('sum')).fillna(0)
pivot_df = pivot_df.swaplevel(0,1, axis=1).sort_index(axis=1)
I would appreciate any tips or solutions. Thank you.
Is something like this what you are looking for?
import pandas as pd
import numpy as np
time_periods = ('(0, 12]', '(12, 15]', '(15, 20]', '(20, 24]')
quantities_days = ('today_qty', 'yesterday_qty')
columns_names = pd.MultiIndex.from_product((time_periods, quantities_days), names=('Hour_Num', 'Qty'))
channel_index = pd.Index(
    data=('Ajio', 'Amazon', 'D2C', 'Flipkart', 'Limeroad', 'Meesho', 'Myntra', 'Nykaa', 'Snapdeal', 'TataCliq'),
    name='channel_name'
)
# Values copied from the pivot table in the question, one row per channel
values = np.array(
    [
        [22,  68, 0, 55, 0, 53, 0, 32],
        [ 3,   6, 0,  3, 0,  3, 0,  0],
        [ 0,   0, 0,  1, 0,  0, 0,  0],
        [25,  32, 0, 18, 0, 42, 0, 26],
        [ 1,   0, 0,  0, 0,  1, 0,  0],
        [ 3,   7, 0,  3, 0,  1, 0,  0],
        [61, 102, 0, 53, 0, 96, 0, 55],
        [12,   8, 0, 10, 0, 14, 0, 18],
        [ 0,   0, 0,  0, 0,  0, 0,  1],
        [ 3,   9, 0,  2, 0,  5, 0,  5]
    ]
)
pivot_df = pd.DataFrame(values, index=channel_index, columns=columns_names)
print('Starting point:')
print(pivot_df)
print("\n\n")
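# Turn the per-bin sums into running totals from the start of the day.
# The first bin, '(0, 12]', already starts at 0, so it keeps its label.
time_periods_renamed = {
    '(12, 15]': '(0, 15]',
    '(15, 20]': '(0, 20]',
    '(20, 24]': '(0, 24]'
}
# Transpose so the (Hour_Num, Qty) pairs become rows, group by Qty (level 1),
# take the cumulative sum across the hour bins, then transpose back and relabel.
df = pivot_df.T.groupby(level=1).cumsum().T.rename(columns=time_periods_renamed)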
print('Result')
print(df)
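As a side note, if you still have the raw orders frame from the question, the cumulative windows can also be built directly instead of being derived from the finished pivot table. The following is only a minimal sketch under assumptions: the column names (`channel_name`, `order_hour`, `today_qty`, `yesterday_qty`) come from the question, while the sample rows and the names `cutoffs`, `parts` and `result` are made up for illustration.

```python
import pandas as pd

# Hypothetical raw orders: column names follow the question, values are invented.
df = pd.DataFrame({
    "channel_name": ["Ajio", "Ajio", "Ajio", "Amazon", "Amazon"],
    "order_hour": [9, 14, 22, 13, 18],
    "today_qty": [2, 1, 0, 3, 0],
    "yesterday_qty": [1, 0, 4, 0, 2],
})

cutoffs = [12, 15, 20, 24]  # upper edge of each window, as in the question

# For every cutoff, sum the quantities of all orders in (0, cutoff] per channel.
# The "> 0" test mirrors the open left edge of the question's pd.cut bins,
# so orders with order_hour == 0 are excluded here as well.
parts = {
    f"(0, {c}]": df[(df["order_hour"] > 0) & (df["order_hour"] <= c)]
    .groupby("channel_name")[["today_qty", "yesterday_qty"]]
    .sum()
    for c in cutoffs
}

# Put the windows side by side; channels with no orders in a window get 0.
result = pd.concat(parts, axis=1).fillna(0).astype(int)
result.columns = result.columns.set_names(["Hour_Num", "Qty"])

print(result)
```

This avoids the rename step because each column label is generated from its cutoff, at the cost of filtering the frame once per window. The rest of this answer describes the `groupby`/`cumsum` version shown earlier.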
What I did here is: