I have a DataFrame, shown below:
Doctor Start B_ID Session Finish NoShow
A 2020-01-18 12:00:00 1 S1 2020-01-18 12:33:00 no
A 2020-01-18 12:20:00 2 S1 2020-01-18 12:52:00 no
A 2020-01-18 13:00:00 3 S1 2020-01-18 13:23:00 no
A 2020-01-18 13:00:00 4 S1 2020-01-18 13:37:00 yes
A 2020-01-18 13:35:00 5 S1 2020-01-18 13:56:00 no
A 2020-01-18 14:10:00 6 S1 2020-01-18 14:15:00 no
A 2020-01-18 14:10:00 7 S1 2020-01-18 14:28:00 yes
A 2020-01-18 14:10:00 8 S1 2020-01-18 14:40:00 yes
A 2020-01-18 14:10:00 9 S1 2020-01-18 15:01:00 no
A 2020-01-19 12:00:00 12 S2 2020-01-19 12:20:00 no
A 2020-01-19 12:30:00 13 S2 2020-01-19 12:40:00 no
A 2020-01-19 13:00:00 14 S2 2020-01-19 13:20:00 yes
A 2020-01-19 13:40:00 15 S2 2020-01-19 13:46:00 no
A 2020-01-19 14:00:00 16 S2 2020-01-19 14:10:00 yes
A 2020-01-19 14:00:00 17 S2 2020-01-19 14:20:00 no
A 2020-01-19 14:00:00 19 S2 2020-01-19 14:40:00 yes
B 2020-01-18 12:00:00 21 S3 2020-01-18 12:33:00 no
B 2020-01-18 12:30:00 22 S3 2020-01-18 12:52:00 no
B 2020-01-18 13:10:00 23 S3 2020-01-18 13:25:00 no
B 2020-01-18 13:10:00 24 S3 2020-01-18 13:39:00 no
B 2020-01-18 13:30:00 25 S3 2020-01-18 13:56:00 yes
B 2020-01-18 14:05:00 26 S3 2020-01-18 14:15:00 no
B 2020-01-18 14:30:00 27 S3 2020-01-18 14:48:00 yes
From the above, I would like to build the following DataFrame.
Expected output:
Doctor Day No_of_slots No_of_bookings No_of_NoShow
A 2020-01-18 5 9 3
A 2020-01-19 5 7 3
B 2020-01-18 6 7 2
where
No_of_slots = Total number of slots based on unique Start time
No_of_bookings = Total number of bookings
No_of_NoShow = Number of NoShow == 'yes'
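Use GroupBy.agg with named aggregation. To count the 'yes' values, sum a helper column new, created with DataFrame.assign by comparing NoShow with Series.eq and converting the boolean result to integers with Series.astype (Series.view, the older way to do this conversion, is deprecated since pandas 2.2):
import pandas as pd

df['Start'] = pd.to_datetime(df['Start'])
df['Finish'] = pd.to_datetime(df['Finish'])
# Group by calendar day, taken from the Start timestamp
d = df['Start'].dt.date.rename('Day')
df1 = (df.assign(new=df['NoShow'].eq('yes').astype(int))
         .groupby(['Doctor', d])
         .agg(No_of_slots=('Start', 'nunique'),    # unique Start times
              No_of_bookings=('Start', 'size'),    # all rows in the group
              No_of_NoShow=('new', 'sum'))         # rows where NoShow == 'yes'
         .reset_index())
print(df1)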
Doctor Day No_of_slots No_of_bookings No_of_NoShow
0 A 2020-01-18 5 9 3
1 A 2020-01-19 5 7 3
2 B 2020-01-18 6 7 2
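For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same grouping. The function name summarize_bookings, the helper column is_noshow, and loading the rows from an inline CSV string are assumptions made for illustration; the boolean-to-integer conversion here uses Series.astype(int).

```python
import io
import pandas as pd

# Sample rows from the question, written as CSV so the example is self-contained
RAW = """Doctor,Start,B_ID,Session,Finish,NoShow
A,2020-01-18 12:00:00,1,S1,2020-01-18 12:33:00,no
A,2020-01-18 12:20:00,2,S1,2020-01-18 12:52:00,no
A,2020-01-18 13:00:00,3,S1,2020-01-18 13:23:00,no
A,2020-01-18 13:00:00,4,S1,2020-01-18 13:37:00,yes
A,2020-01-18 13:35:00,5,S1,2020-01-18 13:56:00,no
A,2020-01-18 14:10:00,6,S1,2020-01-18 14:15:00,no
A,2020-01-18 14:10:00,7,S1,2020-01-18 14:28:00,yes
A,2020-01-18 14:10:00,8,S1,2020-01-18 14:40:00,yes
A,2020-01-18 14:10:00,9,S1,2020-01-18 15:01:00,no
A,2020-01-19 12:00:00,12,S2,2020-01-19 12:20:00,no
A,2020-01-19 12:30:00,13,S2,2020-01-19 12:40:00,no
A,2020-01-19 13:00:00,14,S2,2020-01-19 13:20:00,yes
A,2020-01-19 13:40:00,15,S2,2020-01-19 13:46:00,no
A,2020-01-19 14:00:00,16,S2,2020-01-19 14:10:00,yes
A,2020-01-19 14:00:00,17,S2,2020-01-19 14:20:00,no
A,2020-01-19 14:00:00,19,S2,2020-01-19 14:40:00,yes
B,2020-01-18 12:00:00,21,S3,2020-01-18 12:33:00,no
B,2020-01-18 12:30:00,22,S3,2020-01-18 12:52:00,no
B,2020-01-18 13:10:00,23,S3,2020-01-18 13:25:00,no
B,2020-01-18 13:10:00,24,S3,2020-01-18 13:39:00,no
B,2020-01-18 13:30:00,25,S3,2020-01-18 13:56:00,yes
B,2020-01-18 14:05:00,26,S3,2020-01-18 14:15:00,no
B,2020-01-18 14:30:00,27,S3,2020-01-18 14:48:00,yes
"""


def summarize_bookings(df):
    """Per doctor and day: unique start slots, bookings, and no-shows.

    Hypothetical helper name; the logic mirrors the groupby shown above.
    """
    df = df.copy()
    df["Start"] = pd.to_datetime(df["Start"])
    # Calendar day of each booking, used as the second grouping key
    day = df["Start"].dt.date.rename("Day")
    return (
        df.assign(is_noshow=df["NoShow"].eq("yes").astype(int))
        .groupby(["Doctor", day])
        .agg(
            No_of_slots=("Start", "nunique"),    # distinct Start times
            No_of_bookings=("Start", "size"),    # every row counts as a booking
            No_of_NoShow=("is_noshow", "sum"),   # 1 for each 'yes'
        )
        .reset_index()
    )


result = summarize_bookings(pd.read_csv(io.StringIO(RAW)))
print(result)
```

Run against these rows, the result should have the same counts as the table printed above.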