Pandas: group by a datetime column at minute-level granularity

Question    Votes: 1    Answers: 1

I have a data frame as shown below.

Doctor       Appointment           Booking_ID   
  A          2020-01-18 12:00:00     1 
  A          2020-01-18 12:30:00     2
  A          2020-01-18 13:00:00     3 
  A          2020-01-18 13:00:00     4 
  B          2020-01-18 12:00:00     5 
  B          2020-01-18 12:30:00     6 
  B          2020-01-18 13:00:00     7
  B          2020-01-18 13:00:00     8 
  B          2020-01-18 13:00:00     9 
  B          2020-01-18 16:30:00     10 
  A          2020-01-19 12:00:00     11 
  A          2020-01-19 12:30:00     12 
  A          2020-01-19 13:00:00     13
  A          2020-01-19 13:30:00     14
  A          2020-01-19 14:00:00     15 
  A          2020-01-19 14:00:00     16 
  A          2020-01-19 14:00:00     17 
  A          2020-01-19 14:00:00     18 
  B          2020-01-19 12:00:00     19 
  B          2020-01-19 12:30:00     20
  B          2020-01-19 13:00:00     21
  B          2020-01-19 13:30:00     22 
  B          2020-01-19 14:00:00     23
  B          2020-01-19 13:30:00     24 
  B          2020-01-19 15:00:00     25 
  B          2020-01-18 15:30:00     26

From the above, I would like to find the number of bookings for the same doctor at the same time.

Expected output:

    Doctor           Appointment     Booking_ID   Number_of_Booking
      A          2020-01-18 12:00:00     1         1
      A          2020-01-18 12:30:00     2         1
      A          2020-01-18 13:00:00     3         2
      A          2020-01-18 13:00:00     4         2
      B          2020-01-18 12:00:00     5         1
      B          2020-01-18 12:30:00     6         1
      B          2020-01-18 13:00:00     7         3
      B          2020-01-18 13:00:00     8         3
      B          2020-01-18 13:00:00     9         3
      B          2020-01-18 16:30:00     10        1
      A          2020-01-19 12:00:00     11        1
      A          2020-01-19 12:30:00     12        1
      A          2020-01-19 13:00:00     13        1
      A          2020-01-19 13:30:00     14        1
      A          2020-01-19 14:00:00     15        4
      A          2020-01-19 14:00:00     16        4
      A          2020-01-19 14:00:00     17        4
      A          2020-01-19 14:00:00     18        4
      B          2020-01-19 12:00:00     19        1
      B          2020-01-19 12:30:00     20        1 
      B          2020-01-19 13:00:00     21        1
      B          2020-01-19 13:30:00     22        2
      B          2020-01-19 14:00:00     23        2
      B          2020-01-19 13:30:00     24        2 
      B          2020-01-19 14:00:00     25        2
      B          2020-01-18 15:30:00     26        1

Example:

At 2020-01-19 13:30:00, doctor B has two bookings, as shown below:

Doctor       Appointment           Booking_ID
B          2020-01-19 13:30:00     22
B          2020-01-19 13:30:00     24 

So the output for those rows would look like this:

 Doctor       Appointment           Booking_ID     Number_of_Booking
    B        2020-01-19 13:30:00     22             2
    B        2020-01-19 13:30:00     24             2
pandas pandas-groupby
1 Answer
2 votes
First, use GroupBy.transform with GroupBy.size, grouping on both Doctor and Appointment so that every row receives the size of its group:

df['Number_of_Booking'] = df.groupby(['Doctor', 'Appointment'])['Booking_ID'].transform('size')

print(df.head())
  Doctor          Appointment  Booking_ID  Number_of_Booking
0      A  2020-01-18 12:00:00           1                  1
1      A  2020-01-18 12:30:00           2                  1
2      A  2020-01-18 13:00:00           3                  2
3      A  2020-01-18 13:00:00           4                  2
4      B  2020-01-18 12:00:00           5                  1

For the second part, if the combinations of Doctor and Appointment are unique across all the data, as in the sample, then assign the length.
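The transform('size') approach above can be tried on a small, self-contained frame. The following is a minimal sketch, not the answerer's exact code: it uses only a subset of the question's rows, and it assumes that Appointment may arrive as strings and may carry seconds, so it first parses the column and floors it to the minute. The column name Appointment_minute is a hypothetical helper introduced here for illustration.

```python
import pandas as pd

# A small subset of the sample rows from the question.
df = pd.DataFrame({
    "Doctor": ["A", "A", "A", "A", "B", "B", "B"],
    "Appointment": [
        "2020-01-18 12:00:00",
        "2020-01-18 12:30:00",
        "2020-01-18 13:00:00",
        "2020-01-18 13:00:00",
        "2020-01-19 13:30:00",
        "2020-01-19 14:00:00",
        "2020-01-19 13:30:00",
    ],
    "Booking_ID": [1, 2, 3, 4, 22, 23, 24],
})

# Assumption: the column may be stored as text, so parse it to datetime64.
df["Appointment"] = pd.to_datetime(df["Appointment"])

# Hypothetical helper column: truncate each timestamp to the minute so that
# bookings differing only in seconds fall into the same group.
df["Appointment_minute"] = df["Appointment"].dt.floor("min")

# transform('size') broadcasts each group's row count back onto its rows,
# so the result has the same length and index as df.
df["Number_of_Booking"] = (
    df.groupby(["Doctor", "Appointment_minute"])["Booking_ID"].transform("size")
)

print(df[["Doctor", "Appointment", "Booking_ID", "Number_of_Booking"]])
```

If the timestamps are already exact to the minute, as in the sample, the flooring step changes nothing and grouping directly on Appointment gives the same counts.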
