Aggregating data by date

Question  Votes: 0  Answers: 1

I have a DataFrame like this:

  Customer Id Start Date  End Date     Count  
  1403120020  2014-03-13  2014-03-17   38.0 
  1403120020  2014-03-18  2014-04-16  283.0
  1403120020  2014-04-17  2014-04-25  100.0 
  1403120020  2014-04-26  2014-05-15  50.0  
  1812040169  2018-12-07  2018-12-19  122.0
  1812040169  2018-12-19  2018-12-20   10.0  
  1812040169  2018-12-21  2019-01-18  365.0  

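For a single customer there are several start dates within a given month, and one of that month's end dates falls in the following month. I want one start date and one end date per customer for each such period, with the counts summed, like this: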

  Customer Id Start Date  End Date     Count
  1403120020  2014-03-13  2014-04-16  321.0
  1403120020  2014-04-17  2014-05-15  150.0
  1812040169  2018-12-07  2019-01-18  497.0
python-3.x pandas dataframe
1 Answer
3 votes

Use groupby.agg (the column names below use underscores, e.g. Customer_Id):

df = (df.groupby('Customer_Id').agg({'Start_Date':'first', 'End_Date':'last', 'Count':'sum'})
        .reset_index())

print(df)
   Customer_Id  Start_Date    End_Date  Count
0   1403120020  2014-03-13  2014-05-15  471.0
1   1812040169  2018-12-07  2019-01-18  497.0
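
For a self-contained check, here is a minimal sketch that rebuilds the question's table by hand and applies the same aggregation. The variable names `raw` and `per_customer` are illustrative assumptions, and the columns are named with underscores to match the code above. Note that 'first' and 'last' take rows in their existing order, so this relies on the rows already being sorted by Start_Date within each customer.

```python
import pandas as pd

# The seven rows from the question, typed in by hand (hypothetical variable name).
raw = pd.DataFrame({
    'Customer_Id': [1403120020] * 4 + [1812040169] * 3,
    'Start_Date': ['2014-03-13', '2014-03-18', '2014-04-17', '2014-04-26',
                   '2018-12-07', '2018-12-19', '2018-12-21'],
    'End_Date': ['2014-03-17', '2014-04-16', '2014-04-25', '2014-05-15',
                 '2018-12-19', '2018-12-20', '2019-01-18'],
    'Count': [38.0, 283.0, 100.0, 50.0, 122.0, 10.0, 365.0],
})

# One row per customer: first start date, last end date, summed count.
per_customer = (raw.groupby('Customer_Id')
                   .agg({'Start_Date': 'first', 'End_Date': 'last', 'Count': 'sum'})
                   .reset_index())
print(per_customer)
```

On these seven rows, grouping on Customer_Id alone collapses each customer to a single row, so the two periods of customer 1403120020 are merged into one.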

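Edit: to keep a separate row for each month in which a period starts, also group by the month of Start_Date: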

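# Start_Date must be datetime64 for the .dt accessor
df['Start_Date'] = pd.to_datetime(df['Start_Date'])
# Group by year-month so the same month in different years stays separate
df['grp'] = df['Start_Date'].dt.to_period('M')
df = (df.groupby(['Customer_Id','grp'])
        .agg({'Start_Date':'first', 'End_Date':'last', 'Count':'sum'})
        .reset_index().drop('grp', axis=1))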

print(df)
   Customer_Id Start_Date    End_Date  Count
0   1403120020 2014-03-13  2014-04-16  321.0
1   1403120020 2014-04-17  2014-05-15  150.0
2   1812040169 2018-12-07  2019-01-18  497.0
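
As a runnable variant of the per-month grouping, the sketch below rebuilds the question's data and uses named aggregation with 'min' and 'max' instead of 'first' and 'last', so the result does not depend on row order. This is an alternative formulation, not the answer's exact code; the names `data`, `start_month`, and `result` are assumptions made for illustration.

```python
import pandas as pd

# The seven rows from the question, typed in by hand (hypothetical variable name).
data = pd.DataFrame({
    'Customer_Id': [1403120020] * 4 + [1812040169] * 3,
    'Start_Date': ['2014-03-13', '2014-03-18', '2014-04-17', '2014-04-26',
                   '2018-12-07', '2018-12-19', '2018-12-21'],
    'End_Date': ['2014-03-17', '2014-04-16', '2014-04-25', '2014-05-15',
                 '2018-12-19', '2018-12-20', '2019-01-18'],
    'Count': [38.0, 283.0, 100.0, 50.0, 122.0, 10.0, 365.0],
})

# Parse the date columns so that min/max compare real dates.
data['Start_Date'] = pd.to_datetime(data['Start_Date'])
data['End_Date'] = pd.to_datetime(data['End_Date'])

# Year-month of each start date; renamed so it does not clash with the
# aggregated Start_Date column when the index is reset.
start_month = data['Start_Date'].dt.to_period('M').rename('Start_Month')

result = (data.groupby(['Customer_Id', start_month])
              .agg(Start_Date=('Start_Date', 'min'),
                   End_Date=('End_Date', 'max'),
                   Count=('Count', 'sum'))
              .reset_index()
              .drop(columns='Start_Month'))
print(result)
```

Grouping by the start month is only a proxy for "one period": it works for the data shown, but a period whose start dates span two calendar months would be split into two rows.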