I have the following data:
customer_id purchase_amount date_of_purchase
0 760 25.0 06-11-2009
1 860 50.0 09-28-2012
2 1200 100.0 10-25-2005
3 1420 50.0 09-07-2009
4 1940 70.0 01-25-2013
5 1960 40.0 10-29-2013
6 2620 30.0 09-03-2006
7 3050 50.0 12-04-2007
8 3120 150.0 08-11-2006
9 3260 45.0 10-20-2010
10 3510 35.0 04-05-2013
11 3970 30.0 07-06-2007
12 4000 20.0 11-25-2005
13 4180 20.0 09-22-2010
14 4390 30.0 04-15-2011
15 4750 60.0 02-12-2013
16 4840 30.0 10-14-2005
17 4910 15.0 12-13-2006
18 4950 50.0 05-19-2010
19 4970 30.0 01-12-2006
20 5250 50.0 12-20-2005
Now I want to subtract the date_of_purchase in each row from 01-01-2016.
I tried the following, expecting to get the number of days in a new days_since column:
NOW = pd.to_datetime('01/01/2016').strftime('%m-%d-%Y')
gb = customer_purchases_df.groupby('customer_id')
df2 = gb.agg({'date_of_purchase': lambda x: (NOW - x.max()).days})
Any suggestions? How can I do this?
Thanks in advance.
pd.to_datetime(df['date_of_purchase']).sub(pd.to_datetime('2016-01-01')).dt.days.mul(-1)
0 2395
1 1190
2 3720
3 2307
4 1071
5 794
6 3407
7 2950
8 3430
9 1899
10 1001
11 3101
12 3689
13 1927
14 1722
15 1053
16 3731
17 3306
18 2053
19 3641
20 3664
Name: date_of_purchase, dtype: int64
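A minimal, self-contained sketch of the same idea, using only the first three rows of the question's data. It assumes the dates are MM-DD-YYYY strings, so it passes an explicit format, and it subtracts the purchase date from the reference date so that no sign flip is needed. The variable names purchase_dates and reference are illustrative, not from the original post.

```python
import pandas as pd

# First three rows of the question's data; dates are MM-DD-YYYY strings.
df = pd.DataFrame({
    "customer_id": [760, 860, 1200],
    "purchase_amount": [25.0, 50.0, 100.0],
    "date_of_purchase": ["06-11-2009", "09-28-2012", "10-25-2005"],
})

# An explicit format avoids any ambiguity between month and day.
purchase_dates = pd.to_datetime(df["date_of_purchase"], format="%m-%d-%Y")

# Reference date minus purchase date yields positive timedeltas directly.
reference = pd.Timestamp("2016-01-01")
df["days_since"] = (reference - purchase_dates).dt.days

print(df)
```

The resulting days_since values for these three rows should match the first three values in the output above.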
Assuming the 'date_of_purchase' column already has datetime dtype:
>>> df
customer_id purchase_amount date_of_purchase
0 760 25.0 2009-06-11
1 860 50.0 2012-09-28
2 1200 100.0 2005-10-25
>>> df['days_since'] = df['date_of_purchase'].sub(pd.to_datetime('01/01/2016')).dt.days.abs()
>>> df
customer_id purchase_amount date_of_purchase days_since
0 760 25.0 2009-06-11 2395
1 860 50.0 2012-09-28 1190
2 1200 100.0 2005-10-25 3720
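The attempt in the question grouped by customer_id and used the latest purchase per customer. That version most likely fails because strftime turns NOW into a plain string, and a string cannot be subtracted from a Timestamp. A hedged sketch of the grouped variant, keeping NOW as a Timestamp, might look like this. Note that the second row for customer 760 is hypothetical data added only to show the effect of max(); it is not in the original post.

```python
import pandas as pd

# Sample rows; the second purchase for customer 760 is HYPOTHETICAL,
# added only to illustrate taking the latest purchase per customer.
customer_purchases_df = pd.DataFrame({
    "customer_id": [760, 760, 860],
    "date_of_purchase": ["06-11-2009", "09-28-2012", "10-25-2005"],
})

# Keep NOW as a Timestamp; do not call strftime on it.
NOW = pd.to_datetime("01/01/2016")

# Convert the string column to datetime before doing date arithmetic.
customer_purchases_df["date_of_purchase"] = pd.to_datetime(
    customer_purchases_df["date_of_purchase"], format="%m-%d-%Y"
)

# Latest purchase per customer, then whole days between it and NOW.
latest = customer_purchases_df.groupby("customer_id")["date_of_purchase"].max()
days_since = (NOW - latest).dt.days.rename("days_since")

print(days_since)
```

This produces one value per customer rather than one per row, which is what the groupby in the question appears to be aiming for.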