Get the last order ID before each order line for each customer in Python

Question    Votes: 2    Answers: 2

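I have a table with customer_ids, order_ids, product_id and order_dates. I want to add a column that holds, for each row, the date of the previous order in which the same customer bought the same product (in Python).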

 customerid         orderid     productid    orderdate    
 -----------------------------------------------------    
 1                  1           1            2018/01/01    
 1                  1           2            2018/01/01
 1                  2           3            2018/01/04
 1                  3           1            2018/01/10
 2                  5           1            2018/01/14
 1                  7           3            2018/01/17
 2                  12          2            2018/01/12
 1                  20          1            2018/01/23

I would like a table like this:

 customerid         orderid     productid    orderdate    lastorderdate    
 ----------------------------------------------------------------------    
 1                  1           1            2018/01/01    NA    
 1                  1           2            2018/01/01    NA        
 1                  2           3            2018/01/04    NA    
 1                  3           1            2018/01/10    2018/01/01    
 2                  5           1            2018/01/14    NA    
 1                  7           3            2018/01/17    2018/01/04    
 2                  12          2            2018/01/12    NA    
 2                  20          1            2018/01/23    2018/01/14   

How can I do this?

python pandas
2 Answers
1
votes

IIUC, you can sort the rows and then shift the date within each customer/product group:

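import pandas as pd
# Parse dates and sort so each customer/product group is in chronological order
df['orderdate'] = pd.to_datetime(df['orderdate'])
df = df.sort_values(['customerid', 'productid', 'orderdate'])
# shift() within each group yields the previous purchase date (NaT for the first)
df['last_order'] = df.groupby(['productid', 'customerid'])['orderdate'].shift()
print(df)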

The output is:

   customerid  orderid  productid  orderdate last_order
0           1        1          1 2018-01-01        NaT
3           1        3          1 2018-01-10 2018-01-01
7           1       20          1 2018-01-23 2018-01-10
1           1        1          2 2018-01-01        NaT
2           1        2          3 2018-01-04        NaT
5           1        7          3 2018-01-17 2018-01-04
4           2        5          1 2018-01-14        NaT
6           2       12          2 2018-01-12        NaT

You can also use df = df.sort_index() to restore the original row order.
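For reference, here is a minimal self-contained sketch of the same groupby-and-shift idea, rebuilt from the input table in the question. The column name `lastorderid` is an assumption added here to cover the order ID mentioned in the title; it does not appear in the original answer.

```python
import pandas as pd

# Sample rows copied from the question's input table
df = pd.DataFrame({
    "customerid": [1, 1, 1, 1, 2, 1, 2, 1],
    "orderid":    [1, 1, 2, 3, 5, 7, 12, 20],
    "productid":  [1, 2, 3, 1, 1, 3, 2, 1],
    "orderdate":  ["2018/01/01", "2018/01/01", "2018/01/04", "2018/01/10",
                   "2018/01/14", "2018/01/17", "2018/01/12", "2018/01/23"],
})
df["orderdate"] = pd.to_datetime(df["orderdate"], format="%Y/%m/%d")

# Sort chronologically inside each customer/product group, then look one row back
ordered = df.sort_values(["customerid", "productid", "orderdate"])
previous = ordered.groupby(["customerid", "productid"])[["orderdate", "orderid"]].shift()

# Assignment aligns on the index, so df keeps its original row order
df["lastorderdate"] = previous["orderdate"]
df["lastorderid"] = previous["orderid"]  # hypothetical column name
print(df)
```

Because the shifted values are assigned back by index, no sort_index() call is needed in this variant. With the input as given, the row for orderid 20 receives 2018-01-10, in line with the output shown above.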

Output based on your data:

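import pandas as pd
# Parse dates, then sort so each customer/product group is in chronological order
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['customer_id', 'product_id', 'date'])
df['last_order'] = df.groupby(['product_id', 'customer_id'])['date'].shift()
print(df.sort_index().head(20))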

     row_id       date  customer_id  product_id last_order
0        1 2018-04-07            4           1        NaT
1        2 2018-04-07            4           1 2018-04-07
2        3 2018-04-07            4           1 2018-04-07
3        4 2018-04-07            4           1 2018-04-07
4        5 2018-04-07            4           1 2018-04-07
5        6 2018-04-07            4           1 2018-04-07
6        7 2018-04-07            4           1 2018-04-07
7        8 2018-04-07            4           1 2018-04-07
8       13 2018-04-09            4           1 2018-04-07
9       49 2018-04-13            4           1 2018-04-09
10     106 2018-04-20            4           1 2018-04-13
11     115 2018-04-20            4           1 2018-04-20
12     142 2018-04-27            4           2        NaT
13     143 2018-04-27            4           2 2018-04-27
14     149 2018-04-29            4           2 2018-04-27
15     168 2018-05-02            4           1 2018-04-20
16     169 2018-05-02            4           1 2018-05-02
17     229 2018-05-08            4           5        NaT
18     230 2018-05-08            4           5 2018-05-08
19     231 2018-05-08            4           5 2018-05-08
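In the output above, a row that shares its date with an earlier row of the same customer and product gets that same date as last_order. If the previous strictly earlier purchase date is wanted instead, one possible sketch is to shift over the distinct dates and merge the result back. This is an assumption about the intent, not part of the original answer, and the small frame below is a hand-picked subset of the asker's data.

```python
import pandas as pd

# A few rows taken from the asker's data (customer 4, product 1)
df = pd.DataFrame({
    "row_id": [1, 2, 13, 49, 106, 115],
    "date": pd.to_datetime(["2018-04-07", "2018-04-07", "2018-04-09",
                            "2018-04-13", "2018-04-20", "2018-04-20"]),
    "customer_id": [4, 4, 4, 4, 4, 4],
    "product_id": [1, 1, 1, 1, 1, 1],
})

keys = ["customer_id", "product_id"]

# One row per distinct purchase date in each customer/product group
distinct = df[keys + ["date"]].drop_duplicates().sort_values(keys + ["date"])
# Previous distinct date within the group (NaT for the earliest date)
distinct["last_order"] = distinct.groupby(keys)["date"].shift()

# Attach the previous date to every original row; a left merge keeps df's row order
result = df.merge(distinct, on=keys + ["date"], how="left")
print(result)
```

Here both rows dated 2018-04-20 receive 2018-04-13, and the two rows dated 2018-04-07 receive NaT.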

0
votes
row_id  date    customer_id product_id
1   4/7/2018    4   1
2   4/7/2018    4   1
3   4/7/2018    4   1
4   4/7/2018    4   1
5   4/7/2018    4   1
6   4/7/2018    4   1
7   4/7/2018    4   1
8   4/7/2018    4   1
13  4/9/2018    4   1
49  4/13/2018   4   1
106 4/20/2018   4   1
115 4/20/2018   4   1
142 4/27/2018   4   2
143 4/27/2018   4   2
149 4/29/2018   4   2
168 5/2/2018    4   1
169 5/2/2018    4   1
229 5/8/2018    4   5
230 5/8/2018    4   5
231 5/8/2018    4   5
233 5/9/2018    4   1
237 5/9/2018    4   5
238 5/9/2018    4   5
239 5/9/2018    4   5
240 5/9/2018    4   5
241 5/9/2018    4   5
255 5/14/2018   4   5
256 5/14/2018   4   5
257 5/14/2018   4   5
258 5/14/2018   4   5
259 5/14/2018   4   5
268 5/15/2018   4   5
278 5/17/2018   4   3
293 5/19/2018   4   5
294 5/19/2018   4   5
295 5/19/2018   4   5
296 5/19/2018   4   5
298 5/20/2018   4   5
370 5/21/2018   4   5
371 5/21/2018   4   5
401 5/26/2018   4   2
416 5/30/2018   4   5
417 5/30/2018   4   5
418 5/30/2018   4   5
445 5/31/2018   4   1
446 5/31/2018   4   1
447 5/31/2018   4   1
448 5/31/2018   4   1
449 5/31/2018   4   1
51767   6/13/2018   4   2
51768   6/13/2018   4   2
51769   6/13/2018   4   2
51770   6/13/2018   4   2
51771   6/13/2018   4   2
51772   6/13/2018   4   2
53245   6/19/2018   4   1
53247   6/19/2018   4   1
54773   7/25/2018   4   1
54837   7/26/2018   4   5
54838   7/26/2018   4   5
54891   7/27/2018   4   1
54920   7/28/2018   4   5
54922   7/28/2018   4   5
54979   7/29/2018   4   5
54980   7/29/2018   4   5
54981   7/29/2018   4   5
54982   7/29/2018   4   5
54983   7/29/2018   4   5
54984   7/29/2018   4   5
54985   7/29/2018   4   5
55039   7/30/2018   4   5
55040   7/30/2018   4   5
55041   7/30/2018   4   5
55042   7/30/2018   4   5
55043   7/30/2018   4   5
55044   7/30/2018   4   5
55045   7/30/2018   4   5
55046   7/30/2018   4   5
55537   8/5/2018    4   5
55640   8/6/2018    4   5
55653   8/6/2018    4   5
55654   8/6/2018    4   5
55655   8/6/2018    4   5
55656   8/6/2018    4   5
55658   8/6/2018    4   5
55853   8/8/2018    4   5
55854   8/8/2018    4   5
55855   8/8/2018    4   5
55856   8/8/2018    4   5
55857   8/8/2018    4   5
55858   8/8/2018    4   5
55859   8/8/2018    4   5
55860   8/8/2018    4   5
56011   8/11/2018   4   5