Rolling groupby on a column in Python

Question    Votes: 0    Answers: 2

I have data sorted by ID and timestamp:

1   2016-05-13 22:21:02.625+00  Deposit 
1   2016-05-13 22:29:26.402+00  Deposit 
1   2016-05-16 00:51:22.835+00  Withdrawal  
1   2019-03-21 21:01:28.528+00  Withdrawal  
1   2019-03-21 21:02:10.509+00  Withdrawal  
1   2019-12-06 23:34:15.194+00  Deposit 
2   2019-12-03 23:21:33.465+00  Withdrawal  
2   2019-12-05 00:51:01.136+00  Deposit 
2   2019-12-06 20:07:11.122+00  Deposit 

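The desired output is below. It is just a grouping by Deposit/Withdrawal, except that a new group starts every time the type changes from Deposit to Withdrawal or vice versa. Each row shows the ID, the last timestamp of the run, the type, and the number of rows in the run: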

1   2016-05-13 22:29:26.402+00  Deposit 2
1   2019-03-21 21:02:10.509+00  Withdrawal 3
1   2019-12-06 23:34:15.194+00  Deposit 1
2   2019-12-03 23:21:33.465+00  Withdrawal 1
2   2019-12-06 20:07:11.122+00  Deposit 2

Is there a clean way to do this?

python
2 Answers

0 votes

You can use itertools.groupby (doc) to group the data by ID and type.

For example:

data = [
['1', '2016-05-13 22:21:02.625+00', 'Deposit'],
['1', '2016-05-13 22:29:26.402+00', 'Deposit'],
['1', '2016-05-16 00:51:22.835+00', 'Withdrawal'],
['1', '2019-03-21 21:01:28.528+00', 'Withdrawal'],
['1', '2019-03-21 21:02:10.509+00', 'Withdrawal'],
['1', '2019-12-06 23:34:15.194+00', 'Deposit'],
['2', '2019-12-03 23:21:33.465+00', 'Withdrawal'],
['2', '2019-12-05 00:51:01.136+00', 'Deposit'],
['2', '2019-12-06 20:07:11.122+00', 'Deposit']
]

from itertools import groupby

data = sorted(data, key=lambda k: (int(k[0]), k[1]))    # <-- if your data is sorted by (ID, Time), you may skip this

for v, g in groupby(data, lambda k: (k[0], k[2])):
    g = list(g)
    print(v[0], g[-1][1], g[-1][2], len(g))

Prints:

1 2016-05-13 22:29:26.402+00 Deposit 2
1 2019-03-21 21:02:10.509+00 Withdrawal 3
1 2019-12-06 23:34:15.194+00 Deposit 1
2 2019-12-03 23:21:33.465+00 Withdrawal 1
2 2019-12-06 20:07:11.122+00 Deposit 2
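
The reason this works is that itertools.groupby only groups consecutive elements with equal keys, so a type that reappears later starts a new group. A minimal sketch of that behavior, using a made-up list of types rather than the data above:

```python
from itertools import groupby

# Hypothetical input: "Deposit" appears in two separate runs.
types = ["Deposit", "Deposit", "Withdrawal", "Deposit"]

# groupby starts a new group each time the key changes, so the second
# "Deposit" run is reported separately from the first.
runs = [(key, len(list(group))) for key, group in groupby(types)]
print(runs)  # [('Deposit', 2), ('Withdrawal', 1), ('Deposit', 1)]
```

This is also why the input has to be sorted by (ID, time) first: groupby never reorders or merges non-adjacent rows.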

Note: you can sort by the time as a string. Its format allows this.
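
A quick way to check the string-sorting claim is to compare it against sorting on parsed datetimes. This is only a sketch: it assumes every timestamp has the same fixed-width layout and the same "+00" offset as in the question, and the parse helper is a name introduced here for illustration.

```python
from datetime import datetime

stamps = [
    "2019-12-06 23:34:15.194+00",
    "2016-05-13 22:29:26.402+00",
    "2019-03-21 21:02:10.509+00",
    "2016-05-13 22:21:02.625+00",
]

def parse(ts):
    # Pad the "+00" offset to "+0000" so that strptime's %z accepts it.
    return datetime.strptime(ts + "00", "%Y-%m-%d %H:%M:%S.%f%z")

by_string = sorted(stamps)            # plain lexicographic order
by_time = sorted(stamps, key=parse)   # true chronological order
print(by_string == by_time)           # True
```

If the offsets differed between rows, or the fields were not zero-padded, the two orders could diverge and parsing would be the safer key.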


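0 votes

You can use groupby with agg to get the result back as a pd.DataFrame. The snippet assumes the ID is the index of df and that the columns are named timestamp and dw. Note that it takes the first timestamp of each run; use 'last' instead of 'first' for timestamp to match the output requested in the question: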

(df.groupby([df.index, df['dw'].ne(df['dw'].shift()).cumsum()], as_index=False)
    .agg({'timestamp': 'first', 'dw': ['first', 'count']}))

                    timestamp          dw      
                        first       first count
0  2016-05-13 22:21:02.625+00     Deposit     2
1  2016-05-16 00:51:22.835+00  Withdrawal     3
2  2019-12-06 23:34:15.194+00     Deposit     1
3  2019-12-03 23:21:33.465+00  Withdrawal     1
4  2019-12-05 00:51:01.136+00     Deposit     2
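
For reference, here is a self-contained variant that reproduces the output requested in the question, with the last timestamp of each run. It is a sketch under a few assumptions that are not in the answer above: the ID is an ordinary column named id rather than the index, and the output column names (id, timestamp, dw, count) are chosen here for illustration.

```python
import pandas as pd

rows = [
    (1, "2016-05-13 22:21:02.625+00", "Deposit"),
    (1, "2016-05-13 22:29:26.402+00", "Deposit"),
    (1, "2016-05-16 00:51:22.835+00", "Withdrawal"),
    (1, "2019-03-21 21:01:28.528+00", "Withdrawal"),
    (1, "2019-03-21 21:02:10.509+00", "Withdrawal"),
    (1, "2019-12-06 23:34:15.194+00", "Deposit"),
    (2, "2019-12-03 23:21:33.465+00", "Withdrawal"),
    (2, "2019-12-05 00:51:01.136+00", "Deposit"),
    (2, "2019-12-06 20:07:11.122+00", "Deposit"),
]
df = pd.DataFrame(rows, columns=["id", "timestamp", "dw"])

# A new run starts whenever the ID or the type differs from the previous row.
changed = df["id"].ne(df["id"].shift()) | df["dw"].ne(df["dw"].shift())
run = changed.cumsum().rename("run")

# One output row per run: ID, last timestamp, type, and number of rows.
result = (
    df.groupby(run)
      .agg(
          id=("id", "first"),
          timestamp=("timestamp", "last"),
          dw=("dw", "first"),
          count=("dw", "size"),
      )
      .reset_index(drop=True)
)
print(result.to_string(index=False))
```

Comparing the ID as well as the type keeps two adjacent IDs from being merged when one ends and the next begins with the same type.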