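I need a SQL query that groups transactions made within 30 days of a customer's first transaction. All consecutive transactions that fall within 30 days of the customer's first transaction count as one group. A transaction made more than 30 days after the first transaction of the previous group starts a new group.
Data:
trans_Id | customer_id | trans_date |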
---|---|---|
001 | 1101 | 2020-11-02 |
002 | 1101 | 2020-11-14 |
003 | 1101 | 2020-11-18 |
004 | 1101 | 2021-12-04 |
005 | 1101 | 2021-12-05 |
006 | 1101 | 2021-12-08 |
007 | 1101 | 2021-01-17 |
008 | 1101 | 2021-05-01 |
009 | 1101 | 2021-05-04 |
010 | 1102 | 2021-03-02 |
011 | 1102 | 2021-03-08 |
012 | 1102 | 2021-04-01 |
013 | 1102 | 2021-04-02 |
014 | 1102 | 2021-04-12 |
015 | 1102 | 2021-04-29 |
016 | 1102 | 2021-06-10 |
017 | 1102 | 2021-06-12 |
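Expected result (grouped as shown below).
Transactions 002 and 003 of customer 1101 are placed in group 1 because they occur within 30 days of the first transaction, 001. Transaction 004 falls outside the 30-day window of the first transaction in group 1, so it starts a new group, group 2. Transactions 005 and 006 fall within the 30-day window of transaction 004, so they are placed in group 2. In other words, the first and last transaction of each group should lie within a 30-day window.
trans_Id | customer_id | trans_date | group |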
---|---|---|---|
001 | 1101 | 2020-11-02 | 1 |
002 | 1101 | 2020-11-14 | 1 |
003 | 1101 | 2020-11-18 | 1 |
004 | 1101 | 2021-12-04 | 2 |
005 | 1101 | 2021-12-05 | 2 |
006 | 1101 | 2021-12-08 | 2 |
007 | 1101 | 2021-01-17 | 3 |
008 | 1101 | 2021-05-01 | 4 |
009 | 1101 | 2021-05-04 | 4 |
010 | 1102 | 2021-03-02 | 1 |
011 | 1102 | 2021-03-08 | 1 |
012 | 1102 | 2021-04-01 | 2 |
013 | 1102 | 2021-04-02 | 2 |
014 | 1102 | 2021-04-12 | 2 |
015 | 1102 | 2021-04-29 | 2 |
016 | 1102 | 2021-06-10 | 3 |
017 | 1102 | 2021-06-12 | 3 |
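
The grouping rule described above can be modeled procedurally before it is translated into SQL. The following is a minimal Python sketch, not part of the original question; the function name `assign_groups` and the `window_days` parameter are hypothetical. It assumes a transaction stays in the current group only when it falls fewer than 30 days after that group's first transaction, which matches the sample for customer 1102, where 2021-04-01 (exactly 30 days after 2021-03-02) starts group 2.

```python
from datetime import date, timedelta


def assign_groups(dates, window_days=30):
    """Return a group number for each date in an ascending list of dates.

    Assumption: a date joins the current group when it is fewer than
    `window_days` days after the first date of that group; otherwise it
    opens a new group and becomes that group's first date.
    """
    window = timedelta(days=window_days)
    groups = []
    group_start = None  # first date of the group currently being filled
    group_id = 0
    for d in dates:
        if group_start is None or d - group_start >= window:
            group_start = d
            group_id += 1
        groups.append(group_id)
    return groups


# Transactions 010-017 of customer 1102 from the sample data.
customer_1102 = [
    date(2021, 3, 2), date(2021, 3, 8), date(2021, 4, 1), date(2021, 4, 2),
    date(2021, 4, 12), date(2021, 4, 29), date(2021, 6, 10), date(2021, 6, 12),
]
print(assign_groups(customer_1102))  # [1, 1, 2, 2, 2, 2, 3, 3]
```

The loop expects the dates of one customer sorted in ascending order. Because each group's start depends on where the previous group ended, a SQL version typically needs a recursive query or a row-pattern-matching clause rather than a single window function.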
Another way (Oracle):
with jobdata as (
select '001' as trans_Id, '1101' as customer_id, to_date('2020-11-02', 'YYYY-MM-DD') as trans_date from dual union all
select '002' as trans_Id, '1101' as customer_id, to_date('2020-11-14', 'YYYY-MM-DD') as trans_date from dual union all
select '003' as trans_Id, '1101' as customer_id, to_date('2020-11-18', 'YYYY-MM-DD') as trans_date from dual union all
select '004' as trans_Id, '1101' as customer_id, to_date('2021-12-04', 'YYYY-MM-DD') as trans_date from dual union all
select '005' as trans_Id, '1101' as customer_id, to_date('2021-12-05', 'YYYY-MM-DD') as trans_date from dual union all
select '006' as trans_Id, '1101' as customer_id, to_date('2021-12-08', 'YYYY-MM-DD') as trans_date from dual union all
select '007' as trans_Id, '1101' as customer_id, to_date('2021-01-17', 'YYYY-MM-DD') as trans_date from dual union all
select '008' as trans_Id, '1101' as customer_id, to_date('2021-05-01', 'YYYY-MM-DD') as trans_date from dual union all
select '009' as trans_Id, '1101' as customer_id, to_date('2021-05-04', 'YYYY-MM-DD') as trans_date from dual union all
select '010' as trans_Id, '1102' as customer_id, to_date('2021-03-02', 'YYYY-MM-DD') as trans_date from dual union all
select '011' as trans_Id, '1102' as customer_id, to_date('2021-03-08', 'YYYY-MM-DD') as trans_date from dual union all
select '012' as trans_Id, '1102' as customer_id, to_date('2021-04-01', 'YYYY-MM-DD') as trans_date from dual union all
select '013' as trans_Id, '1102' as customer_id, to_date('2021-04-02', 'YYYY-MM-DD') as trans_date from dual union all
select '014' as trans_Id, '1102' as customer_id, to_date('2021-04-12', 'YYYY-MM-DD') as trans_date from dual union all
select '015' as trans_Id, '1102' as customer_id, to_date('2021-04-29', 'YYYY-MM-DD') as trans_date from dual union all
select '016' as trans_Id, '1102' as customer_id, to_date('2021-06-10', 'YYYY-MM-DD') as trans_date from dual union all
select '017' as trans_Id, '1102' as customer_id, to_date('2021-06-12', 'YYYY-MM-DD') as trans_date from dual
)
select a.trans_Id, a.customer_id, a.trans_date
     -- renumber the buckets 1, 2, 3, ... per customer so that empty buckets leave no gaps
     , dense_rank() over (partition by a.customer_id order by a.gap_group) as group_id
from (
    select a.*
         -- earliest transaction date of the customer
         , min(a.trans_date) over (partition by a.customer_id) as min_trans_date
         -- days elapsed since that earliest transaction
         , a.trans_date - min(a.trans_date) over (partition by a.customer_id) as gap_day
         -- fixed 30-day bucket counted from the earliest transaction (days 0-29 -> 1, 30-59 -> 2, ...)
         , trunc((a.trans_date - min(a.trans_date) over (partition by a.customer_id)) / 30) + 1 as gap_group
    from jobdata a
) a
order by a.customer_id, a.trans_date
;
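
The query above derives its groups from fixed 30-day buckets counted from each customer's earliest transaction and then renumbers them with `dense_rank()`. The short Python sketch below mimics that arithmetic to show where it can diverge from a window that restarts at the first transaction of each group. The function name `bucket_groups` and the second list of day offsets are hypothetical and chosen only for illustration.

```python
def bucket_groups(day_offsets, bucket_days=30):
    """Mimic trunc(gap_day / 30) + 1 followed by dense_rank().

    `day_offsets` are whole days elapsed since the customer's earliest
    transaction, i.e. the query's gap_day column.
    """
    buckets = [offset // bucket_days + 1 for offset in day_offsets]
    # dense_rank(): map the distinct bucket numbers to 1, 2, 3, ... in order.
    rank = {bucket: i for i, bucket in enumerate(sorted(set(buckets)), start=1)}
    return [rank[bucket] for bucket in buckets]


# Customer 1102 from the sample: day offsets from 2021-03-02.
print(bucket_groups([0, 6, 30, 31, 41, 58, 100, 102]))  # [1, 1, 2, 2, 2, 2, 3, 3]

# Hypothetical offsets: day 62 is only 27 days after day 35.
print(bucket_groups([0, 25, 35, 62]))  # [1, 1, 2, 3]
```

For customer 1102 the buckets happen to coincide with the expected groups. In the hypothetical case, a window that restarts on day 35 would keep day 62 in group 2, whereas the fixed buckets place it in group 3, so the bucket approach is worth checking against real data before relying on it.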