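I am trying to use MATCH_RECOGNIZE to find customers who made purchases on 10 or more consecutive days.
Based on my sample data, I expect customer 1 to appear in my output, but it does not. Can someone explain why no rows are returned, and show me how to fix the query so that it works?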
ALTER SESSION SET NLS_TIMESTAMP_FORMAT = 'DD-MON-YYYY HH24:MI:SS.FF';
create table purchases(
ORDER_ID NUMBER GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
customer_id number,
PRODUCT_ID NUMBER,
QUANTITY NUMBER,
purchase_date timestamp
);
insert into purchases (customer_id, product_id, quantity, purchase_date)
select 1 customer_id, 102 product_id, 1 quantity,
DATE '2024-04-08' + INTERVAL '13' HOUR + ((LEVEL-1) * INTERVAL '1 00:00:01' DAY TO SECOND) * -1
as purchase_date
from dual
connect by level <= 15 UNION all
select 2, 102, 1,
DATE '2024-03-08' + INTERVAL '14' HOUR + ((LEVEL-1) * INTERVAL '1 00:00:00' DAY TO SECOND) * -1
from dual
connect by level <= 5 UNION ALL
select 3, 102, 1,
DATE '2024-02-08' + INTERVAL '15' HOUR + ((LEVEL-1) * INTERVAL '0 23:59:59' DAY TO SECOND) * -1
from dual
connect by level <= 5;
select * from purchases
match_recognize(
partition by customer_id
order by purchase_date
measures
first(purchase_date) as first_date,
last(purchase_date) as last_date
one row per match
pattern(P{10,})
define P as next(purchase_date)=purchase_date + interval '1' day
);
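With "next(purchase_date) = purchase_date + interval '1' day" you have defined a constraint that is too strict, because your dates have an H:M:S component. You can wrap the dates in TRUNC() to get what you want.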
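As @p3consulting said, you are being too exact. You have created the rows for customer 1 with an interval of one day and one second between them. With your current P, if purchase_date is 25-MAR-2024 12:59:46.000000, adding 1 day gives 26-MAR-2024 12:59:46.000000, and there is no row with that value; the next row is 26-MAR-2024 12:59:47.000000 (one second later), so the equality test does not match.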
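To see the gaps directly, a quick check along these lines lists the interval between each purchase and the same customer's next purchase. It is a sketch against the purchases table from the question, and the gap alias is only for illustration:

```sql
-- Difference between each purchase and the customer's next purchase.
-- Subtracting two TIMESTAMP values yields an INTERVAL DAY TO SECOND.
select customer_id,
       purchase_date,
       lead(purchase_date) over (partition by customer_id
                                 order by purchase_date) - purchase_date as gap
from purchases
order by customer_id, purchase_date;
```

For customer 1 every non-null gap should be one day and one second, which is why an exact equality with + interval '1' day never holds.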
If you define P as:
define P as next(purchase_date) >= trunc(purchase_date) + interval '1' day
and next(purchase_date) < trunc(purchase_date) + interval '2' day
then a purchase at any time on one day will match a purchase at any time on the following day:
select * from purchases
match_recognize(
partition by customer_id
order by purchase_date
measures
first(purchase_date) as first_date,
last(purchase_date) as last_date
one row per match
pattern(P{10,})
define P as next(purchase_date) >= trunc(purchase_date) + interval '1' day
and next(purchase_date) < trunc(purchase_date) + interval '2' day
);
CUSTOMER_ID | FIRST_DATE | LAST_DATE |
---|---|---|
1 | 25-MAR-2024 12:59:46.000000 | 07-APR-2024 12:59:59.000000 |
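Note that because P looks ahead with next(), the last row of a run can never match P, since it has no following row on the next day. That is why LAST_DATE is on 07-APR-2024 rather than 08-APR-2024, and why P{10,} effectively requires 11 consecutive days. If the final day should be counted, one possible variation looks backwards with prev() instead. This is a sketch that has not been tested against every edge case, and the variable names A and B are arbitrary:

```sql
select * from purchases
match_recognize(
  partition by customer_id
  order by purchase_date
  measures
    first(purchase_date) as first_date,
    last(purchase_date)  as last_date
  one row per match
  -- A is not listed in DEFINE, so it matches any row and starts the run;
  -- B must then match at least 9 more rows, giving 10 or more days in total.
  pattern(A B{9,})
  define B as purchase_date >= trunc(prev(purchase_date)) + interval '1' day
          and purchase_date <  trunc(prev(purchase_date)) + interval '2' day
);
```

With the sample data this should still return only customer 1, with the run extending to the 08-APR-2024 13:00:00 row. Like the query above, it assumes at most one purchase per customer per day.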