MySQL查询查找连续几天下订单的客户

问题描述 投票:0回答:1

我正在尝试获取连续几天下订单的客户的ID。该表创建如下:

create table orders(
  orderid INT,
  orderdate date,
  customerid int
);

价值观:

insert into orders (orderid, orderdate, customerid)
values(1,'2023-06-20',1),
    (2, '2023-06-21', 2),
    (3, '2023-06-22', 3),
    (4, '2023-06-22', 1),
    (5, '2023-06-23', 3),
    (6, '2023-06-22', 1),
    (7, '2023-06-26', 4),
    (8, '2023-06-27', 4),
    (9, '2023-06-29', 4),
    (10, '2023-06-29', 5),
    (11, '2023-06-30', 5),
    (12, '2023-06-28', 5),
    (13, '2023-06-25', 4),
    (14, '2023-06-24', 4),
    (15, '2023-06-30', 4);

我编写的代码给出了连续几天有订单的 id 的输出,但留下了订单中有间隙的客户的 id,尽管在间隙实际发生之前订单数量较多。 我写的代码:

with t1 as(
select customerid, orderdate,
case when lead(orderdate) over (partition by customerid order by orderdate) is null then 1
else abs(orderdate - lead(orderdate) over (partition by customerid order by orderdate)) end as gap
from orders)
select customerid, sum(gap) as consecutive
from t1
where gap>0
group by customerid
having count(*)=sum(gap) and count(*)>1;

输出:

+------------+------------------+
| customerid | consecutive_days |
+------------+------------------+
|          3 |                2 |
|          5 |                3 |
+------------+------------------+

我想要的输出:

+------------+------------------+
| customerid | consecutive_days |
+------------+------------------+
|          3 |                2 |
|          4 |                4 |
|          4 |                2 |
|          5 |                3 |
+------------+------------------+

由于 customerid 4 的客户已在 2023-06-24 至 2023-06-27 期间订购。同一客户的下一个订单是在 2023 年 6 月 29 日和 2023 年 6 月 30 日,因此不连续,应作为单独的行出现。

编辑:下的订单必须是连续几天的,无论单日下的订单数量是多少。

mysql window-functions gaps-and-islands
1个回答
1
投票

这是一个典型的间隙和孤岛问题,这是获得所需结果的一种方法:

SELECT customerid, COUNT(DISTINCT orderdate) AS consecutive_days
FROM (
    SELECT *, orderdate - INTERVAL DENSE_RANK() OVER (PARTITION BY customerid ORDER BY orderdate) DAY AS grp
    FROM orders
) AS order_groups
GROUP BY customerid, grp
HAVING consecutive_days > 1;

输出:

客户ID 连续_天
3 2
4 4
4 2
5 3

这是一个db<>小提琴

如果您查看派生表的结果,您可以看到,通过从订单日期中减去 DENSE_RANK,我们创建了一个可用于分组的值。

订单编号 订购日期 客户ID grp
1 2023-06-20 1 2023-06-19
4 2023-06-22 1 2023-06-20
6 2023-06-22 1 2023-06-20
2 2023-06-21 2 2023-06-20
3 2023-06-22 3 2023-06-21
5 2023-06-23 3 2023-06-21
14 2023-06-24 4 2023-06-23
13 2023-06-25 4 2023-06-23
7 2023-06-26 4 2023-06-23
8 2023-06-27 4 2023-06-23
9 2023-06-29 4 2023-06-24
15 2023-06-30 4 2023-06-24
12 2023-06-28 5 2023-06-27
10 2023-06-29 5 2023-06-27
11 2023-06-30 5 2023-06-27

请注意,我使用了 DENSE_RANK,而不是更典型的 ROW_NUMBER,来处理同一天下多个订单的客户。

© www.soinside.com 2019 - 2024. All rights reserved.