Finding gaps in a sequence in SQL without creating additional tables

Question. Votes: 0. Answers: 3.

I have a table invoices with a field invoice_number. This is what the following query returns:

select invoice_number from invoices;

invoice_number
--------------
1
2
3
5
6
10
11

I want a SQL query that gives the following result:

gap_start | gap_end
4         | 4
7         | 9

How can I write SQL that performs this kind of query? I am using PostgreSQL.

sql postgresql gaps-and-islands
3 Answers

34 votes

This is known as the "gaps and islands" problem. It can be solved in any modern SQL dialect with window functions:

select invoice_number + 1 as gap_start, 
       next_nr - 1 as gap_end
from (
  select invoice_number, 
         lead(invoice_number) over (order by invoice_number) as next_nr
  from invoices
) nr
where invoice_number + 1 <> next_nr;

SQLFiddle: http://sqlfiddle.com/#!15/1e807/1
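To see why the filter on invoice_number + 1 <> next_nr works, it can help to look at the intermediate rows that lead() produces. The following is a minimal, self-contained sketch: the sample data is inlined as a CTE named invoices that stands in for the real table, and the missing_count column is an illustrative addition that is not part of the answer above.

```sql
-- Sample data inlined as a CTE; it stands in for the real invoices table.
with invoices(invoice_number) as (
  values (1), (2), (3), (5), (6), (10), (11)
)
select invoice_number,
       -- the next existing invoice number in sorted order
       lead(invoice_number) over (order by invoice_number) as next_nr,
       -- how many numbers are missing between this row and the next one
       lead(invoice_number) over (order by invoice_number)
         - invoice_number - 1 as missing_count
from invoices
order by invoice_number;
-- invoice_number | next_nr | missing_count
--              1 |       2 |             0
--              2 |       3 |             0
--              3 |       5 |             1
--              5 |       6 |             0
--              6 |      10 |             3
--             10 |      11 |             0
--             11 |    NULL |          NULL
```

The last row has no successor, so next_nr is NULL there. A comparison with NULL does not evaluate to true, which is why the query above reports no gap after the highest invoice number.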

Here is a walkthrough that uses row_number to partition the rows into islands and gaps: Postgres consecutive days, gaps and islands, Tabibitosan
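As a rough illustration of the row_number approach (often called Tabibitosan), the sketch below finds the islands, that is, the runs of consecutive numbers. It assumes invoice_number is unique; the CTE names invoices and numbered and the column grp are chosen for this example only.

```sql
-- Sample data inlined as a CTE; it stands in for the real invoices table.
with invoices(invoice_number) as (
  values (1), (2), (3), (5), (6), (10), (11)
),
numbered as (
  select invoice_number,
         -- within a run of consecutive numbers, value minus row number
         -- stays constant, so it can serve as a group key
         invoice_number
           - row_number() over (order by invoice_number) as grp
  from invoices
)
select min(invoice_number) as island_start,
       max(invoice_number) as island_end
from numbered
group by grp
order by island_start;
-- island_start | island_end
--            1 |          3
--            5 |          6
--           10 |         11
```

The gaps are then the ranges between the end of one island and the start of the next.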


5 votes

A simpler technique is to first get all the missing values by joining the table against a generated series, like this:

select series
from generate_series(1, 11, 1) series
left join invoices on series = invoices.invoice_number
where invoice_number is null;

This gives us the list of missing numbers, which is useful on its own in some cases.
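The bounds 1 and 11 above are hard-coded to match the sample data. One possible variation, sketched here under the assumption that the table is not empty, derives the bounds from the table itself. The aliases s, n and i are names chosen for this example.

```sql
-- Sample data inlined as a CTE; it stands in for the real invoices table.
with invoices(invoice_number) as (
  values (1), (2), (3), (5), (6), (10), (11)
)
select s.n as missing_number
from generate_series(
       (select min(invoice_number) from invoices),  -- lower bound from the data
       (select max(invoice_number) from invoices)   -- upper bound from the data
     ) as s(n)
left join invoices i on i.invoice_number = s.n
where i.invoice_number is null   -- keep only numbers with no matching invoice
order by s.n;
-- missing_number
--              4
--              7
--              8
--              9
```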

To get the start and end of each gap as a range, we can join the source table with itself.

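select invoices.invoice_number + 1 as start,
       min(fr.invoice_number) - 1 as stop
from invoices
-- r: the immediate successor (invoice_number + 1), if it exists
left join invoices r on invoices.invoice_number = r.invoice_number - 1
-- fr: every larger invoice number; min() picks the next existing one
left join invoices fr on invoices.invoice_number < fr.invoice_number
where r.invoice_number is null           -- no immediate successor: a gap starts here
      and fr.invoice_number is not null  -- skip the last row, which has nothing after it
group by invoices.invoice_number,
         r.invoice_number;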

dbfiddle: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=32c5f3c021b0f1a876305a2bd3afafc9

This may be less efficient than the solution above, but it can be useful on SQL servers that do not support the lead() function.
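Another way to avoid window functions is to use correlated subqueries instead of joins. This is only a sketch of that idea, not part of the answer above: the aliases i, r and n are names chosen for this example, and the sample data is inlined as a CTE that stands in for the real table.

```sql
-- Sample data inlined as a CTE; it stands in for the real invoices table.
with invoices(invoice_number) as (
  values (1), (2), (3), (5), (6), (10), (11)
)
select i.invoice_number + 1 as gap_start,
       -- the next existing invoice number, minus one, closes the gap
       (select min(n.invoice_number)
        from invoices n
        where n.invoice_number > i.invoice_number) - 1 as gap_end
from invoices i
where not exists (            -- no row holds the immediate successor
        select 1
        from invoices r
        where r.invoice_number = i.invoice_number + 1
      )
  -- the highest number has nothing after it, so it starts no gap
  and i.invoice_number < (select max(invoice_number) from invoices)
order by gap_start;
-- gap_start | gap_end
--         4 |       4
--         7 |       9
```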


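Full credit goes to this excellent page in the SILOTA documentation: http://www.silota.com/docs/recipes/sql-gap-analysis-missing-values-sequence.html

I strongly recommend reading it, because it explains the solution step by step.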


1 vote

I found another query:

select invoice_number + 1 as gap_start,
       invoice_number + dist_next - 1 as gap_end
from (select invoice_number,
             -- dist_next: distance from this row to the next existing invoice number
             lead(invoice_number) over w - invoice_number as dist_next
      from invoices
      window w as (order by invoice_number)) x
-- a distance greater than 1 means at least one number is missing after this row
where dist_next > 1;