How to group by consecutive records in SQL

Question  Votes: 1  Answers: 3

I have these records in my table

ID  Colour
------------
 1   Red
 2   Red
 3   Red
 4   Red
 5   Red
 6   Green
 7   Green
 8   Green
 9   Green
10   Red
11   Red
12   Red
13   Red
14   Green
15   Green
16   Green
17   Blue
18   Blue
19   Red
20   Blue
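
For anyone who wants to reproduce the queries on this page, here is a minimal setup sketch. The table and column names (`MyTable`, `ID`, `Colour`) come from the question; the column types and the primary key are assumptions, since the question does not state them.

```sql
-- Assumed schema: the question only shows the column names, not the types.
create table MyTable (
    ID     int         not null primary key,
    Colour varchar(10) not null
);

-- The 20 sample rows exactly as listed in the question.
insert into MyTable (ID, Colour) values
    (1,  'Red'),   (2,  'Red'),   (3,  'Red'),   (4,  'Red'),   (5,  'Red'),
    (6,  'Green'), (7,  'Green'), (8,  'Green'), (9,  'Green'),
    (10, 'Red'),   (11, 'Red'),   (12, 'Red'),   (13, 'Red'),
    (14, 'Green'), (15, 'Green'), (16, 'Green'),
    (17, 'Blue'),  (18, 'Blue'),
    (19, 'Red'),
    (20, 'Blue');
```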

I can easily group by colour like this

SELECT Colour, MIN(ID) AS iMin, MAX(ID) AS iMax
FROM MyTable
GROUP BY Colour

This returns the following result

Colour     iMin     iMax
-------------------------
Red        1        19
Green      6        16
Blue       17       20

But this is not what I want, because Red does not run continuously from 1 to 19; Green breaks the sequence.

The result should look like this

Colour     iMin     iMax
------------------------
Red        1        5
Green      6        9
Red        10       13
Green      14       16
Blue       17       18
Red        19       19
Blue       20       20

I managed to do this with a cursor, but I would like to know whether there is a more efficient way to do it.

sql sql-server date window-functions gaps-and-islands
3 Answers
2
votes

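This is a gaps-and-islands problem. Assuming id increments consecutively with no gaps, you can use the difference between id and row_number() to define groups of "adjacent" records that share the same colour: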

select 
    colour, 
    min(id) iMin,
    max(id) iMax
from (
    select t.*, row_number() over(partition by colour order by id) rn
    from mytable t
) t
group by colour, id - rn
order by min(id)

Demo on DB Fiddle

colour     iMin     iMax
------------------------
Red        1        5
Green      6        9
Red        10       13
Green      14       16
Blue       17       18
Red        19       19
Blue       20       20
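
The assumption that id has no gaps matters for this query. The sketch below uses a small hypothetical data set (not from the question) in which id 3 is missing, so the run of Red rows is split in two even though no other colour sits between them. The CTE name `demo` and its rows are made up for illustration.

```sql
-- Hypothetical sample: id 3 is missing, so the ids are no longer consecutive.
with demo as (
    select *
    from (values (1, 'Red'), (2, 'Red'), (4, 'Red'), (5, 'Red'), (6, 'Green')) v(id, colour)
)
select colour, min(id) as iMin, max(id) as iMax
from (
    select d.*, row_number() over (partition by colour order by id) as rn
    from demo d
) d
group by colour, id - rn
order by min(id);
-- id - rn is 0, 0, 1, 1 for the Red rows and 5 for the Green row, so this returns:
--   Red    1  2
--   Red    4  5
--   Green  6  6
```

Subtracting rn from a row number taken over the whole table, row_number() over (order by id), instead of from id itself removes the dependency on gap-free ids.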

2
votes

This is a gaps-and-islands problem. You can solve it using the difference of two row numbers:

select colour, min(id), max(id)
from (select t.*,
             row_number() over (order by id) as seqnum,
             row_number() over (partition by colour order by id) as seqnum_c
      from t
     ) t
group by colour, (seqnum - seqnum_c);

Here is a db<>fiddle.

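It is a bit hard to explain how this works. But if you look at the results of the subquery, you will see how the difference of the row numbers identifies adjacent rows with the same colour.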
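
To look at the subquery results directly, one option is to run the inner query on its own with the difference exposed as an extra column. This is only a sketch: the alias `grp` is introduced here for illustration, and the values in the comments assume the 20 sample rows from the question are loaded into `t`.

```sql
select t.*, seqnum - seqnum_c as grp
from (select t.*,
             row_number() over (order by id) as seqnum,
             row_number() over (partition by colour order by id) as seqnum_c
      from t
     ) t
order by id;
-- With the question's data, grp stays constant within each run:
--   ids  1-5   Red    seqnum  1-5    seqnum_c 1-5   grp  0
--   ids  6-9   Green  seqnum  6-9    seqnum_c 1-4   grp  5
--   ids 10-13  Red    seqnum 10-13   seqnum_c 6-9   grp  4
--   ids 14-16  Green  seqnum 14-16   seqnum_c 5-7   grp  9
--   ids 17-18  Blue   seqnum 17-18   seqnum_c 1-2   grp 16
--   id  19     Red    seqnum 19      seqnum_c 10    grp  9
--   id  20     Blue   seqnum 20      seqnum_c 3     grp 17
```

Within a run both row numbers increase by one per row, so their difference does not change; when another colour interrupts, seqnum keeps counting while seqnum_c pauses, so the difference jumps. Note that Green 14-16 and Red 19 both get grp 9, which is why the outer query groups by colour as well as by the difference.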


0
votes

This query works whether or not the id column is an integer and whether or not its values are consecutive.

;with c0 as(
select id, color,
       ROW_NUMBER() over(order by id)*
       (case when color <> LAG(color, 1, '') over(order by id) then 1 else 0 end) as color_id
from #temp
), c1 as(
select id, color, color_id, SUM(color_id) over(order by id) as color_gid
from c0
)
select color, MIN(id) as idMin, MAX(id) as idMax
from c1
group by color, color_gid

It can be generalised to order by column A, group by consecutive values of column B, and compute aggregates of column C, as follows:

;with c0 as(
select A, C, B,   -- A must be selected here because c1 orders by it
       ROW_NUMBER() over(order by A)*
       (case when B <> LAG(B, 1, '') over(order by A) then 1 else 0 end) as B_id
from TableName
), c1 as(
select C, B, B_id, SUM(B_id) over(order by A) as B_gid
from c0
)
select B, MIN(C) as CMin, MAX(C) as CMax
from c1
group by B, B_gid
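
The idea behind the queries in this answer is to flag the first row of each run and then take a running sum of the flags. A simplified sketch of that same idea, written against the question's `MyTable (ID, Colour)`, is shown below. It is a variation rather than the answer's exact query: it drops the multiplication by ROW_NUMBER() and relies on LAG returning NULL for the first row. The names `flagged`, `numbered`, `is_start` and `island_no` are made up for illustration, and LAG requires SQL Server 2012 or later.

```sql
with flagged as (
    select ID, Colour,
           -- 0 when the previous row (by ID) has the same colour, 1 otherwise.
           -- On the first row LAG returns NULL, the comparison is unknown,
           -- and the ELSE branch marks it as the start of a run.
           case when lag(Colour) over (order by ID) = Colour then 0 else 1 end as is_start
    from MyTable
), numbered as (
    select ID, Colour,
           -- Running count of run starts: every row in the same run gets the same number.
           sum(is_start) over (order by ID rows unbounded preceding) as island_no
    from flagged
)
select Colour, min(ID) as iMin, max(ID) as iMax
from numbered
group by Colour, island_no
order by min(ID);
```

With the question's data, island_no runs from 1 to 7, one value per row of the desired result. Because only the ordering of ID is used, gaps in ID or a non-integer ID do not affect the grouping.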