I have these records in my table:
ID Colour
------------
1 Red
2 Red
3 Red
4 Red
5 Red
6 Green
7 Green
8 Green
9 Green
10 Red
11 Red
12 Red
13 Red
14 Green
15 Green
16 Green
17 Blue
18 Blue
19 Red
20 Blue
I can easily group by colour like this:
SELECT Colour, MIN(ID) AS iMin, MAX(ID) AS iMax
FROM MyTable
GROUP BY Colour
This returns the following result:
Colour iMin iMax
-------------------------
Red 1 19
Green 6 16
Blue 17 20
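But this is not what I want, because Red does not run continuously from 1 to 19; Green breaks the sequence.
The result should look like this: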
Colour iMin iMax
------------------------
Red 1 5
Green 6 9
Red 10 13
Green 14 16
Blue 17 18
Red 19 19
Blue 20 20
I managed to do this with a cursor, but I would like to know whether there is a more efficient way to do it.
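This is a gaps-and-islands problem. Assuming that id increments continuously (with no gaps), you can use the difference between id and row_number() to define groups of "adjacent" records that share the same colour: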
select
colour,
min(id) iMin,
max(id) iMax
from (
select t.*, row_number() over(partition by colour order by id) rn
from mytable t
) t
group by colour, id - rn
order by min(id)
colour iMin iMax
------------------------
Red 1 5
Green 6 9
Red 10 13
Green 14 16
Blue 17 18
Red 19 19
Blue 20 20
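The effect of id - rn is easier to see when the inner query is run on its own. The following is a minimal sketch, assuming an in-memory SQLite database (3.25 or later, for window functions) driven through Python's sqlite3 module as a stand-in for the real server; the table is loaded with the sample rows from the question.

```python
import sqlite3

# Sample data from the question: ids 1..20 in order.
COLOURS = (["Red"] * 5 + ["Green"] * 4 + ["Red"] * 4 + ["Green"] * 3
           + ["Blue"] * 2 + ["Red", "Blue"])

# Assumption: SQLite stands in for the actual database engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, colour TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", enumerate(COLOURS, start=1))

# Inner query only: grp = id - rn is constant within each run of one colour.
rows = conn.execute("""
    SELECT id, colour,
           id - ROW_NUMBER() OVER (PARTITION BY colour ORDER BY id) AS grp
    FROM mytable
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

With this data, two runs of different colours can share the same grp value (Green 14 to 16 and Red 19 both get 9), which is why the outer query groups by colour as well as by id - rn.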
This is a gaps-and-islands problem. You can solve it using the difference of row numbers:
select colour, min(id), max(id)
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by colour order by id) as seqnum_c
from t
) t
group by colour, (seqnum - seqnum_c);
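Here is a db<>fiddle.
It is hard to explain how this works. However, if you look at the results of the subquery, you will see how the difference of row numbers identifies adjacent rows of the same colour.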
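As a rough illustration of what the subquery produces, the sketch below runs it through Python's sqlite3 module against an in-memory SQLite database (an assumption; any engine with window functions should behave the same way). The six rows are hypothetical and deliberately have gaps in id, to check that this variant does not depend on consecutive ids.

```python
import sqlite3

# Hypothetical rows with gaps in id (not the data from the question).
DATA = [(2, "Red"), (5, "Red"), (9, "Green"), (10, "Green"), (14, "Red"), (20, "Blue")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, colour TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", DATA)

SUBQUERY = """
    SELECT t.*,
           ROW_NUMBER() OVER (ORDER BY id) AS seqnum,
           ROW_NUMBER() OVER (PARTITION BY colour ORDER BY id) AS seqnum_c
    FROM t
"""

# Step 1: the subquery on its own, with the difference added for inspection.
detail = conn.execute(
    "SELECT id, colour, seqnum, seqnum_c, seqnum - seqnum_c AS diff "
    f"FROM ({SUBQUERY}) AS s ORDER BY id"
).fetchall()

# Step 2: the full query; each (colour, diff) pair is one island.
islands = conn.execute(
    f"SELECT colour, MIN(id), MAX(id) FROM ({SUBQUERY}) AS s "
    "GROUP BY colour, seqnum - seqnum_c ORDER BY MIN(id)"
).fetchall()

for row in detail:
    print(row)
print(islands)
```

Within a run of one colour, seqnum and seqnum_c both advance by 1 per row, so their difference stays fixed; when another colour interrupts, only seqnum keeps advancing, so the next run of the first colour gets a larger difference.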
This query works whether or not the id column is an integer and whether or not its values are consecutive.
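;with c0 as(
-- color_id is the row number on the first row of each run, and 0 on every other row
select id, color,
ROW_NUMBER() over(order by id)*
(case when color <> LAG(color, 1, '') over(order by id) then 1 else 0 end) as color_id
from #temp
), c1 as(
-- the running sum of color_id stays constant within a run, so it serves as a group id
select id, color, color_id, SUM(color_id) over(order by id) as color_gid
from c0
)
select color, MIN(id) as idMin, MAX(id) as idMax
from c1
group by color, color_gid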
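To make the two intermediate columns visible, here is a sketch that runs the same CTEs through Python's sqlite3 module on an in-memory SQLite database (an assumption; the answer itself is written for SQL Server). The table name colours is a hypothetical stand-in for #temp, and the rows are the sample data from the question.

```python
import sqlite3

# Sample data from the question: ids 1..20 in order.
COLOURS = (["Red"] * 5 + ["Green"] * 4 + ["Red"] * 4 + ["Green"] * 3
           + ["Blue"] * 2 + ["Red", "Blue"])

conn = sqlite3.connect(":memory:")
# "colours" is a hypothetical stand-in for the #temp table.
conn.execute("CREATE TABLE colours (id INTEGER PRIMARY KEY, color TEXT)")
conn.executemany("INSERT INTO colours VALUES (?, ?)", enumerate(COLOURS, start=1))

CTES = """
    WITH c0 AS (
        SELECT id, color,
               ROW_NUMBER() OVER (ORDER BY id) *
               (CASE WHEN color <> LAG(color, 1, '') OVER (ORDER BY id)
                     THEN 1 ELSE 0 END) AS color_id
        FROM colours
    ), c1 AS (
        SELECT id, color, color_id,
               SUM(color_id) OVER (ORDER BY id) AS color_gid
        FROM c0
    )
"""

# Intermediate view: color_id marks run starts, color_gid labels each run.
steps = conn.execute(
    CTES + "SELECT id, color, color_id, color_gid FROM c1 ORDER BY id"
).fetchall()

# Final grouping, ordered by the start of each run for readability.
islands = conn.execute(
    CTES + "SELECT color, MIN(id), MAX(id) FROM c1 "
           "GROUP BY color, color_gid ORDER BY MIN(id)"
).fetchall()

for row in steps:
    print(row)
print(islands)
```

Because color_id is non-zero only where the colour changes, the running sum increases exactly at run boundaries, and every row in a run carries the same color_gid.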
It can be extended to order by column A, group by consecutive values of column B, and find aggregate values of column C, as follows:
;with c0 as(
select A, C, B,
ROW_NUMBER() over(order by A)*
(case when B <> LAG(B, 1, '') over(order by A) then 1 else 0 end) as B_id
from TableName
), c1 as(
select C, B, B_id, SUM(B_id) over(order by A) as B_gid
from c0
)
select B, MIN(C) as CMin, MAX(C) as CMax
from c1
group by B, B_gid
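A small check of the generalised pattern, again as a sketch on an in-memory SQLite database through Python's sqlite3 module (an assumption). The readings table and its rows are hypothetical. Note that c0 has to expose A so that the running sum in c1 can order by it, and that LAG(B, 1, '') assumes B is a string column that never contains the empty string.

```python
import sqlite3

# Hypothetical data: A is the ordering column, B the value whose runs are
# grouped, C the column being aggregated.
ROWS = [(1, "on", 5), (2, "on", 3), (3, "off", 8),
        (4, "on", 2), (5, "on", 9), (6, "off", 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (A INTEGER, B TEXT, C INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", ROWS)

result = conn.execute("""
    WITH c0 AS (
        SELECT A, B, C,
               ROW_NUMBER() OVER (ORDER BY A) *
               (CASE WHEN B <> LAG(B, 1, '') OVER (ORDER BY A)
                     THEN 1 ELSE 0 END) AS B_id
        FROM readings
    ), c1 AS (
        SELECT A, B, C, SUM(B_id) OVER (ORDER BY A) AS B_gid
        FROM c0
    )
    SELECT B, MIN(C) AS CMin, MAX(C) AS CMax
    FROM c1
    GROUP BY B, B_gid
    ORDER BY MIN(A)
""").fetchall()

print(result)
```

The ORDER BY MIN(A) at the end is an addition for readable output; without it the order of the groups is not guaranteed.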