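I have a table in which a single user has several rows for different sessions, each with a start and an end date. The end date of one row can be the same as the start date of the next row. When that happens the session is effectively still running, so I'd like to merge those rows into a single row.
Here is an example of what I have: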
create table dates (USER_ID varchar(100),
start datetime,
end datetime);
insert into dates values ('1','2014-06-01','2014-07-01');
insert into dates values ('1','2014-07-01','2014-08-01');
insert into dates values ('1','2014-08-01','2014-09-01');
insert into dates values ('2','2014-07-01','2014-08-01');
insert into dates values ('2','2014-08-01','2014-09-01');
select * from dates;
+---------+---------------------+---------------------+
| USER_ID | start               | end                 |
+---------+---------------------+---------------------+
| 1       | 2014-06-01 00:00:00 | 2014-07-01 00:00:00 |
| 1       | 2014-07-01 00:00:00 | 2014-08-01 00:00:00 |
| 1       | 2014-08-01 00:00:00 | 2014-09-01 00:00:00 |
| 2       | 2014-07-01 00:00:00 | 2014-08-01 00:00:00 |
| 2       | 2014-08-01 00:00:00 | 2014-09-01 00:00:00 |
+---------+---------------------+---------------------+
And this is what I want:
+---------+---------------------+---------------------+
| USER_ID | start               | end                 |
+---------+---------------------+---------------------+
| 1       | 2014-06-01 00:00:00 | 2014-09-01 00:00:00 |
| 2       | 2014-07-01 00:00:00 | 2014-09-01 00:00:00 |
+---------+---------------------+---------------------+
Thanks in advance.
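Use aggregation with GROUP BY. This gives the desired result as long as each user's rows form one unbroken chain, as they do in your sample data: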
select user_id, min(start) as start, max(end) as end
from tablename
group by user_id
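Before relying on the plain min/max approach, it may be worth checking whether any user actually has a break between sessions. The query below is only a sketch: it assumes the `dates` table from the question, and the aliases `d`, `prev`, and `d2` are made up for illustration. It lists rows that have no preceding row ending exactly at their start and that are not the user's earliest row.

```sql
-- Assumes the `dates` table from the question.
-- A row is reported when no earlier session of the same user ends
-- exactly at its start, and it is not that user's first session.
select d.user_id, d.start
from dates d
left join dates prev
       on prev.user_id = d.user_id
      and prev.end = d.start
where prev.user_id is null
  and d.start > (select min(d2.start)
                 from dates d2
                 where d2.user_id = d.user_id);
```

If this returns no rows (as it should for the sample data above), grouping by user alone is enough. Any row it does return marks a point where min/max per user would silently merge across a gap.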
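This is a gaps-and-islands problem. I would suggest using a left join to work out where each island starts, and then aggregating per island: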
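-- t stands for your table (dates in the question).
-- In MySQL, window functions require version 8.0 or later.
select user_id, min(start) as start, max(end) as end
from (select t.*,
             -- tprev is null when no row ends where this one starts, so the
             -- running sum increases by 1 at the start of every island
             sum(tprev.user_id is null) over (partition by t.user_id order by t.start) as grp
      from t left join
           t tprev
           on tprev.user_id = t.user_id and
              tprev.end = t.start
     ) t
group by user_id, grp;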
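To see how the island grouping behaves when there is a real gap, here is a small sketch against the `dates` table from the question. The extra row for user 1 is hypothetical and not part of the original data, the aliases `d`, `prev`, and `s` are made up for illustration, and MySQL 8.0 or later is assumed for the window function.

```sql
-- Hypothetical extra row: user 1 comes back after a one-month gap.
insert into dates values ('1','2014-10-01','2014-11-01');

select user_id, min(start) as start, max(end) as end
from (select d.*,
             -- counts island starts seen so far for this user
             sum(prev.user_id is null) over (partition by d.user_id order by d.start) as grp
      from dates d
      left join dates prev
             on prev.user_id = d.user_id
            and prev.end = d.start
     ) s
group by user_id, grp
order by user_id, grp;
```

With that extra row, user 1 should come out as two rows (2014-06-01 to 2014-09-01, and 2014-10-01 to 2014-11-01) while user 2 stays as one row (2014-07-01 to 2014-09-01). Note that the join matches only on an exact `end = start`; sessions that overlap rather than touch would need a different condition.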