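This is a follow-up to an earlier post on this site: Finding Gaps in Timestamps for Multiple Users in PostgreSQL
I am working with a dataset containing check-in and check-out times for rooms in multiple offices over the past 5 years. One of the tasks I was asked to do is to calculate how long each room is busy and how long it is vacant over various time ranges (daily, weekly, monthly, etc.), within set working hours (7:30 a.m. to 5:00 p.m.). Unlike in my previous post, there are instances of overlapping time ranges. A sample of the dataset looks like this: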
room_id check_in check_out
"Room D" "2014-07-18 12:23:00" "2014-07-18 12:54:00"
"Room D" "2014-07-19 09:16:00" "2014-07-19 10:30:00"
"Room D" "2014-07-19 09:10:00" "2014-07-19 10:30:00"
"Room D" "2014-07-18 08:45:00" "2014-07-18 22:40:00"
"Room 5" "2014-07-19 10:20:00" "2014-07-19 12:20:00"
"Room 5" "2014-07-18 07:59:00" "2014-07-18 09:00:00"
"Room 5" "2014-07-18 09:04:00" "2014-07-18 14:00:00"
"Room 5" "2014-07-18 07:59:00" "2014-07-18 10:00:00"
In my previous post I was given a very useful code snippet which, as its author pointed out, works well for all cases without overlaps:
-- assumes check_in / check_out are stored as start_dt / end_dt, as in the earlier post
select date_trunc('day', start_dt), room_id,
       sum( least(extract(epoch from end_dt), v.epoch2) -
            greatest(extract(epoch from start_dt), v.epoch1)
          ) as busy_seconds,
       (v.epoch2 - v.epoch1 -
        sum( least(extract(epoch from end_dt), v.epoch2) -
             greatest(extract(epoch from start_dt), v.epoch1)
           )
       ) as free_seconds
from rooms r cross join lateral  -- lateral: the values list refers to r.start_dt
     (values (extract(epoch from date_trunc('day', start_dt) + interval '7 hours 30 minutes'),
              extract(epoch from date_trunc('day', start_dt) + interval '17 hours')
             )
     ) v(epoch1, epoch2)
group by date_trunc('day', start_dt), room_id, v.epoch1, v.epoch2
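To make the problem with overlaps concrete, here is a minimal Python sketch (not part of the earlier answer; the helper name `clamped_hours` is made up for illustration). It applies the same kind of per-row clamp-and-sum arithmetic to the two Room D rows for 2014-07-18 from the sample above, with one difference from the query: a row lying entirely outside working hours is floored at zero rather than going negative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def clamped_hours(check_in, check_out, day_start, day_end):
    """Hours of a single booking that fall inside the working window (never negative)."""
    start = max(datetime.strptime(check_in, FMT), day_start)
    end = min(datetime.strptime(check_out, FMT), day_end)
    return max((end - start).total_seconds(), 0) / 3600

# Working window for 2014-07-18: 07:30 to 17:00
day_start = datetime(2014, 7, 18, 7, 30)
day_end = datetime(2014, 7, 18, 17, 0)

# The two Room D rows for that day, copied from the sample data
room_d = [
    ("2014-07-18 12:23:00", "2014-07-18 12:54:00"),
    ("2014-07-18 08:45:00", "2014-07-18 22:40:00"),
]

# Summing row by row counts the overlapping half hour twice
naive_busy = sum(clamped_hours(ci, co, day_start, day_end) for ci, co in room_d)
print(round(naive_busy, 2))  # 8.77
```

The row-by-row sum comes to about 8.77 hours, although the room was occupied for only 8.25 working hours: the 12:23 to 12:54 booking lies entirely inside the 08:45 to 22:40 one and is counted twice.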
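However, after looking through our data, there are more instances of overlapping time ranges than I expected. This is the target output (in hours) that I would like to retrieve from the sample data above: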
target_day room_id busy_time free_time
2014-07-18 Room D 8.25 1.25
2014-07-19 Room D 1.33 8.17
2014-07-18 Room 5 8 1.5
2014-07-19 Room 5 2 7.5
I am still learning PostgreSQL, so this problem has me a bit stumped. Any help or guidance would be greatly appreciated!
As a query:
with r as (
      -- collapse overlapping bookings per room and day into non-overlapping ranges
      select room_id, min(start_dt) as start_dt, max(end_dt) as end_dt
      from (select r.*,
                   -- running count of rows that start after every earlier booking has ended
                   count(*) filter (where prev_end_dt < start_dt)
                       over (partition by room_id, date_trunc('day', start_dt) order by start_dt) as grp
            from (select r.*,
                         -- latest end time among the earlier bookings of the same room and day
                         max(end_dt) over (partition by room_id, date_trunc('day', start_dt)
                                           order by start_dt
                                           rows between unbounded preceding and 1 preceding) as prev_end_dt
                  from rooms r
                 ) r
           ) r
      group by room_id, date_trunc('day', start_dt), grp
     )
select date_trunc('day', start_dt) as target_day, room_id,
       -- greatest(..., 0) keeps ranges entirely outside working hours from counting negatively
       sum(greatest(least(extract(epoch from end_dt), v.epoch2) -
                    greatest(extract(epoch from start_dt), v.epoch1), 0)
          ) as busy_seconds,
       (v.epoch2 - v.epoch1 -
        sum(greatest(least(extract(epoch from end_dt), v.epoch2) -
                     greatest(extract(epoch from start_dt), v.epoch1), 0)
           )
       ) as free_seconds
from r cross join lateral  -- lateral: the values list refers to r.start_dt
     (values (extract(epoch from date_trunc('day', start_dt) + interval '7 hours 30 minutes'),
              extract(epoch from date_trunc('day', start_dt) + interval '17 hours')
             )
     ) v(epoch1, epoch2)
group by date_trunc('day', start_dt), room_id, v.epoch1, v.epoch2
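As a way to cross-check the merge-then-clamp idea outside the database, the following is a minimal Python sketch under stated assumptions. It is not the query above translated line by line: the function names `merge_spans` and `busy_free_hours` are hypothetical, bookings are assigned to the day of their check-in, and the rows are the Room D rows from the question's sample.

```python
from collections import defaultdict
from datetime import datetime, time

FMT = "%Y-%m-%d %H:%M:%S"
WORK_START, WORK_END = time(7, 30), time(17, 0)  # working hours from the question

def merge_spans(spans):
    """Merge overlapping (start, end) pairs into non-overlapping ranges."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the current range: extend its end if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Starts after every earlier end: a new range begins.
            merged.append([start, end])
    return merged

def busy_free_hours(rows):
    """Map (day, room_id) to (busy_hours, free_hours) within working hours."""
    by_key = defaultdict(list)
    for room_id, check_in, check_out in rows:
        start = datetime.strptime(check_in, FMT)
        end = datetime.strptime(check_out, FMT)
        by_key[(start.date(), room_id)].append((start, end))

    result = {}
    for (day, room_id), spans in by_key.items():
        lo = datetime.combine(day, WORK_START)
        hi = datetime.combine(day, WORK_END)
        # Clamp each merged range to the working window; floor at zero.
        busy = sum(
            max((min(end, hi) - max(start, lo)).total_seconds(), 0)
            for start, end in merge_spans(spans)
        )
        total = (hi - lo).total_seconds()
        result[(day, room_id)] = (busy / 3600, (total - busy) / 3600)
    return result

# Room D rows copied from the sample data in the question
rows = [
    ("Room D", "2014-07-18 12:23:00", "2014-07-18 12:54:00"),
    ("Room D", "2014-07-19 09:16:00", "2014-07-19 10:30:00"),
    ("Room D", "2014-07-19 09:10:00", "2014-07-19 10:30:00"),
    ("Room D", "2014-07-18 08:45:00", "2014-07-18 22:40:00"),
]

for (day, room_id), (busy, free) in sorted(busy_free_hours(rows).items()):
    print(day, room_id, round(busy, 2), round(free, 2))
# 2014-07-18 Room D 8.25 1.25
# 2014-07-19 Room D 1.33 8.17
```

Merging first and clamping afterwards means each minute of occupancy is counted once, however many bookings cover it. The condition for starting a new range (a start later than every earlier end) plays the same role as the `prev_end_dt < start_dt` test in the query.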