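A user submits an interval, a start page and an end page, that he/she read in a specific book. Note that a user can submit multiple intervals for the same book. I need a query that reports the five most recommended books in the system, picked by the number of unique pages read across all users who submitted intervals in the first operation (sorted from the book with the most pages read to the book with the fewest).
book_user is the pivot table I need to query, so how can I get the following results for the inserted records?
Reading intervals: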
User 1 read from page 10 to page 30 in Book 1
User 2 read from page 2 to page 25 in Book 1
User 1 read from page 40 to page 50 in Book 2
User 3 read from page 1 to page 10 in Book 2
The most read books results:
Book 1 -> 28 pages
Book 2 -> 20 pages
I tried this query:
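-- book_id was wrapped in single quotes, which selects the string literal 'book_id' rather than the column
SELECT book_user.book_id, books.name AS book_name, SUM(end_page - start_page) AS num_of_read_pages
FROM book_user
JOIN books ON books.id = book_user.book_id
GROUP BY book_user.book_id, books.name
ORDER BY num_of_read_pages DESC;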
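But it does not count unique pages when intervals overlap: pages covered by more than one interval are counted more than once.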
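To make the double counting concrete, here is a minimal Python sketch that works on the four sample intervals from the question, outside the database. The function names `naive_pages` and `merged_pages` are illustrative, not part of any schema. It uses the same `end_page - start_page` convention as the attempted query; an inclusive count would add 1 per merged range.

```python
from collections import defaultdict

# Sample rows from the question: (user_id, book_id, start_page, end_page).
intervals = [(1, 1, 10, 30), (2, 1, 2, 25), (1, 2, 40, 50), (3, 2, 1, 10)]

def naive_pages(rows):
    """Mimic SUM(end_page - start_page) per book: overlaps are counted repeatedly."""
    totals = defaultdict(int)
    for _user, book, start, end in rows:
        totals[book] += end - start
    return dict(totals)

def merged_pages(rows):
    """Merge overlapping intervals per book first, then sum end - start."""
    by_book = defaultdict(list)
    for _user, book, start, end in rows:
        by_book[book].append((start, end))
    totals = {}
    for book, spans in by_book.items():
        spans.sort()
        total = 0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlaps the current range: extend it instead of adding again.
                cur_end = max(cur_end, end)
            else:
                # Gap found: close the current range and start a new one.
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        totals[book] = total + (cur_end - cur_start)
    return totals

print(naive_pages(intervals))   # {1: 43, 2: 19}
print(merged_pages(intervals))  # {1: 28, 2: 19}
```

For Book 1 the naive sum gives 43, while merging 2-25 and 10-30 into 2-30 gives the expected 28. Book 2 has no overlap, so both approaches agree on 19 under this convention; the 20 quoted above presumably counts pages slightly differently.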
When I asked ChatGPT, it gave me this recursive CTE query, but it does not work; it just loops forever:
WITH RECURSIVE cte AS (
SELECT book_id, MIN(start_page) AS start_page, MAX(end_page) AS end_page
FROM book_user
GROUP BY book_id, start_page
UNION ALL
SELECT cte.book_id, cte.start_page, cte.end_page
FROM cte
JOIN book_user ON cte.book_id = book_user.book_id AND cte.start_page <= book_user.start_page AND cte.end_page >= book_user.end_page
)
SELECT book_id, SUM(end_page - start_page + 1) AS total_pages
FROM cte
GROUP BY book_id
ORDER BY total_pages DESC;
See this example:
with recursive t as(
-- join ranges with same start_page
-- and row_number() for sequence join
select book_id,start_page,max(end_page)end_page
,row_number()over(partition by book_id order by start_page) rn
from book_user
group by book_id,start_page
)
,r as( -- recursive join
-- anchor - ranges with a "free" start_page
select 0 lvl,bu.book_id,bu.start_page,bu.end_page,bu.rn
from t bu
where not exists(select 1 from t bu2
where bu2.book_id=bu.book_id and bu2.rn<bu.rn
and bu.start_page between bu2.start_page and bu2.end_page)
union all
select lvl+1,r.book_id,r.start_page,t.end_page,t.rn
from r inner join t on t.book_id=r.book_id and t.rn>r.rn
and r.end_page between t.start_page and t.end_page -- join only ranges that overlap and extend the current one
)
select book_id,sum(end_page-start_page+1) total_pages
from ( -- again, we group segments with the same start_page and different end_page
select book_id,start_page,max(end_page) end_page
from r
group by book_id,start_page
) gr
group by book_id
Details are in the demo.
Output
book_id | total_pages | path |
---|---|---|
1 | 38 | 2-2:25,26,2:25,26:31,33-33:40 |
2 | 25 | 1-1:10,1:10:11,1:10:11:14,40-40:50 |
3 | 31 | 1-1:10,20-20:40 |
Test data
create table books(id int,book_name varchar(20));
insert into books values(1,'Book 1'),(2,'Book 2'),(3,'Book 3');
create table users(id int,user_name varchar(20));
insert into users values(10,'User 10'),(20,'User 20'),(30,'User 30'),(40,'User 40');
create table book_user(user_id int,book_id int,start_page int,end_page int);
insert into book_user values
(10,1, 10,30)
,(20,1, 2,25)
,(30,1, 2,26)
,(30,1, 10,31)
,(40,1, 33,40)
,(10,2, 40,50)
,(30,2, 1,10)
,(40,2, 10,11)
,(20,2, 11,14)
,(10,3, 1,10)
,(20,3, 20,40)
;
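As a cross-check, the sketch below loads the book_user test rows into an in-memory SQLite database from Python and runs a reformatted variant of the recursive approach, in which each recursive step joins only a range that overlaps and extends the current one. Several things here are assumptions rather than part of the thread: SQLite stands in for the real database (window functions need SQLite 3.25 or later), and the trailing `order by total_pages desc limit 5` is added to address the "top five" requirement from the question.

```python
import sqlite3

# In-memory SQLite is an assumption made for a quick, self-contained check.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table book_user(user_id int, book_id int, start_page int, end_page int);
insert into book_user values
 (10,1,10,30),(20,1,2,25),(30,1,2,26),(30,1,10,31),(40,1,33,40),
 (10,2,40,50),(30,2,1,10),(40,2,10,11),(20,2,11,14),
 (10,3,1,10),(20,3,20,40);
""")

QUERY = """
with recursive t as (
  -- one row per (book_id, start_page), numbered by start_page
  select book_id, start_page, max(end_page) as end_page,
         row_number() over (partition by book_id order by start_page) as rn
  from book_user
  group by book_id, start_page
),
r as (
  -- anchor: ranges whose start_page is not covered by an earlier range
  select bu.book_id, bu.start_page, bu.end_page, bu.rn
  from t bu
  where not exists (
    select 1 from t bu2
    where bu2.book_id = bu.book_id and bu2.rn < bu.rn
      and bu.start_page between bu2.start_page and bu2.end_page)
  union all
  -- recursive step: extend a range with a later range that overlaps its end
  select r.book_id, r.start_page, t.end_page, t.rn
  from r join t on t.book_id = r.book_id and t.rn > r.rn
    and r.end_page between t.start_page and t.end_page
)
select book_id, sum(end_page - start_page + 1) as total_pages
from (select book_id, start_page, max(end_page) as end_page
      from r group by book_id, start_page) gr
group by book_id
order by total_pages desc
limit 5
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 38), (3, 31), (2, 25)]
```

The totals match the output table: Book 1 merges into 2-31 and 33-40 (30 + 8 pages), Book 2 into 1-14 and 40-50 (14 + 11), and Book 3 keeps 1-10 and 20-40 (10 + 21). The recursion terminates because `rn` strictly increases at every step.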