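I'm trying to do #8 in this problem set from sqlzoo (https://sqlzoo.net/wiki/Window_LAG#LAG_using_a_JOIN).
The question is: "For each country that has had at least 1000 new cases in a single day, show the date of the peak number of new cases."
The covid table gives the number of covid cases, deaths, and recoveries per day for each country, like so: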
+-------------+-------------------------------+-----------+--------+-----------+
| Name | whn | confirmed | deaths | recovered |
+-------------+-------------------------------+-----------+--------+-----------+
| Afghanistan | Sun, 01 Mar 2020 00:00:00 GMT | 1 | 0 | 0 |
| Albania | Sun, 01 Mar 2020 00:00:00 GMT | 0 | 0 | 0 |
| Algeria | Sun, 01 Mar 2020 00:00:00 GMT | 1 | 0 | 0 |
+-------------+-------------------------------+-----------+--------+-----------+
Currently I have this code:
SELECT c.name, DATE_FORMAT(c.whn,'%Y-%m-%d') as this, d.peak
from ( select tw.name, max(tw.confirmed-lw.confirmed) as peak
FROM covid tw LEFT JOIN covid lw ON
DATE_ADD(lw.whn, INTERVAL 1 DAY) = tw.whn
AND tw.name=lw.name
where tw.confirmed-lw.confirmed > 1000
group by tw.name) d
join covid as c
on d.name = c.name
group by name
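This gives me each country, a date, and the peak number of new cases. However, the date shown is the first day on which each country exceeded 1000 new cases. How can I get the date of the peak itself?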
| Name | this | peak |
|---------|------------|------|
| Austria | 2020-03-26 | 1321 |
| Belarus | 2020-04-20 | 1485 |
| Belgium | 2020-03-26 | 2454 |
You can compute the new cases by comparing the confirmed column across consecutive days; lag() comes in handy for this.
select
t.*,
confirmed - lag(confirmed, 1, 0) over(partition by name order by whn) new_cases
from mytable t
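To make the lag() step concrete, here is a minimal sketch on a throwaway table. The table name covid_demo, the country name, and the numbers are all made up for illustration (they are not rows from the real covid table), and the syntax assumes MySQL 8+ or MariaDB 10.2+, where window functions are available.

```sql
-- Hypothetical demo table and rows, invented for this example only.
CREATE TABLE covid_demo (
  name      VARCHAR(50),
  whn       DATE,
  confirmed INT
);

INSERT INTO covid_demo (name, whn, confirmed) VALUES
  ('Exampleland', '2020-03-01',  100),
  ('Exampleland', '2020-03-02',  350),
  ('Exampleland', '2020-03-03', 1500);

-- confirmed is a running total, so new cases for a day are
-- that day's total minus the previous day's total.
-- The third argument of LAG (0) is the default used on the first
-- row of each partition, where there is no previous day.
SELECT
  name,
  whn,
  confirmed,
  confirmed - LAG(confirmed, 1, 0) OVER (PARTITION BY name ORDER BY whn) AS new_cases
FROM covid_demo
ORDER BY name, whn;
```

With these rows, new_cases comes out as 100, 250, and 1150. Note that the default of 0 makes the first day's new_cases equal to its whole running total; if a country's data starts with a large total already, dropping the default (so the first row is NULL) may be the safer choice.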
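The lag() approach assumes one record per country per day. You can then rank each country's records by new cases and filter on the top-ranked day per country.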
select *
from (
select
t.*,
rank() over(partition by name order by new_cases desc) rn
from (
select
t.*,
confirmed - lag(confirmed, 1, 0) over(partition by name order by whn) new_cases
from mytable t
) t
where new_cases > 1000
) t
where rn = 1
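Since the exercise is about LAG using a JOIN, the same result can probably be reached by keeping the self-join from the question. The original query never ties c.whn to the day on which the peak occurred, so the date it returns is unrelated to the peak. A minimal sketch of one way to fix that, assuming MySQL 8+ or MariaDB 10.2+ (for WITH) and the covid table from the question; the names daily, peaks, and peak_date are introduced here for illustration:

```sql
WITH daily AS (
  -- New cases per country per day, using the self-join from the question:
  -- lw is the row for the day before tw.
  SELECT tw.name,
         tw.whn,
         tw.confirmed - lw.confirmed AS new_cases
  FROM covid tw
  JOIN covid lw
    ON lw.name = tw.name
   AND DATE_ADD(lw.whn, INTERVAL 1 DAY) = tw.whn
),
peaks AS (
  -- Highest single-day increase per country, keeping only countries
  -- above the threshold used in the queries above (> 1000;
  -- switch to >= if "at least 1000" is what is intended).
  SELECT name, MAX(new_cases) AS peak
  FROM daily
  GROUP BY name
  HAVING MAX(new_cases) > 1000
)
-- Join back to the daily rows to recover the date of the peak.
SELECT d.name,
       DATE_FORMAT(d.whn, '%Y-%m-%d') AS peak_date,
       p.peak
FROM daily d
JOIN peaks p
  ON p.name = d.name
 AND p.peak = d.new_cases
ORDER BY d.name;
```

Like the rank() version, this returns more than one row for a country whose peak value occurs on several days. Unlike the lag(confirmed, 1, 0) version, the inner join skips each country's first day, because that day has no previous row to compare against.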