Given a table,
NAME DATE_OCCURRED
a 2020-05-14 23:48.07
a 2020-05-14 23:48.07
a 2020-05-14 23:48.08
b 2020-05-14 23:48.08
b 2020-05-14 23:48.08
b 2020-05-14 23:48.08
b 2020-05-14 23:48.09
I want to return, for each NAME, the time at which the most records exist, and how many records exist at that time.
NAME MAXCOUNT_PER_SECOND DATE_OCCURRED
a 2 2020-05-14 23:48.07
b 3 2020-05-14 23:48.08
I have already worked out the SQL statement that groups by DATE_OCCURRED and NAME.
SELECT COUNT(*) AS COUNT_PER_SECOND, NAME, DATE_OCCURRED FROM TABLE GROUP BY NAME, DATE_OCCURRED ORDER BY NAME ASC, COUNT_PER_SECOND DESC
But now I want to select from that result again, keeping only the maximum for each name. I have tried this:
SELECT MAX(COUNT_PER_SECOND) AS MAXCOUNT_PER_SECOND, NAME FROM (the above query) GROUP BY NAME;
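This gives me MAXCOUNT_PER_SECOND and NAME. But as soon as I also try to get the DATE_OCCURRED that produced the MAXCOUNT_PER_SECOND value, I either get a grouping error when running the SQL, or I don't get the result I expect.
For example: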
SELECT MAX(COUNT_PER_SECOND) AS MAXCOUNT_PER_SECOND, NAME, DATE_OCCURRED FROM (the above query) GROUP BY NAME;
-> Not a GROUP BY expression
SELECT MAX(COUNT_PER_SECOND) AS MAXCOUNT_PER_SECOND, NAME, DATE_OCCURRED FROM (the above query) GROUP BY NAME, DATE_OCCURRED ;
-> Runs, but gives me results for all dates, not just the date with the maximum count.
You can use aggregation and window functions.
select name, date_occurred, no_records
from (
    select
        name,
        date_occurred,
        count(*) no_records,
        rank() over(partition by name order by count(*) desc) rn
    from mytable
    group by name, date_occurred
) t
where rn = 1
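The subquery aggregates by name and date_occurred, counts the records in each group, and ranks the groups that share the same name by descending count. The outer query then keeps only the top-ranked group for each name. Since we use rank(), possible top ties will be included in the result set (if you don't want that, use row_number() instead).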
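To check the query against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's standard sqlite3 module. SQLite is only a convenient stand-in here (the asker's database is not stated), and it needs SQLite 3.25 or later for window functions. The helper name top_seconds and the extra tied row are hypothetical, added only to illustrate the difference between rank() and row_number().

```python
import sqlite3

# Sample rows from the question; the table name `mytable` follows the answer's query.
ROWS = [
    ("a", "2020-05-14 23:48.07"),
    ("a", "2020-05-14 23:48.07"),
    ("a", "2020-05-14 23:48.08"),
    ("b", "2020-05-14 23:48.08"),
    ("b", "2020-05-14 23:48.08"),
    ("b", "2020-05-14 23:48.08"),
    ("b", "2020-05-14 23:48.09"),
]

# {fn} is filled in with either rank() or row_number().
QUERY = """
select name, date_occurred, no_records
from (
    select
        name,
        date_occurred,
        count(*) as no_records,
        {fn} over (partition by name order by count(*) desc) as rn
    from mytable
    group by name, date_occurred
) t
where rn = 1
order by name, date_occurred
"""


def top_seconds(rows, fn="rank()"):
    """Hypothetical helper: busiest second(s) per name as (name, date_occurred, no_records)."""
    if fn not in ("rank()", "row_number()"):
        raise ValueError("fn must be 'rank()' or 'row_number()'")
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table mytable (name text, date_occurred text)")
        conn.executemany("insert into mytable values (?, ?)", rows)
        return conn.execute(QUERY.format(fn=fn)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Matches the desired output in the question: a -> 2 records, b -> 3 records.
    for row in top_seconds(ROWS):
        print(row)

    # Illustrative tie (not in the original data): "a" now has two seconds with 2 records each.
    tied = ROWS + [("a", "2020-05-14 23:48.08")]
    print(len(top_seconds(tied, "rank()")))        # both tied rows for "a" are kept
    print(len(top_seconds(tied, "row_number()")))  # exactly one row per name
```

With row_number(), which of the tied rows survives is not determined by this query; adding a tiebreaker such as date_occurred to the window's order by would make the choice deterministic.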