How to get the top N rows under certain conditions


I have a query like this:

SELECT product_id,
       site,
       category_id,
       session_time,
       sum(cast(coalesce("#clicks", 0) AS bigint)) AS clicks
FROM df
WHERE site IN ('com', 'co')
  AND session_time = DATE('2020-02-27')
GROUP BY product_id, site, session_time, category_id
ORDER BY clicks DESC
LIMIT 10

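But now I want to see the top 10 product_id values for each site and category_id, ranked by clicks. When I add LIMIT, it only returns the top 10 products overall; it does not group them by category_id and shop_id.

How can I do this?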

sql group-by sum greatest-n-per-group amazon-athena
1 Answer
Use window functions. You can RANK() the records by descending clicks within each site / category_id partition in a subquery, and then filter in the outer query:

SELECT *
FROM (
    SELECT product_id,
           site,
           category_id,
           session_time,
           SUM("#clicks") AS clicks,
           RANK() OVER (PARTITION BY site, category_id ORDER BY SUM("#clicks") DESC) AS rn
    FROM df
    WHERE site IN ('com', 'co')
      AND session_time = DATE('2020-02-27')
    GROUP BY product_id, site, session_time, category_id
) t
WHERE rn <= 10
ORDER BY site, category_id, clicks DESC
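To see how the per-partition ranking behaves, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. It is not Athena: the table contents are made up, TOP_N is set to 2 to keep the output short, and the aggregation is done in an inner subquery before ranking (an assumption made for portability; it should be equivalent to ranking by SUM("#clicks") directly). It needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Toy stand-in for the `df` table; every row here is made-up illustration data.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE df (product_id TEXT, site TEXT, category_id TEXT, "#clicks" INTEGER)'
)
rows = [
    ("p1", "com", "c1", 5),
    ("p1", "com", "c1", 4),  # p1 totals 9 clicks in com/c1
    ("p2", "com", "c1", 7),
    ("p3", "com", "c1", 7),  # ties with p2
    ("p4", "com", "c1", 1),
    ("p5", "co", "c1", 3),
    ("p6", "co", "c1", 8),
]
conn.executemany("INSERT INTO df VALUES (?, ?, ?, ?)", rows)

TOP_N = 2  # hypothetical cut-off, kept small for the demo

query = """
SELECT product_id, site, category_id, clicks, rn
FROM (
    -- rank the aggregated rows inside each site/category_id partition
    SELECT product_id, site, category_id, clicks,
           RANK() OVER (PARTITION BY site, category_id
                        ORDER BY clicks DESC) AS rn
    FROM (
        -- one row per product within each site/category_id
        SELECT product_id, site, category_id, SUM("#clicks") AS clicks
        FROM df
        GROUP BY product_id, site, category_id
    ) agg
) ranked
WHERE rn <= ?
ORDER BY site, category_id, clicks DESC, product_id
"""

top_rows = conn.execute(query, (TOP_N,)).fetchall()
for row in top_rows:
    print(row)
```

Note that RANK() gives tied rows the same rank, so a partition can return more than N rows (here p2 and p3 both get rank 2). If exactly N rows per partition are required, ROW_NUMBER() can be used instead, at the cost of breaking ties arbitrarily unless the ORDER BY includes a tie-breaker column.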

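I'm not clear on why you need the coalesce() / cast() logic inside sum() (like other aggregate functions, sum() ignores null values, and #clicks already appears to be a number), so I removed it. Feel free to add it back if it is needed for some reason I could not think of.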
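One case where the coalesce() does change the result is a group in which every #clicks value is null: sum() then returns null rather than 0. The sketch below illustrates this with an in-memory SQLite database and made-up rows; it is an illustration of standard SQL aggregate behaviour, not a test against Athena. Whether the cast() matters depends on how #clicks is actually stored, which the question does not say.

```python
import sqlite3

# Made-up data: p1 has one real value and one NULL, p2 has only NULLs.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE df (product_id TEXT, "#clicks" INTEGER)')
conn.executemany(
    "INSERT INTO df VALUES (?, ?)",
    [("p1", 3), ("p1", None), ("p2", None)],
)

totals = conn.execute(
    """
    SELECT product_id,
           SUM("#clicks")              AS plain_sum,      -- NULLs are skipped
           SUM(COALESCE("#clicks", 0)) AS coalesced_sum   -- NULLs count as 0
    FROM df
    GROUP BY product_id
    ORDER BY product_id
    """
).fetchall()

for row in totals:
    print(row)
```

For p1 both columns agree, since the null is simply skipped. For p2 the plain sum is null while the coalesced sum is 0, which can matter if the result is later sorted or compared.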