Find the user with the most ratings in each country

Question    Votes: 0    Answers: 2

I have the following tables modeling a book database:

CREATE TABLE Country (
    ISO_3166 CHAR(2) PRIMARY KEY,
    CountryName VARCHAR(256),
    CID varchar(16)
);
CREATE TABLE Users (
    UID INT PRIMARY KEY,
    Username VARCHAR(256),
    DoB DATE,
    Age INT,
    ISO_3166 CHAR(2) REFERENCES Country (ISO_3166)
);
CREATE TABLE Book (
    ISBN VARCHAR(17) PRIMARY KEY,
    Title VARCHAR(256),
    Published DATE,
    Pages INT,
    Language VARCHAR(256)
);
CREATE TABLE Rating (
    UID INT REFERENCES Users (UID),
    ISBN VARCHAR(17) REFERENCES Book (ISBN),
    PRIMARY KEY (UID,ISBN),
    Rating int
);

I now want to find the user with the most ratings in each country. I can use this query:

SELECT Country.CountryName as CountryName, Users.Username as Username, COUNT(Rating.Rating) as NumRatings
FROM Country
JOIN Users ON Users.ISO_3166 = Country.ISO_3166 
JOIN Rating ON Users.UID = Rating.UID
GROUP BY Country.CID, CountryName, Username
ORDER BY CountryName ASC

It returns the number of ratings for every user in the following format:

 Countryname | Username | Number of Ratings of this user

I also managed to write the following query, which gives me one user per country, but not the one with the most ratings:

SELECT DISTINCT ON (CountryName)
        CountryName, Username, MAX(NumRatings)
FROM (
    SELECT Country.CountryName as CountryName, Users.Username as Username, COUNT(Rating.Rating) as NumRatings
        FROM Country
        JOIN Users ON Users.ISO_3166 = Country.ISO_3166 
        JOIN Rating ON Users.UID = Rating.UID
        GROUP BY Country.CID, CountryName, Username
        ORDER BY CountryName ASC) AS MyTable
GROUP BY CountryName, Username, NumRatings 
ORDER BY CountryName ASC;

But how do I write a query that selects the user with the most ratings in each country?

sql postgresql greatest-n-per-group
2 Answers
1
vote

You were so close:

SELECT DISTINCT ON (CountryName)
        CountryName, Username, NumRatings
FROM(
    SELECT Country.CountryName as CountryName, Users.Username as Username, COUNT(Rating.Rating) as NumRatings
        FROM Country
        JOIN Users ON Users.ISO_3166 = Country.ISO_3166 
        JOIN Rating ON Users.UID = Rating.UID
        GROUP BY Country.CID, CountryName, Username
        ORDER BY CountryName ASC) AS MyTable
WHERE TRUE --no filtering needed 
ORDER BY CountryName ASC, NumRatings DESC

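When several rows share the same DISTINCT ON value, Postgres keeps the first one according to ORDER BY, so the sort order decides which row is returned. The leading ORDER BY expressions must match the DISTINCT ON expressions; anything after them acts as the tie-breaker. Here, sorting by NumRatings in descending order gives you, for each country, the values from the row with the highest NumRatings.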
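A minimal sketch of that behavior, using made-up inline rows rather than the tables from the question (the names and counts are purely illustrative). The extra username sort key is an assumption added here so that ties are broken deterministically:

```sql
-- Hypothetical sample data, not taken from the question.
SELECT DISTINCT ON (country)
       country, username, numratings
FROM  (VALUES
         ('France', 'alice', 3),
         ('France', 'bob',   7),
         ('Spain',  'carol', 2)
      ) AS t (country, username, numratings)
-- country must come first because it is the DISTINCT ON expression;
-- numratings DESC puts the highest count first within each country.
ORDER BY country, numratings DESC, username;

-- Result:
--  France | bob   | 7
--  Spain  | carol | 2
```

Note that DISTINCT ON always returns exactly one row per country, so if two users tie for the top count, only one of them is shown.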


0
votes

DISTINCT ON is a nice, simple way to get one user (as the word "distinct" implies) with the most ratings per country.

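But you want to ...

find the user with the most ratings in each country.

That is the highest rating count per country, which several users may share.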
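I would aggregate the ratings first and then join to the users table, in a CTE. Then use WITH TIES in a LATERAL subquery to select the one or more winners for each country:
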
WITH agg AS (
   SELECT u.iso_3166, u.uid, u.username, r.numratings
   FROM  (
      SELECT uid, count(*) AS numratings
      FROM   rating r
      GROUP  BY 1
      ) r
   JOIN   users u USING (uid)
   )
SELECT c.countryname, a.username, a.numratings
FROM   country c
LEFT   JOIN LATERAL (
   SELECT *
   FROM   agg a
   WHERE  a.iso_3166 = c.iso_3166
   ORDER  BY a.numratings DESC
   FETCH  FIRST 1 ROWS WITH TIES  -- !
   ) a ON true;
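WITH TIES is only available in PostgreSQL 13 or later. On older versions, a window function can produce a similar result. The following is a sketch of that swapped-in technique, not part of the answer's query; the alias names ranked and rnk are made up for illustration:

```sql
-- Alternative sketch using rank() instead of FETCH FIRST ... WITH TIES.
SELECT countryname, username, numratings
FROM  (
   SELECT c.countryname, u.username, r.numratings,
          -- rank() gives 1 to every user tied for the top count in a country
          rank() OVER (PARTITION BY c.iso_3166
                       ORDER BY r.numratings DESC) AS rnk
   FROM  (
      SELECT uid, count(*) AS numratings
      FROM   rating
      GROUP  BY uid
      ) r
   JOIN   users   u ON u.uid = r.uid
   JOIN   country c ON c.iso_3166 = u.iso_3166
   ) ranked
WHERE  rnk = 1
ORDER  BY countryname;
```

Unlike the LEFT JOIN LATERAL version, this sketch uses inner joins, so countries without any rated users do not appear in the result.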

Related topics: "aggregate first, join later", WITH TIES (requires PostgreSQL 13 or later), and LATERAL.

Notably, you do not want GROUP BY Country.CID. country.ISO_3166 is the PK, so use that instead. (I optimized the query so that country is not needed in GROUP BY at all.)
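To illustrate the point about grouping by the primary key, here is a sketch of how the per-user count from the question could be written with PK columns only. It relies on PostgreSQL (9.1 or later) accepting other columns of a table in the SELECT list once that table's primary key is in GROUP BY. Grouping by users.uid instead of Username is an assumption made here so that two users with the same name are not merged:

```sql
-- Per-user rating counts, grouped by primary keys only.
SELECT c.countryname, u.username, count(*) AS numratings
FROM   country c
JOIN   users   u ON u.iso_3166 = c.iso_3166
JOIN   rating  r ON r.uid = u.uid
-- countryname depends on c.iso_3166 and username on u.uid,
-- so neither has to be listed in GROUP BY.
GROUP  BY c.iso_3166, u.uid
ORDER  BY c.countryname, numratings DESC;
```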
