Understanding multiple GROUPING SETS


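I have a good understanding of how GROUPING SETS (as well as ROLLUP and CUBE) work for a single expression. However, I have never fully understood how they work when multiple GROUPING SETS are combined in one GROUP BY. Here is a sample table I created to help work through this: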

CREATE TABLE movies AS (
    SELECT 'Black Widow' Movie, 'Disney' Studio, 2021 AS Year, 226583885 Revenue UNION ALL
    SELECT 'Black Widow', 'Disney', 2022, 126583885 UNION ALL
    SELECT 'Black Widow', 'Disney', 2023, 26583885 UNION ALL
    SELECT 'Spider-man: No Way Home', 'Sony', 2021, 740615703 UNION ALL
    SELECT 'Spider-man: No Way Home', 'Sony', 2022, 640615703 UNION ALL
    SELECT 'Spider-man: No Way Home', 'Sony', 2023, 540615703 UNION ALL
    SELECT 'Top Gun: Maverick', 'Paramount', 2022, 847848146 UNION ALL
    SELECT 'Top Gun: Maverick', 'Paramount', 2023, 647848146 UNION ALL
    SELECT 'The Batman', 'Warner Bros.', 2022, 486122791 UNION ALL
    SELECT 'The Batman', 'Warner Bros.', 2023, 286122791 UNION ALL
    SELECT 'Barbie', 'Warner Bros.', 2023, 1441769400 UNION ALL
    SELECT 'Oppenheimer', 'NBCUniversal', 2023, 950205530
)

If I repeat an element in the GROUP BY that is not part of a GROUPING SETS|ROLLUP|CUBE, it seems to have no effect. For example:

select row_number() over () num, studio, sum(revenue) from movies group by studio;
┌──────────────────────┬──────────────┬──────────────┐
│ num                  ┆ Studio       ┆ sum(revenue) │
╞══════════════════════╪══════════════╪══════════════╡
│                    1 ┆ Disney       ┆    379751655 │
│                    2 ┆ Sony         ┆   1921847109 │
│                    3 ┆ Paramount    ┆   1495696292 │
│                    4 ┆ Warner Bros. ┆   2214014982 │
│                    5 ┆ NBCUniversal ┆    950205530 │
└──────────────────────┴──────────────┴──────────────┘

is equivalent to:

select row_number() over () num, studio, sum(revenue) from movies group by studio, studio, studio;
┌──────────────────────┬──────────────┬──────────────┐
│ num                  ┆ Studio       ┆ sum(revenue) │
╞══════════════════════╪══════════════╪══════════════╡
│                    1 ┆ Disney       ┆    379751655 │
│                    2 ┆ Sony         ┆   1921847109 │
│                    3 ┆ Paramount    ┆   1495696292 │
│                    4 ┆ Warner Bros. ┆   2214014982 │
│                    5 ┆ NBCUniversal ┆    950205530 │
└──────────────────────┴──────────────┴──────────────┘

That seems simple enough. However, once I add GROUPING SETS, repeating an element does change the result:

select row_number() over () num, studio, sum(revenue) from movies 
group by grouping sets(studio, ());
┌─────┬──────────────┬──────────────┐
│ num ┆ Studio       ┆ sum(revenue) │
╞═════╪══════════════╪══════════════╡
│   1 ┆ Disney       ┆    379751655 │
│   2 ┆ Sony         ┆   1921847109 │
│   3 ┆ Paramount    ┆   1495696292 │
│   4 ┆ Warner Bros. ┆   2214014982 │
│   5 ┆ NBCUniversal ┆    950205530 │
│   6 ┆              ┆   6961515568 │
└─────┴──────────────┴──────────────┘
Elapsed: 3 ms

select row_number() over () num, studio, sum(revenue) from movies 
group by grouping sets(studio, ()), grouping sets(studio, ());
┌─────┬──────────────┬──────────────┐
│ num ┆ Studio       ┆ sum(revenue) │
╞═════╪══════════════╪══════════════╡
│   1 ┆ Disney       ┆    379751655 │
│   2 ┆ Sony         ┆   1921847109 │
│   3 ┆ Paramount    ┆   1495696292 │
│   4 ┆ Warner Bros. ┆   2214014982 │
│   5 ┆ NBCUniversal ┆    950205530 │
│   6 ┆ Disney       ┆    379751655 │
│   7 ┆ Sony         ┆   1921847109 │
│   8 ┆ Paramount    ┆   1495696292 │
│   9 ┆ Warner Bros. ┆   2214014982 │
│  10 ┆ NBCUniversal ┆    950205530 │
│  11 ┆ Disney       ┆    379751655 │
│  12 ┆ Sony         ┆   1921847109 │
│  13 ┆ Paramount    ┆   1495696292 │
│  14 ┆ Warner Bros. ┆   2214014982 │
│  15 ┆ NBCUniversal ┆    950205530 │
│  16 ┆              ┆   6961515568 │
└─────┴──────────────┴──────────────┘
Elapsed: 2 ms

So how is a GROUP BY with multiple GROUPING SETS items evaluated? For example, if the UNION ALL equivalent of GROUP BY GROUPING SETS(studio, ()) is:

                                                                        -- GROUPING SETS(
SELECT studio, SUM(revenue) FROM movies GROUP BY studio UNION ALL       --   studio,
SELECT NULL  , SUM(revenue) FROM movies                                 --   ()
                                                                        -- )

then what is the equivalent UNION ALL when the GROUP BY contains multiple GROUPING SETS items?


Note: the queries above were tested in Postgres (https://www.db-fiddle.com/f/hvrNtMSrAe7UMz6dTn33Y9/0) and in SQL Server, where I hit an arithmetic overflow and so modified the numbers a little (https://dbfiddle.uk/LW3fXIIo).

sql sql-server postgresql aggregate-functions
1 Answer

The first set groups by studio; the second set, (), does no grouping at all:

select row_number() over () num, studio, sum(revenue) 
from movies 
group by grouping sets(
  (studio)
  , ()
);
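
For the case with several GROUPING SETS items, the PostgreSQL documentation says that multiple grouping items in a single GROUP BY are combined as a cross product of their grouping sets. The 16 rows in the question (three copies of the five studio rows plus one grand total) are consistent with that rule in both engines tested. The following is a sketch of the expansion under that assumption, using the movies table from the question. It is not taken from the original answer, and the trailing comments only label which grouping set each branch stands for.

```sql
-- Cross product of the two grouping items:
--   GROUPING SETS(studio, ()) x GROUPING SETS(studio, ())
--   = (studio, studio), (studio), (studio), ()
-- (studio, studio) groups exactly like (studio), so this should match:
SELECT studio, SUM(revenue)
FROM movies
GROUP BY GROUPING SETS ((studio), (studio), (studio), ());

-- The same thing as UNION ALL, one branch per grouping set:
SELECT studio, SUM(revenue) FROM movies GROUP BY studio UNION ALL  -- (studio, studio)
SELECT studio, SUM(revenue) FROM movies GROUP BY studio UNION ALL  -- (studio)
SELECT studio, SUM(revenue) FROM movies GROUP BY studio UNION ALL  -- (studio)
SELECT NULL,   SUM(revenue) FROM movies;                           -- ()

-- With two different columns the cross product is easier to see:
--   GROUPING SETS(studio, ()) x GROUPING SETS(year, ())
--   = (studio, year), (studio), (year), ()
-- which are the same grouping sets that CUBE(studio, year) produces.
SELECT studio, year, SUM(revenue)
FROM movies
GROUP BY GROUPING SETS (studio, ()), GROUPING SETS (year, ());
```

Duplicate grouping sets are kept rather than merged, which is why the studio rows appear three times. In PostgreSQL 14 and later, writing GROUP BY DISTINCT in front of the grouping items removes the duplicate sets; that option is PostgreSQL-specific here and was not tested against the fiddles linked in the question.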