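I'm trying to figure out how to use SQL to display an aggregate value from the parent grouping of the previous sibling. Hopefully the example below makes this clearer:
Given this table: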
Region | Department | Cost |
---|---|---|
East | Clothing | 45 |
East | Clothing | 35 |
East | Electronics | 120 |
South | Clothing | 20 |
South | Clothing | 25 |
West | Clothing | 40 |
West | Electronics | 150 |
West | Electronics | 140 |
I want to create a query that returns the following result set:
Region | Department | Sum of Cost | Prev Region Sum |
---|---|---|---|
East | Clothing | 80 | null |
East | Electronics | 120 | null |
South | Clothing | 45 | 200 |
West | Clothing | 40 | 45 |
West | Electronics | 290 | 45 |
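Basically, I want to group and aggregate the data by region and department, but I also want to reference the aggregate of the region-level grouping for the previous region (assuming regions are sorted alphabetically).
So if you look at the "Prev Region Sum" column in the result set, you can see that the first 2 East rows get `null` because there is no region before "East". The next South row gets 200 because that is the sum of Cost for all "East" records, and the "West" rows get 45 because that is the sum of Cost for all "South" records.
If you want to try it out, here is the SQL for this example. As you can see, I have everything except the expression for the "Prev Region Sum" column: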
CREATE TABLE so_sales (
Region VARCHAR,
Department VARCHAR,
Cost INTEGER
);
INSERT INTO so_sales (Region, Department, Cost)
VALUES
('East', 'Clothing', 45),
('East', 'Clothing', 35),
('East', 'Electronics', 120),
('South', 'Clothing', 20),
('South', 'Clothing', 25),
('West', 'Clothing', 40),
('West', 'Electronics', 150),
('West', 'Electronics', 140);
SELECT
Region,
Department,
SUM(Cost) AS "Sum of Cost",
LAG(SUM(Cost)) OVER (PARTITION BY Region ORDER BY Region, Department) AS "Prev Region Sum"
FROM
so_sales
GROUP BY
Region,
Department
ORDER BY
Region,
Department;
with data(Region, Department, Cost) as (
select 'East', 'Clothing', 45 union all
select 'East', 'Clothing', 35 union all
select 'East', 'Electronics', 120 union all
select 'South', 'Clothing', 20 union all
select 'South', 'Clothing', 25 union all
select 'West', 'Clothing', 40 union all
select 'West', 'Electronics', 150 union all
select 'West', 'Electronics', 140
)
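select region, Department, sum_of_cost, region_sum,
-- rn is the row's position within its region, so stepping back rn rows
-- (ordered by region, rn) lands on the last row of the previous region
lag(region_sum, rn::int) over(order by region, rn) as prev_region_sum
from (
select region, Department, sum_of_cost,
-- all rows of a region are peers under "order by region", so this is the full region total
sum(sum_of_cost) over(partition by region order by region) as region_sum,
row_number() over(partition by region order by department) as rn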
from (
select
region, Department,
sum(cost) as sum_of_cost
from data
group by Region, Department
) d
) d
order by region, department
;
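The query above gets the previous region's total by using each row's position within its region as the `lag` offset. An alternative that may be easier to read is sketched below: compute one total per region, apply `LAG` across those per-region rows, and join the result back to the department-level grouping. This is only a sketch, assuming PostgreSQL and the `so_sales` table from the question; the CTE names `region_totals` and `region_prev` are made up for illustration.

```sql
-- Sketch only: assumes PostgreSQL and the so_sales table defined in the question.
-- region_totals and region_prev are hypothetical names chosen for this example.
WITH region_totals AS (
    -- One row per region, holding that region's total cost.
    SELECT Region, SUM(Cost) AS region_sum
    FROM so_sales
    GROUP BY Region
),
region_prev AS (
    -- With one row per region, LAG picks up the total of the
    -- alphabetically previous region (NULL for the first region).
    SELECT Region,
           LAG(region_sum) OVER (ORDER BY Region) AS prev_region_sum
    FROM region_totals
)
SELECT s.Region,
       s.Department,
       SUM(s.Cost)       AS "Sum of Cost",
       p.prev_region_sum AS "Prev Region Sum"
FROM so_sales s
JOIN region_prev p ON p.Region = s.Region
GROUP BY s.Region, s.Department, p.prev_region_sum
ORDER BY s.Region, s.Department;
```

The trade-off is that the table is read twice, but the intent (previous region's total, attached to every department row of the current region) is stated directly rather than encoded in a variable `lag` offset.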