I have the following data, and I am trying to get the previous year's profit:
WITH tbl (year, country, product, profit) AS (
VALUES
(2000, 'Finland', 'Computer' , 1500)
, (2000, 'Finland', 'Phone' , 100)
, (2001, 'Finland', 'Phone' , 10)
, (2000, 'India' , 'Calculator', 75)
, (2000, 'India' , 'Calculator', 75)
, (2000, 'India' , 'Computer' , 1200)
, (2001, 'India' , 'Computer' , 1200)
, (2001, 'India' , 'Computer' , 1200)
, (2002, 'India' , 'Computer' , 1200)
, (2002, 'India' , 'Computer' , 1200)
)
select country, year, profit
, lag(profit) over (partition by country order by year) as sum_profit_previous_year
from tbl;
┌─────────┬──────┬────────┬──────────────────────────┐
│ country ┆ year ┆ profit ┆ sum_profit_previous_year │
╞═════════╪══════╪════════╪══════════════════════════╡
│ India ┆ 2000 ┆ 75 ┆ │
│ India ┆ 2000 ┆ 75 ┆ 75 │
│ India ┆ 2000 ┆ 1200 ┆ 75 │
│ Finland ┆ 2000 ┆ 1500 ┆ │
│ Finland ┆ 2000 ┆ 100 ┆ 1500 │
│ Finland ┆ 2001 ┆ 10 ┆ 100 │
└─────────┴──────┴────────┴──────────────────────────┘
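However, LAG just returns the value from the previous row, which is not what I want. I want the summed profit of the previous year for the same country. The expected result should be: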
┌─────────┬──────┬────────┬──────────────────────────┐
│ country ┆ year ┆ profit ┆ sum_profit_previous_year │
╞═════════╪══════╪════════╪══════════════════════════╡
│ India ┆ 2000 ┆ 75 ┆ │
│ India ┆ 2000 ┆ 75 ┆ │
│ India ┆ 2000 ┆ 1200 ┆ │
│ Finland ┆ 2000 ┆ 1500 ┆ │
│ Finland ┆ 2000 ┆ 100 ┆ │
│ Finland ┆ 2001 ┆ 10 ┆ 1600 │
└─────────┴──────┴────────┴──────────────────────────┘
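The Finland 2001 row is the only one that also has records for the previous year in the same country. What is the correct RANGE clause to achieve this? (Either BigQuery or Postgres is fine for testing.)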
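If you really want all un-aggregated rows, plus the aggregated profit of the previous year ...
Here is a way to do it with a plain window function and a custom window frame: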
SELECT t.country, t.year, t.profit
, sum(profit) OVER (PARTITION BY country ORDER BY year
RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING) AS profit_previous_year
FROM tbl t
ORDER BY t.country DESC, t.year; -- optional?
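Why this works: in RANGE mode the frame is defined by the value of the ORDER BY expression rather than by row position, so "1 PRECEDING AND 1 PRECEDING" covers every row of the partition whose year equals the current year minus 1, however many rows that is. In Postgres, RANGE with an offset requires version 11 or later. As a rough way to check the behavior without a Postgres or BigQuery instance, here is a minimal sketch that runs the same frame clause on the sample rows from the question. It assumes Python's bundled SQLite is 3.28 or later (needed for RANGE with an offset). The function name previous_year_profit and the extra "profit" sort key are only there for illustration.

```python
import sqlite3

# The sample rows from the question: (year, country, product, profit).
ROWS = [
    (2000, "Finland", "Computer", 1500),
    (2000, "Finland", "Phone", 100),
    (2001, "Finland", "Phone", 10),
    (2000, "India", "Calculator", 75),
    (2000, "India", "Calculator", 75),
    (2000, "India", "Computer", 1200),
    (2001, "India", "Computer", 1200),
    (2001, "India", "Computer", 1200),
    (2002, "India", "Computer", 1200),
    (2002, "India", "Computer", 1200),
]

# Same window definition as above; the frame holds all rows of the
# partition whose year is exactly (current year - 1).
QUERY = """
SELECT country, year, profit
     , sum(profit) OVER (PARTITION BY country ORDER BY year
                         RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING) AS profit_previous_year
FROM   tbl
ORDER  BY country, year, profit
"""


def previous_year_profit():
    """Run the RANGE-frame query on an in-memory SQLite database (assumes SQLite >= 3.28)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (year INTEGER, country TEXT, product TEXT, profit INTEGER)")
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", ROWS)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in previous_year_profit():
        print(row)
```

With the sample rows, the Finland 2001 row gets 1600 (1500 + 100), and every row for 2000 gets NULL because no rows for 1999 fall into the frame.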
An alternative without a window function:
SELECT t.country, t.year, t.profit
, y.profit AS profit_previous_year
FROM tbl t
LEFT JOIN (
SELECT country, year, sum(profit) AS profit
FROM tbl t1
GROUP BY 1, 2
) y ON y.country = t.country
AND y.year = t.year - 1
ORDER BY t.country DESC, t.year;