How to create lagged year variables with SQL in Amazon Athena


How can I reshape a table from wide to long under certain conditions?

This is my original data. Each id has exactly one row with three columns: year is the index year, y-1 is the year before the index year, and y-2 is two years before the index year.

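I then want to reshape the table from wide to long, producing a single variable called year that holds the index year, the year before it, and the year two years before it. There is one more condition: I want to extend the years for each bene_id through 2021. Something like the following. This is what I am looking for.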

Can anyone give me some advice? Thanks!

I used the code below but still get the wrong table. Note: I am running the code in Amazon Athena.

  WITH RECURSIVE YearSequence(bene_id, Year) AS (
    SELECT bene_id, year
    FROM tablea 
    UNION ALL
    SELECT ts.bene_id, YearSequence.Year - 1
    FROM YearSequence
    JOIN tablea  AS ts ON YearSequence.bene_id = ts.bene_id
    WHERE YearSequence.Year > 2016 
  )
  SELECT bene_id, Year
  FROM YearSequence
  WHERE Year <= 2021
  ORDER BY bene_id, Year
);
sql amazon-athena presto trino
1 Answer

There is no need for recursion here; sequence + unnest should do the trick. Assuming y (y-2 in your data) is always the "start" year, something like this:

-- sample data, a bit simplified
with dataset(id, year, y) as(
    values ('A', 2018, 2017),
           ('B', 2019, 2018)
)

-- query
select id, t.year
from dataset,
     unnest(sequence(y, 2021)) as t(year);

Output:

id year
A 2017
A 2018
A 2019
A 2020
A 2021
B 2018
B 2019
B 2020
B 2021
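
If the table does not carry a reliable start-year column, the same idea can be sketched by deriving the start year from the index year instead. This is only a sketch under assumptions: the table name tablea and the columns bene_id and year are taken from the question, the inline sample rows are made up for illustration, and the start year is assumed to always be the index year minus 2. The where clause is a guard, because sequence counts downward when the start is greater than the stop.

```sql
-- hypothetical sample rows standing in for tablea (bene_id, index year)
with tablea(bene_id, year) as (
    values ('A', 2019),
           ('B', 2020)
)

-- one output row per bene_id and year, from (index year - 2) through 2021
select a.bene_id, t.year
from tablea as a
cross join unnest(sequence(a.year - 2, 2021)) as t(year)
-- guard: skip rows whose start year is already past 2021,
-- since sequence(start, stop) would otherwise count down
where a.year - 2 <= 2021
order by a.bene_id, t.year;
```

With these sample rows, A should come back with the years 2017 through 2021 and B with 2018 through 2021. To run it against the real table, drop the with clause and keep the select as is.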