SQL: Return the row number when a condition is met

Problem description (votes: 0, answers: 1)

I am using Netezza SQL.

I have the following table:

  name year var1 var2
  John 2001    a    b
  John 2002    a    a
  John 2003    a    b
  Mary 2001    b    a
  Mary 2002    a    b
  Mary 2003    b    a
 Alice 2001    a    b
 Alice 2002    b    a
 Alice 2003    a    b
   Bob 2001    b    a
   Bob 2002    b    b
   Bob 2003    b    a
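
For anyone who wants to reproduce this, the sample data could be loaded with something like the script below. This is only a sketch: the column types are assumptions (the question does not state them), and the rows are inserted one statement at a time so the script does not depend on multi-row VALUES support.

```sql
-- Assumed column types; the names follow the table shown above.
CREATE TABLE mytable (
    name VARCHAR(20),
    year INTEGER,
    var1 CHAR(1),
    var2 CHAR(1)
);

INSERT INTO mytable (name, year, var1, var2) VALUES ('John',  2001, 'a', 'b');
INSERT INTO mytable (name, year, var1, var2) VALUES ('John',  2002, 'a', 'a');
INSERT INTO mytable (name, year, var1, var2) VALUES ('John',  2003, 'a', 'b');
INSERT INTO mytable (name, year, var1, var2) VALUES ('Mary',  2001, 'b', 'a');
INSERT INTO mytable (name, year, var1, var2) VALUES ('Mary',  2002, 'a', 'b');
INSERT INTO mytable (name, year, var1, var2) VALUES ('Mary',  2003, 'b', 'a');
INSERT INTO mytable (name, year, var1, var2) VALUES ('Alice', 2001, 'a', 'b');
INSERT INTO mytable (name, year, var1, var2) VALUES ('Alice', 2002, 'b', 'a');
INSERT INTO mytable (name, year, var1, var2) VALUES ('Alice', 2003, 'a', 'b');
INSERT INTO mytable (name, year, var1, var2) VALUES ('Bob',   2001, 'b', 'a');
INSERT INTO mytable (name, year, var1, var2) VALUES ('Bob',   2002, 'b', 'b');
INSERT INTO mytable (name, year, var1, var2) VALUES ('Bob',   2003, 'b', 'a');
```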

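I want to answer the following questions:

  • For each name, when (i.e. at which row_num) does var1 change for the first time? Keep the full row so that the var1_before/var1_after and var2_before/var2_after values are visible.
  • If a name keeps the same var1 value throughout, return the full row (including its row_num) for the last available year of that name.

I wrote this query to see how var1 and var2 change from year to year for each person: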

WITH CTE AS (
    SELECT 
        name, 
        year, 
        var1, 
        var2,
        LAG(var1, 1) OVER (PARTITION BY name ORDER BY year ASC) AS var1_before,
        LEAD(var1, 1) OVER (PARTITION BY name ORDER BY year ASC) AS var1_after,
        LAG(var2, 1) OVER (PARTITION BY name ORDER BY year ASC) AS var2_before,
        LEAD(var2, 1) OVER (PARTITION BY name ORDER BY year ASC) AS var2_after,
        ROW_NUMBER() OVER (PARTITION BY name ORDER BY year ASC) AS row_num
    FROM 
        mytable
)
SELECT 
  *
FROM 
    CTE;

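But I don't know how to continue from here. I tried to separate the names that have a change from the names that don't, but I keep getting confused.

Can someone show me how to do this?

Thanks!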

sql netezza
1 answer
0 votes

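To solve this, extend the common table expression (CTE) you already have with extra logic that detects when var1 first changes for each name, and that handles names whose var1 never changes over the whole period. This takes a few more window functions and some conditional logic.

Some notes:

  • Use the LAG function to compare the current var1 value with the previous year's var1 value. You have already done this part.
  • Add a column that flags the rows where var1 changed.
  • For each name, find the first row where var1 changed.
  • For names whose var1 never changes, return the row for the last available year.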
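
The flagging step on its own can be sketched as below. The alias lagged is only an illustrative name, and the query assumes there is at most one row per name and year.

```sql
-- Sketch: flag the rows where var1 differs from the previous year's value.
SELECT
    name,
    year,
    var1,
    var1_before,
    -- For the first year of each name var1_before is NULL, so the comparison
    -- is NULL rather than true and the row falls through to ELSE 0.
    CASE WHEN var1_before <> var1 THEN 1 ELSE 0 END AS var1_changed_flag
FROM (
    SELECT
        name,
        year,
        var1,
        LAG(var1) OVER (PARTITION BY name ORDER BY year) AS var1_before
    FROM mytable
) AS lagged
ORDER BY name, year;
```

On the sample table, Mary and Alice get the flags 0, 1, 1 for 2001 to 2003, while John and Bob get 0, 0, 0.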

Code:

WITH CTE AS (
    SELECT 
        name, 
        year, 
        var1, 
        var2,
        LAG(var1) OVER (PARTITION BY name ORDER BY year) AS var1_before,
        LEAD(var1) OVER (PARTITION BY name ORDER BY year) AS var1_after,
        LAG(var2) OVER (PARTITION BY name ORDER BY year) AS var2_before,
        LEAD(var2) OVER (PARTITION BY name ORDER BY year) AS var2_after,
        ROW_NUMBER() OVER (PARTITION BY name ORDER BY year) AS row_num,
        -- 1 when var1 differs from the previous year's value. The first row of
        -- each name has no previous value, so the comparison is NULL and it gets 0.
        CASE 
            WHEN LAG(var1) OVER (PARTITION BY name ORDER BY year) <> var1 THEN 1
            ELSE 0 
        END AS var1_changed_flag
    FROM 
        mytable
),
Targets AS (
    SELECT 
        c.*,
        -- row_num of the first change for this name; NULL if var1 never changed
        MIN(CASE WHEN var1_changed_flag = 1 THEN row_num END)
            OVER (PARTITION BY name) AS first_change_row,
        -- row_num of the last available year for this name
        MAX(row_num) OVER (PARTITION BY name) AS last_row
    FROM 
        CTE c
)
SELECT 
    name, year, var1, var2,
    var1_before, var1_after, var2_before, var2_after,
    row_num, var1_changed_flag
FROM 
    Targets
WHERE 
    -- First change if there is one, otherwise the last year. Comparing against a
    -- single target row_num guarantees exactly one row per name, even when var1
    -- changes once and then stays the same until the last year.
    row_num = COALESCE(first_change_row, last_row)
ORDER BY 
    name;
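
An edge case worth checking is a name whose var1 changes once and then stays the same until the last year (for example a, b, b): only the first change should come back, not the last row as well. One alternative way to get exactly one row per name is to pick a single target year per name with GROUP BY and join back. This is a sketch rather than a tested Netezza query; the names lagged, target and target_year are illustrative, and it assumes at most one row per name and year.

```sql
-- Alternative sketch: choose one target year per name, then join back.
WITH lagged AS (
    SELECT
        name,
        year,
        var1,
        var2,
        LAG(var1)  OVER (PARTITION BY name ORDER BY year) AS var1_before,
        LEAD(var1) OVER (PARTITION BY name ORDER BY year) AS var1_after,
        LAG(var2)  OVER (PARTITION BY name ORDER BY year) AS var2_before,
        LEAD(var2) OVER (PARTITION BY name ORDER BY year) AS var2_after,
        ROW_NUMBER() OVER (PARTITION BY name ORDER BY year) AS row_num
    FROM mytable
),
target AS (
    SELECT
        name,
        -- Earliest year in which var1 differs from the previous year;
        -- if there is no such year, fall back to the last available year.
        COALESCE(
            MIN(CASE WHEN var1_before <> var1 THEN year END),
            MAX(year)
        ) AS target_year
    FROM lagged
    GROUP BY name
)
SELECT l.*
FROM lagged l
JOIN target t
    ON t.name = l.name
   AND t.target_year = l.year
ORDER BY l.name;
```

On the sample table this returns the 2002 rows for Alice and Mary (row_num 2) and the 2003 rows for Bob and John (row_num 3).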