Pivot data by the 52 weeks of the year and fill missing values with the previous value


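I am trying to pivot a year's data values across its 52 weeks (the example uses only 13 weeks). Wherever a week has no value, I want to fill it with the previous value.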

Test data:

CREATE TABLE testdata (
    [year] INT,
    id1 VARCHAR(10),
    id2 VARCHAR(10),
    [week] INT,
    date1 DATE,
    date2 DATE
);

INSERT INTO testdata ([year], id1, id2, [week], date1, date2) VALUES
(2023, 'p1', 'p001', 1, '2023-01-01', '2023-01-01'),
(2023, 'p2', 'p002', 4, '2023-01-02', '2023-01-02'),
(2023, 'p3', 'p003', 7, '2023-01-03', '2023-01-03'),
(2023, 'p4', 'p004', 10, '2023-01-04', '2023-01-04'),
(2023, 'p5', 'p005', 13, '2023-01-05', '2023-01-05'),
(2024, 'p1', 'p001', 1, '2024-02-01', '2024-02-01'),
(2024, 'p2', 'p002', 2, '2024-02-02', '2024-02-02'),
(2024, 'p3', 'p003', 7, '2024-02-03', '2024-02-03'),
(2024, 'p4', 'p004', 10, '2024-02-04', '2024-02-04'),
(2024, 'p5', 'p005', 13, '2024-02-05', '2024-02-05');

I have made several attempts, but I cannot get the blanks filled with the previous value.

WITH WeeklyData AS (
  SELECT
    [year],
    [week],
    date1
  FROM testdata
)
, PivotedData AS (
  SELECT
    [year],
    MAX(CASE WHEN [week] = 1 THEN date1 END) AS week_1_date1,
    MAX(CASE WHEN [week] = 2 THEN date1 END) AS week_2_date1,
    MAX(CASE WHEN [week] = 3 THEN date1 END) AS week_3_date1,
    MAX(CASE WHEN [week] = 4 THEN date1 END) AS week_4_date1,
    MAX(CASE WHEN [week] = 5 THEN date1 END) AS week_5_date1,
    MAX(CASE WHEN [week] = 6 THEN date1 END) AS week_6_date1,
    MAX(CASE WHEN [week] = 7 THEN date1 END) AS week_7_date1,
    MAX(CASE WHEN [week] = 8 THEN date1 END) AS week_8_date1,
    MAX(CASE WHEN [week] = 9 THEN date1 END) AS week_9_date1,
    MAX(CASE WHEN [week] = 10 THEN date1 END) AS week_10_date1,
    MAX(CASE WHEN [week] = 11 THEN date1 END) AS week_11_date1,
    MAX(CASE WHEN [week] = 12 THEN date1 END) AS week_12_date1,
    MAX(CASE WHEN [week] = 13 THEN date1 END) AS week_13_date1
  FROM WeeklyData
  GROUP BY [year]
)
SELECT
  [year],
  COALESCE(week_1_date1, LAG(week_1_date1) OVER (ORDER BY [year])) AS week_1_date1,
  COALESCE(week_2_date1, LAG(week_2_date1) OVER (ORDER BY [year])) AS week_2_date1,
  COALESCE(week_3_date1, LAG(week_3_date1) OVER (ORDER BY [year])) AS week_3_date1,
  COALESCE(week_4_date1, LAG(week_4_date1) OVER (ORDER BY [year])) AS week_4_date1,
  COALESCE(week_5_date1, LAG(week_5_date1) OVER (ORDER BY [year])) AS week_5_date1,
  COALESCE(week_6_date1, LAG(week_6_date1) OVER (ORDER BY [year])) AS week_6_date1,
  COALESCE(week_7_date1, LAG(week_7_date1) OVER (ORDER BY [year])) AS week_7_date1,
  COALESCE(week_8_date1, LAG(week_8_date1) OVER (ORDER BY [year])) AS week_8_date1,
  COALESCE(week_9_date1, LAG(week_9_date1) OVER (ORDER BY [year])) AS week_9_date1,
  COALESCE(week_10_date1, LAG(week_10_date1) OVER (ORDER BY [year])) AS week_10_date1,
  COALESCE(week_11_date1, LAG(week_11_date1) OVER (ORDER BY [year])) AS week_11_date1,
  COALESCE(week_12_date1, LAG(week_12_date1) OVER (ORDER BY [year])) AS week_12_date1,
  COALESCE(week_13_date1, LAG(week_13_date1) OVER (ORDER BY [year])) AS week_13_date1
FROM PivotedData
ORDER BY [year];

Expected result: the values in red are the ones that must be inserted into the blank fields.

Any suggestions would be greatly appreciated.

sql-server t-sql pivot
1 answer

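You can achieve your goal as follows:

  1. Determine the required range of years and weeks.
  2. Generate a calendar table covering those years and weeks. The new GENERATE_SERIES() function helps a lot here.
  3. Left join the generated calendar to the source data and compute the MAX(date1) value for each year and week combination. This value may be null.
  4. Use the LAST_VALUE() window function with the IGNORE NULLS option to fill any nulls with the most recent previously available value.
  5. Use either conditional aggregation or PIVOT to transform the data into the final columnar form.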
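
To see step 4 in isolation, here is a minimal sketch that runs against a small inline row set instead of testdata. The sample rows and the alias V are made up for illustration, and the query assumes SQL Server 2022 or later.

```sql
-- Hypothetical inline sample: weeks 1-5 of one year, with gaps at weeks 2, 3 and 5.
SELECT
    V.[week],
    V.date1,
    -- With ORDER BY and no explicit frame, the window runs from the first row
    -- up to the current row, so this picks the latest non-null date1 so far.
    LAST_VALUE(V.date1) IGNORE NULLS OVER (ORDER BY V.[week]) AS filled_date1
FROM (VALUES
    (1, CAST('2023-01-01' AS DATE)),
    (2, NULL),
    (3, NULL),
    (4, CAST('2023-01-02' AS DATE)),
    (5, NULL)
) AS V([week], date1)
ORDER BY V.[week];
```

Weeks 2 and 3 should pick up the week 1 date, and week 5 the week 4 date.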

Note that both the GENERATE_SERIES() function and the IGNORE NULLS option of the LAST_VALUE() window function are new in SQL Server 2022.

The resulting query would look something like the following (using conditional aggregation):

SELECT
    S.year,
    MAX(CASE WHEN S.[week] = 1 THEN S.filled_date1 END) AS week_1_date1,
    MAX(CASE WHEN S.[week] = 2 THEN S.filled_date1 END) AS week_2_date1,
    MAX(CASE WHEN S.[week] = 3 THEN S.filled_date1 END) AS week_3_date1,
    MAX(CASE WHEN S.[week] = 4 THEN S.filled_date1 END) AS week_4_date1,
    MAX(CASE WHEN S.[week] = 5 THEN S.filled_date1 END) AS week_5_date1,
    MAX(CASE WHEN S.[week] = 6 THEN S.filled_date1 END) AS week_6_date1,
    MAX(CASE WHEN S.[week] = 7 THEN S.filled_date1 END) AS week_7_date1,
    MAX(CASE WHEN S.[week] = 8 THEN S.filled_date1 END) AS week_8_date1,
    MAX(CASE WHEN S.[week] = 9 THEN S.filled_date1 END) AS week_9_date1,
    MAX(CASE WHEN S.[week] = 10 THEN S.filled_date1 END) AS week_10_date1,
    MAX(CASE WHEN S.[week] = 11 THEN S.filled_date1 END) AS week_11_date1,
    MAX(CASE WHEN S.[week] = 12 THEN S.filled_date1 END) AS week_12_date1,
    MAX(CASE WHEN S.[week] = 13 THEN S.filled_date1 END) AS week_13_date1
FROM (
    SELECT
        S.year, S.week,
        LAST_VALUE(S.MaxDate1) IGNORE NULLS OVER(ORDER BY S.year, S.week) AS filled_date1
    FROM (
        SELECT Y.value AS year, W.value as week, MAX(T.date1) AS MaxDate1
        FROM (
            SELECT
                MIN(T.year) AS MinYear,
                MAX(T.year) AS MaxYear,
                GREATEST(MAX(T.week), 13) AS MaxWeek  -- 13 weeks for demo purposes
            FROM testdata T
        ) RNG
        CROSS APPLY GENERATE_SERIES(RNG.MinYear, RNG.MaxYear) Y
        CROSS APPLY GENERATE_SERIES(1, RNG.MaxWeek) W
        LEFT JOIN testdata T
            ON T.year = Y.value
            AND T.week = W.value
        GROUP BY Y.value, W.value
    ) S
) S
GROUP BY S.year
ORDER BY S.year

Or like the following (using PIVOT):

SELECT
    PVT.year,
    PVT.[1] AS week_1_date1,
    PVT.[2] AS week_2_date1,
    PVT.[3] AS week_3_date1,
    PVT.[4] AS week_4_date1,
    PVT.[5] AS week_5_date1,
    PVT.[6] AS week_6_date1,
    PVT.[7] AS week_7_date1,
    PVT.[8] AS week_8_date1,
    PVT.[9] AS week_9_date1,
    PVT.[10] AS week_10_date1,
    PVT.[11] AS week_11_date1,
    PVT.[12] AS week_12_date1,
    PVT.[13] AS week_13_date1
FROM (
    SELECT
        S.year, S.week,
        LAST_VALUE(S.MaxDate1) IGNORE NULLS OVER(ORDER BY S.year, S.week) AS FilledDate1
    FROM (
        SELECT Y.value AS year, W.value as week, MAX(T.date1) AS MaxDate1
        FROM (
            SELECT
                MIN(T.year) AS MinYear,
                MAX(T.year) AS MaxYear,
                GREATEST(MAX(T.week), 13) AS MaxWeek  -- 13 weeks for demo purposes
            FROM testdata T
        ) RNG
        CROSS APPLY GENERATE_SERIES(RNG.MinYear, RNG.MaxYear) Y
        CROSS APPLY GENERATE_SERIES(1, RNG.MaxWeek) W
        LEFT JOIN testdata T
            ON T.year = Y.value
            AND T.week = W.value
        GROUP BY Y.value, W.value
    ) S
) S
PIVOT (
    MAX(S.FilledDate1)
    FOR week IN (  -- 13 weeks for demo purposes
        [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]
    )
) PVT
ORDER BY PVT.year

Results (with an extra row added to show the carry-over into a new year):

year  week_1_date1   week_2_date1   week_3_date1   week_4_date1   week_5_date1   week_6_date1   week_7_date1   week_8_date1   week_9_date1   week_10_date1  week_11_date1  week_12_date1  week_13_date1
2023  2023-01-01     2023-01-01     2023-01-01     2023-01-02     2023-01-02     2023-01-02     2023-01-03     2023-01-03     2023-01-03     2023-01-04     2023-01-04     2023-01-04     2023-01-05
2024  2024-02-01     2024-02-02     2024-02-02     2024-02-02     2024-02-02     2024-02-02     2024-02-03     2024-02-03     2024-02-03     2024-02-04     2024-02-04     2024-02-04     2024-02-05
2025  2024-02-05     2024-02-05     2025-03-01     2025-03-01     2025-03-01     2025-03-01     2025-03-01     2025-03-01     2025-03-01     2025-03-01     2025-03-01     2025-03-01     2025-03-01

See this db<>fiddle for a demo.
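
For versions before SQL Server 2022, where neither GENERATE_SERIES() nor IGNORE NULLS is available, the same steps could be approximated as sketched below. This is not part of the approach above, only one possible substitute: a recursive CTE stands in for the week series, and an OUTER APPLY with TOP (1) stands in for the null-ignoring LAST_VALUE(). The CTE names Weeks, Years and Filled are made up for illustration. It assumes the same testdata table, and it only produces rows for years that actually appear in the data.

```sql
-- Sketch for older SQL Server versions; not the 2022-based approach shown above.
WITH Weeks AS (
    -- Recursive CTE standing in for GENERATE_SERIES(1, 13); 13 weeks for demo purposes
    SELECT 1 AS [week]
    UNION ALL
    SELECT [week] + 1 FROM Weeks WHERE [week] < 13
),
Years AS (
    -- Only years present in the data; a gap year would need its own number series
    SELECT DISTINCT [year] FROM testdata
),
Filled AS (
    SELECT Y.[year], W.[week], F.date1 AS filled_date1
    FROM Years Y
    CROSS JOIN Weeks W
    OUTER APPLY (
        -- Latest non-null date1 at or before this (year, week);
        -- stands in for LAST_VALUE(...) IGNORE NULLS
        SELECT TOP (1) T.date1
        FROM testdata T
        WHERE T.date1 IS NOT NULL
          AND (T.[year] < Y.[year]
               OR (T.[year] = Y.[year] AND T.[week] <= W.[week]))
        ORDER BY T.[year] DESC, T.[week] DESC, T.date1 DESC
    ) F
)
SELECT
    [year],
    MAX(CASE WHEN [week] = 1 THEN filled_date1 END) AS week_1_date1,
    MAX(CASE WHEN [week] = 2 THEN filled_date1 END) AS week_2_date1,
    MAX(CASE WHEN [week] = 3 THEN filled_date1 END) AS week_3_date1
    -- ...repeat for the remaining weeks
FROM Filled
GROUP BY [year]
ORDER BY [year];
```

The correlated lookup runs once per calendar cell, so on large tables an index on ([year], [week]) would likely matter. The default recursion limit of 100 comfortably covers 52 weeks.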
