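I have a monthly cumulative counter for each ID, but the counter is missing for some months. I want to replace these null entries with values interpolated between the nearest preceding and the nearest following non-null rows.
My table looks like this:
ID | Month | Counter |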
---|---|---|
AAA | 2023-09 | 1000 |
AAA | 2023-10 | - |
AAA | 2023-11 | - |
AAA | 2023-12 | 4000 |
BBB | 2022-11 | 2000 |
BBB | 2022-12 | - |
BBB | 2023-01 | - |
BBB | 2023-02 | - |
BBB | 2023-03 | 4000 |
What I want:
ID | Month | Counter |
---|---|---|
AAA | 2023-09 | 1000 |
AAA | 2023-10 | 2000 |
AAA | 2023-11 | 3000 |
AAA | 2023-12 | 4000 |
BBB | 2022-11 | 2000 |
BBB | 2022-12 | 2500 |
BBB | 2023-01 | 3000 |
BBB | 2023-02 | 3500 |
BBB | 2023-03 | 4000 |
How can I do this in PostgreSQL?
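You can generally use window functions to compute the missing Counter values, as in the following query. Note that it interpolates between the first and last row of each id, so it assumes those are the only non-null rows and that every month in between has a row: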
```sql
SELECT id
     , month
     , counter
     , first_value(counter) OVER w
       + percent_rank() OVER w
       * (last_value(counter) OVER w - first_value(counter) OVER w) AS new_counter
FROM test
WINDOW w AS (PARTITION BY id ORDER BY month
             RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY id, month;
```
Result:
id | month | counter | new_counter |
---|---|---|---|
AAA | 2023-09-01 | 1000 | 1000 |
AAA | 2023-10-01 | null | 2000 |
AAA | 2023-11-01 | null | 3000 |
AAA | 2023-12-01 | 4000 | 4000 |
BBB | 2022-11-01 | 2000 | 2000 |
BBB | 2022-12-01 | null | 2500 |
BBB | 2023-01-01 | null | 3000 |
BBB | 2023-02-01 | null | 3500 |
BBB | 2023-03-01 | 4000 | 4000 |
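The percent_rank() query spreads values between the first and last row of each id, so it only matches the expected output when those two rows are the only non-null ones. If an id can also have known values in the middle, a more general variant might look like the sketch below. It is not part of the original answer: the table test_demo, its sample rows, and the names marked, bounded, grp_prev, grp_next, prev_counter, next_counter, steps_from_prev and steps_to_next are all hypothetical. It also assumes one row per (id, month) with no months missing from the table, because it interpolates by row position rather than by date.

```sql
-- Hypothetical demo table; names and rows are illustrative only.
CREATE TEMP TABLE test_demo (id text, month date, counter integer);

INSERT INTO test_demo (id, month, counter) VALUES
    ('CCC', '2023-01-01', 100),
    ('CCC', '2023-02-01', NULL),
    ('CCC', '2023-03-01', 300),
    ('CCC', '2023-04-01', NULL),
    ('CCC', '2023-05-01', NULL),
    ('CCC', '2023-06-01', 900);

WITH marked AS (
    SELECT id
         , month
         , counter
           -- count() skips nulls, so every non-null row starts a new group
         , count(counter) OVER (PARTITION BY id ORDER BY month)      AS grp_prev
         , count(counter) OVER (PARTITION BY id ORDER BY month DESC) AS grp_next
    FROM test_demo
), bounded AS (
    SELECT id
         , month
         , counter
           -- each group holds at most one non-null counter, so max() picks it
         , max(counter) OVER (PARTITION BY id, grp_prev) AS prev_counter
         , max(counter) OVER (PARTITION BY id, grp_next) AS next_counter
           -- distance in rows to the neighbouring non-null values
         , row_number() OVER (PARTITION BY id, grp_prev ORDER BY month) - 1      AS steps_from_prev
         , row_number() OVER (PARTITION BY id, grp_next ORDER BY month DESC) - 1 AS steps_to_next
    FROM marked
)
SELECT id
     , month
     , counter
     , CASE
           WHEN counter IS NOT NULL THEN counter
           -- linear interpolation; NULLIF guards against a zero denominator
           ELSE prev_counter
                + (next_counter - prev_counter) * steps_from_prev::numeric
                  / NULLIF(steps_from_prev + steps_to_next, 0)
       END AS new_counter
FROM bounded
ORDER BY id, month;
```

With these sample rows the three gaps should be filled with values equal to 200, 500 and 700 (returned as numeric; wrap the expression in round() if you need whole numbers). Leading or trailing nulls stay null, since there is only one neighbour to interpolate from.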
If you want to update the table, you can use this query in an UPDATE statement:
```sql
WITH query AS (
    SELECT id
         , month
         , first_value(counter) OVER w
           + percent_rank() OVER w
           * (last_value(counter) OVER w - first_value(counter) OVER w) AS new_counter
    FROM test
    WINDOW w AS (PARTITION BY id ORDER BY month
                 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
)
UPDATE test AS t
SET counter = q.new_counter
FROM query AS q
WHERE t.id = q.id
  AND t.month = q.month
  AND t.counter IS NULL;
```
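If you would rather inspect the filled-in rows before making the change permanent, one option is to run the same UPDATE inside a transaction and add a RETURNING clause. This is only a sketch of that idea, not part of the original answer; it reuses the test table and the query from above unchanged.

```sql
BEGIN;

WITH query AS (
    SELECT id
         , month
         , first_value(counter) OVER w
           + percent_rank() OVER w
           * (last_value(counter) OVER w - first_value(counter) OVER w) AS new_counter
    FROM test
    WINDOW w AS (PARTITION BY id ORDER BY month
                 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
)
UPDATE test AS t
SET counter = q.new_counter
FROM query AS q
WHERE t.id = q.id
  AND t.month = q.month
  AND t.counter IS NULL
RETURNING t.id, t.month, t.counter;  -- lists the rows that were just filled in

-- Keep the change with COMMIT, or discard it as done here:
ROLLBACK;
```

Replace ROLLBACK with COMMIT once the returned rows look right.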
See the demo on dbfiddle.