I have these two tables:
CREATE TABLE my_table1 (
    name VARCHAR(50),
    var1 DATE,
    var2 INT
);
INSERT INTO my_table1 (name, var1, var2) VALUES
('john', '2010-01-01', 94),
('john', '2010-01-04', 106),
('john', '2015-01-01', 99),
('alex', '2010-01-01', 96),
('alex', '2018-01-01', 96),
('sara', '2005-01-01', 94),
('sara', '2006-01-01', 90),
('tim', '1999-01-01', 101);
CREATE TABLE my_table2 (
    name VARCHAR(50),
    var3 DATE,
    var4 CHAR(1)
);
INSERT INTO my_table2 (name, var3, var4) VALUES
('john', '2001-01-01', 'a'),
('john', '2002-01-01', 'b'),
('alex', '2021-01-01', 'c'),
('alex', '2022-01-01', 'd'),
('sara', '1999-01-01', 'e'),
('sara', '2023-01-01', 'f');
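I am trying to answer this question:
Problem 1: For each name in my_table2, find the most recent row (by date). Join this row to my_table1. After the join, make sure the date from my_table2 is greater than the date in my_table1 (if it is not, discard the my_table2 values for that row). The final result should have the same number of rows as my_table1.
For Problem 1, I tried to solve it like this: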
-- problem 1
SELECT t1.*, t2.*
FROM my_table1 t1
JOIN (
    SELECT name, var3, var4
    FROM (
        SELECT name, var3, var4,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY var3 DESC) AS rn
        FROM my_table2
    ) tmp
    WHERE rn = 1
) t2
    ON t1.name = t2.name
WHERE t1.var1 < t2.var3;
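This code produces partly correct output, but I don't know how to include the NA rows for John and Tim.
Can someone show me how to do this correctly?
Thanks!
Note: I also tried an approach using the COALESCE function. This produces a row for Tim, but not for John: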
SELECT t1.*, COALESCE(t2.name, 'NA') AS name, t2.var3, t2.var4
FROM my_table1 t1
LEFT JOIN (
    SELECT name, var3, var4
    FROM (
        SELECT name, var3, var4,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY var3 DESC) AS rn
        FROM my_table2
    ) tmp
    WHERE rn = 1
) t2
    ON t1.name = t2.name
WHERE t1.var1 < COALESCE(t2.var3, '9999-12-31');
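If you move the date condition out of the WHERE clause and into the ON clause of the LEFT JOIN, you get the expected result. A WHERE filter runs after the join, so it removes John's rows: John does have a match in my_table2, but its date fails the comparison. A condition in ON only decides whether the my_table2 columns are filled in, so every row of my_table1 is kept: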
SELECT t1.*, COALESCE(t2.name, 'NA') AS name, t2.var3, t2.var4
FROM my_table1 t1
LEFT JOIN (
    SELECT name, var3, var4
    FROM (
        SELECT name, var3, var4,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY var3 DESC) AS rn
        FROM my_table2
    ) tmp
    WHERE rn = 1
) t2
    ON t1.name = t2.name AND t1.var1 < COALESCE(t2.var3, '9999-12-31');
This gives you the following output:
name  var1        var2  name  var3        var4
john  2010-01-01  94    NA    null        null
john  2010-01-04  106   NA    null        null
john  2015-01-01  99    NA    null        null
alex  2010-01-01  96    alex  2022-01-01  d
alex  2018-01-01  96    alex  2022-01-01  d
sara  2005-01-01  94    sara  2023-01-01  f
sara  2006-01-01  90    sara  2023-01-01  f
tim   1999-01-01  101   NA    null        null
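If you want to check the difference between the two placements of the date condition yourself, here is a minimal sketch that loads the sample data into an in-memory SQLite database through Python's built-in sqlite3 module and runs both variants. SQLite is my assumption here, since the question does not say which database is in use, and the window function needs SQLite 3.25 or newer. The names LATEST, where_sql, on_sql, where_rows and on_rows are only for illustration.

```python
import sqlite3

# In-memory SQLite database (an assumption; the question does not name its DBMS).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table1 (name VARCHAR(50), var1 DATE, var2 INT);
INSERT INTO my_table1 (name, var1, var2) VALUES
('john', '2010-01-01', 94),
('john', '2010-01-04', 106),
('john', '2015-01-01', 99),
('alex', '2010-01-01', 96),
('alex', '2018-01-01', 96),
('sara', '2005-01-01', 94),
('sara', '2006-01-01', 90),
('tim', '1999-01-01', 101);
CREATE TABLE my_table2 (name VARCHAR(50), var3 DATE, var4 CHAR(1));
INSERT INTO my_table2 (name, var3, var4) VALUES
('john', '2001-01-01', 'a'),
('john', '2002-01-01', 'b'),
('alex', '2021-01-01', 'c'),
('alex', '2022-01-01', 'd'),
('sara', '1999-01-01', 'e'),
('sara', '2023-01-01', 'f');
""")

# Latest my_table2 row per name, same subquery as in the thread.
LATEST = """
SELECT name, var3, var4
FROM (
    SELECT name, var3, var4,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY var3 DESC) AS rn
    FROM my_table2
) tmp
WHERE rn = 1
"""

# Variant 1: date condition in WHERE (filters rows after the join).
where_sql = f"""
SELECT t1.name, t1.var1, t2.var3, t2.var4
FROM my_table1 t1
LEFT JOIN ({LATEST}) t2 ON t1.name = t2.name
WHERE t1.var1 < COALESCE(t2.var3, '9999-12-31')
"""

# Variant 2: date condition in ON (only controls whether t2 columns are filled).
on_sql = f"""
SELECT t1.name, t1.var1, t2.var3, t2.var4
FROM my_table1 t1
LEFT JOIN ({LATEST}) t2
    ON t1.name = t2.name AND t1.var1 < COALESCE(t2.var3, '9999-12-31')
"""

where_rows = conn.execute(where_sql).fetchall()
on_rows = conn.execute(on_sql).fetchall()

print(len(where_rows))  # 5: the three john rows are filtered out
print(len(on_rows))     # 8: one row per my_table1 row
```

With the condition in WHERE, Tim survives only because COALESCE turns his missing date into '9999-12-31'; John's rows have a real date (2002-01-01) that fails the comparison, so they are dropped. With the condition in ON, the same rows stay and simply get NULL for var3 and var4.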
Hope this is what you need.