I have a table of fruits:
name1 name2 year1 year2
apple pear 2010 2001
apple pear 2011 2002
pear apple 2010 2003
pear apple 2011 2004
apple null 2009 2005
pear orange 2008 2006
apple pear 2010 2007
apple grape 2010 2008
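A minimal setup sketch for reproducing the sample data above. The column types are assumptions, as is treating `null` as a real SQL NULL rather than the string 'null'.

```sql
-- Hypothetical setup: column types are assumed, not taken from the post.
CREATE TABLE fruits (
    name1 VARCHAR(20),
    name2 VARCHAR(20),  -- nullable: one sample row has no second name
    year1 INT,
    year2 INT
);

-- Sample rows in the same order as the table above.
INSERT INTO fruits (name1, name2, year1, year2) VALUES
    ('apple', 'pear',   2010, 2001),
    ('apple', 'pear',   2011, 2002),
    ('pear',  'apple',  2010, 2003),
    ('pear',  'apple',  2011, 2004),
    ('apple', NULL,     2009, 2005),  -- assumption: 'null' means SQL NULL
    ('pear',  'orange', 2008, 2006),
    ('apple', 'pear',   2010, 2007),
    ('apple', 'grape',  2010, 2008);
```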
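Problem: for each year1, I want each pair of names to appear only once, regardless of order. For example, (apple, pear, 2010) is the same as (pear, apple, 2010). That is, when there are duplicates, I want to keep only the first occurrence of each.
I think the correct output should look like this: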
name1 name2 year1 year2
apple pear 2010 2001
apple pear 2011 2002
apple null 2009 2005
pear orange 2008 2006
apple grape 2010 2008
I tried the following code:
WITH ranked_names AS (
SELECT
name1,
name2,
year1,
year2,
ROW_NUMBER() OVER (PARTITION BY name1, name2, year1 ORDER BY year2) AS rn
FROM
fruits
)
SELECT
name1,
name2,
year1,
year2
FROM
ranked_names
WHERE
rn = 1;
But this does not produce the correct result:
name1 name2 year1 year2
apple grape 2010 2008
apple null 2009 2005
apple pear 2010 2001
apple pear 2011 2002
pear apple 2010 2003
pear apple 2011 2004
pear orange 2008 2006
For example, (apple pear 2010 2001) and (pear apple 2010 2003) both appear, even though only one of them should.
Can someone tell me how to fix this?
Thanks!
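An alternative idea: normalize the order of the two names with CASE before partitioning, so that (apple, pear) and (pear, apple) fall into the same partition.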
WITH ranked_names AS (
SELECT
name1,
name2,
year1,
year2,
ROW_NUMBER() OVER (PARTITION BY year1,
CASE WHEN name1 < name2 THEN name1 ELSE name2 END,
CASE WHEN name1 < name2 THEN name2 ELSE name1 END
ORDER BY year2) AS rn
FROM
fruits
)
SELECT
name1,
name2,
year1,
year2
FROM
ranked_names
WHERE
rn = 1;
Or, more compactly, with LEAST and GREATEST:
WITH ranked_names AS (
SELECT
name1,
name2,
year1,
year2,
ROW_NUMBER() OVER (PARTITION BY LEAST(name1, name2), GREATEST(name1, name2), year1 ORDER BY year2) AS rn
FROM
fruits
)
SELECT
name1,
name2,
year1,
year2
FROM
ranked_names
WHERE
rn = 1;
Demo: https://dbfiddle.uk/dQk17bSS
Look up the LEAST and GREATEST functions in SQL.
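To see why this works, here is a small sketch of what LEAST and GREATEST return for a swapped pair. It is written in PostgreSQL syntax as an assumption (the inline VALUES list may need adjusting elsewhere), and the aliases `pair_lo` and `pair_hi` are illustrative names only.

```sql
-- Inline sample rows; pair_lo / pair_hi are hypothetical aliases.
SELECT
    name1,
    name2,
    LEAST(name1, name2)    AS pair_lo,  -- alphabetically smaller name
    GREATEST(name1, name2) AS pair_hi   -- alphabetically larger name
FROM (VALUES
    ('apple', 'pear'),
    ('pear',  'apple')
) AS t (name1, name2);
```

Both rows come back with pair_lo = 'apple' and pair_hi = 'pear', so partitioning on those two expressions plus year1 treats them as the same group. One caveat: NULL handling differs between databases. PostgreSQL ignores NULL arguments in LEAST and GREATEST, while MySQL returns NULL if any argument is NULL. With the sample data this makes no difference, since only one row has a NULL name2, but it is worth checking on your own engine if NULLs are common.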