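I have a table with two possible unique identifiers (ID1 and ID2). Each row has one or both of these identifiers. The data in each row is exactly the same for a given ID, with the exception of the timestamp. I would like to eliminate the duplicates for each value, but treat NULLs as unique values.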
This question: How to delete duplicate rows in sql server?
pointed me to this site: http://www.codaffection.com/sql-server-article/delete-duplicate-rows-in-sql-server/
From there, I came up with the following query:
WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID1 ORDER BY ID1) AS RN
FROM Filings_Search
)
DELETE FROM CTE WHERE RN<>1
Unfortunately, that deleted all of my NULL values! How can I modify this query to avoid deleting the NULLs?
Edit: Here is a sample of what my data would look like (if anyone knows how to format tables nicely, let me know; I used https://senseful.github.io/text-table/).
+------+------+----------+-----------+
| ID1  | ID2  | Data     | Timestamp |
+------+------+----------+-----------+
| NULL | abc  | macd     | 01:40     |
| NULL | abc  | macd     | 04:23     |
| NULL | def  | pfchangs | 01:41     |
| 123  | NULL | wendys   | 02:42     |
| 123  | NULL | wendys   | 03:45     |
+------+------+----------+-----------+
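For anyone who wants to reproduce this, the sample could be loaded with something like the script below. This is only a sketch: the column types are guesses (the real Filings_Search table may well differ), and the bare times are stored as TIME values purely for illustration.

```sql
-- Hypothetical schema: the column types are assumptions, not the real table definition.
CREATE TABLE Filings_Search
(
    ID1         VARCHAR(10) NULL,
    ID2         VARCHAR(10) NULL,
    Data        VARCHAR(50) NOT NULL,
    [Timestamp] TIME        NOT NULL
);

-- The five sample rows from the table above.
INSERT INTO Filings_Search (ID1, ID2, Data, [Timestamp])
VALUES (NULL,  'abc', 'macd',     '01:40'),
       (NULL,  'abc', 'macd',     '04:23'),
       (NULL,  'def', 'pfchangs', '01:41'),
       ('123', NULL,  'wendys',   '02:42'),
       ('123', NULL,  'wendys',   '03:45');
```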
Running it on ID1 would output:
+------+------+----------+-----------+
| ID1  | ID2  | Data     | Timestamp |
+------+------+----------+-----------+
| NULL | abc  | macd     | 01:40     |
| NULL | abc  | macd     | 04:23     |
| NULL | def  | pfchangs | 01:41     |
| 123  | NULL | wendys   | 02:42     |
+------+------+----------+-----------+
Running it on ID2 would output:
+------+------+----------+-----------+
| ID1  | ID2  | Data     | Timestamp |
+------+------+----------+-----------+
| NULL | abc  | macd     | 01:40     |
| NULL | def  | pfchangs | 01:41     |
| 123  | NULL | wendys   | 02:42     |
| 123  | NULL | wendys   | 03:45     |
+------+------+----------+-----------+
Apologies if this is a duplicate. I am a SQL beginner, and I could not find anything that was exactly what I was looking for.
What about:
DELETE FROM CTE
WHERE RN<>1
AND ID1 IS NOT NULL
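Spelled out in full, that suggestion might look like the sketch below. It reuses the CTE from the question and only adds the ID1 IS NOT NULL filter. The filter is needed because PARTITION BY places every row whose ID1 is NULL into a single partition, so all but one of those rows receive RN > 1 and would otherwise be deleted. Ordering by Timestamp instead of ID1 is my assumption, made so that the earliest row for each ID1 is the one that survives, as in the expected output.

```sql
-- Sketch: remove duplicate ID1 rows, leaving rows with a NULL ID1 untouched.
WITH CTE AS
(
    SELECT *,
           -- Assumption: the earliest timestamp per ID1 gets RN = 1 and is kept.
           ROW_NUMBER() OVER (PARTITION BY ID1 ORDER BY [Timestamp]) AS RN
    FROM Filings_Search
)
DELETE FROM CTE
WHERE RN <> 1
  AND ID1 IS NOT NULL;  -- skip the partition that holds all the NULL ID1 rows
```

The same statement with ID1 swapped for ID2 would cover the second identifier.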
Use ID2 and Data in the PARTITION BY:
WITH CTE AS (
SELECT f.*, ROW_NUMBER() OVER (PARTITION BY ID2, Data ORDER BY Timestamp) AS RN
FROM Filings_Search f -- the alias f is required because the SELECT list uses f.*
)
DELETE FROM CTE WHERE RN<>1
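Before running a DELETE like this, it may be worth previewing which rows it would remove. A rough sketch: the same CTE with SELECT in place of DELETE. The ID2 IS NOT NULL guard is my addition, on the assumption that rows with a NULL ID2 should be left alone, as in the question's expected output; without it, rows that share a NULL ID2 and the same Data fall into one partition and count as duplicates.

```sql
-- Sketch: list the rows that would be deleted, without deleting anything.
WITH CTE AS
(
    SELECT f.*,
           ROW_NUMBER() OVER (PARTITION BY ID2, Data ORDER BY [Timestamp]) AS RN
    FROM Filings_Search f
)
SELECT *
FROM CTE
WHERE RN <> 1
  AND ID2 IS NOT NULL;  -- assumption: NULL ID2 rows are never treated as duplicates
```

If the listed rows look right, changing SELECT * back to DELETE applies the same filter.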