使用Rank()的更新,不包含重复值的行号

问题描述 投票:0回答:1

我有一个与以下数据集相似的数据集。

enter image description here

我需要根据updated_CustomerId列中提供的值来更新基本查找表。基本表与数据集相同,但是没有updated_CustomerId列。

这里基本表具有唯一约束]的挑战,这是基于以下三列的组合:Current_CustomerIDOrder_IDOrderCategory

期望的输出:

enter image description here

更新后,可以将Old_customerIds(17360410-Pk 8,21044488-Pk = 9)中的一个重新分配给Update_CustomerID主键2不会更新,因为这会导致违反唯一性约束,但是它将与上面8或9中的一个主键一起删除,具体取决于哪个被更新(重新分配给新的ID)] >

在基表上的所有内容更新之后,我将从基表中删除未将Current_CustomerID重新分配给updated_CustomerId(如果有的话)的所有记录

    IF OBJECT_ID('tempdb..#DataSet') IS NOT NULL
    DROP TABLE #DataSet


    IF OBJECT_ID('tempdb..#BaseTable') IS NOT NULL
    DROP TABLE #BaseTable

CREATE TABLE #DataSet
(
    PrimaryKey INT NOT NULL CONSTRAINT [PK_dataset_ID] PRIMARY KEY,
    Current_CustomerID INT NOT NULL,
    Order_ID INT NOT NULL,
    OrderCategory VARCHAR(50) NOT NULL,
    Updated_CustomerId INT NOT NULL
)
INSERT INTO #DataSet (PrimaryKey,   Current_CustomerID, Order_ID,   OrderCategory,  updated_CustomerId)
VALUES
(1, 17395001,   4451784,    'Kitchen',  25693110),
(2, 25693110,   4451784,    'Kitchen',  25693110),

(3, 25693110,   2083059,    'Kitchen',  25693110),
(4, 25693110,   2163679,    'Kitchen',  25693110),
(5, 25693110,   2171466,    'Kitchen',  25693110),

(6, 25693110,   2163679,    'Bathroom', 25693110),
(7, 25693110,   2171466,    'Bathroom', 25693110),

(8, 17360410,   3377931,    'Furniture',    16303984),
(9, 21044488,   3377931,    'Furniture',    16303984),

(10,    1534323,    2641714,    'Furniture',    16303984),
(11,    16303984,   2641726,    'Furniture',    16303984),
(12,    16303984,   2641793,    'Furniture',    16303984),
(13,    16303984,   2641816,    'Furniture',    16303984),

(14,    16303345,   2641816,    'Garden',   16301239),
(15,    12345678,   1239065,    'Medicine', 1075432)

CREATE TABLE #BaseTable
(
    PrimaryKey INT NOT NULL CONSTRAINT [PK_baseTable_ID] PRIMARY KEY,
    CustomerID INT NOT NULL,
    Order_ID INT NOT NULL,
    OrderCategory VARCHAR(50) NOT NULL,
)

CREATE UNIQUE NONCLUSTERED INDEX [IDX_LookUp] ON #BaseTable
(
    CustomerID ASC,
    Order_ID ASC,
    OrderCategory ASC
) ON [PRIMARY]


INSERT INTO #BaseTable (PrimaryKey, CustomerID, Order_ID,   OrderCategory)
VALUES
(1, 17395001,   4451784,    'Kitchen'),
(2, 25693110,   4451784,    'Kitchen'),

(3, 25693110,   2083059,    'Kitchen'),
(4, 25693110,   2163679,    'Kitchen'),
(5, 25693110,   2171466,    'Kitchen'),

(6, 25693110,   2163679,    'Bathroom'),
(7, 25693110,   2171466,    'Bathroom'),

(8, 17360410,   3377931,    'Furniture'),
(9, 21044488,   3377931,    'Furniture'),

(10,    1534323,    2641714,    'Furniture'),
(11,    16303984,   2641726,    'Furniture'),
(12,    16303984,   2641793,    'Furniture'),
(13,    16303984,   2641816,    'Furniture'),

(14,    16303345,   2641816,    'Garden'),
(15,    12345678,   1239065,    'Medicine')

-- select * from #BaseTable
-- select * from #DataSet


update b
set CustomerID = a.Updated_CustomerId
from #BaseTable b
inner join #DataSet a on b.PrimaryKey = a.PrimaryKey

Msg 2601,第14级,状态1,第82行无法在具有唯一索引“ IDX_LookUp”的对象“ dbo。#BaseTable”中插入重复的键行。重复的键值为(16303984,3377931,Furniture)。该语句已终止。

我有一个与以下数据集相似的数据集。我需要根据updated_CustomerId列中提供的值来更新基本查找表。基本表与数据集相同,但它确实...

sql-server duplicates sql-update ranking-functions
1个回答
0
投票

我认为您只想获取#DataTable的row_number,然后根据唯一键删除多于一个的行:

//...

DELETE  bt
FROM    #BaseTable bt
    INNER JOIN (
                SELECT  a.PrimaryKey,
                    a.Updated_CustomerId,
                    a.Order_ID,
                    a.OrderCategory,
                    row = ROW_NUMBER() OVER (PARTITION BY a.Updated_CustomerId, a.Order_ID, a.OrderCategory ORDER BY a.Current_CustomerID)
                FROM    #BaseTable b
                    INNER JOIN #DataSet a
                        ON b.PrimaryKey = a.PrimaryKey
            ) x
        ON bt.PrimaryKey = x.PrimaryKey
        AND x.row > 1
© www.soinside.com 2019 - 2024. All rights reserved.