Get duplicates with additional information

Question   Votes: 0   Answers: 4

I inherited a database and I'm having trouble building a working SQL query.

Suppose this is the data:

[Products]

| Id    | DisplayId     | Version   | Company   | Description   |
|----   |-----------    |---------- |-----------| -----------   |
| 1     | 12345         | 0         | 16        | Random        |
| 2     | 12345         | 0         | 2         | Random 2      |
| 3     | AB123         | 0         | 1         | Random 3      |
| 4     | 12345         | 1         | 16        | Random 4      |
| 5     | 12345         | 1         | 2         | Random 5      |
| 6     | AB123         | 0         | 5         | Random 6      |
| 7     | 12345         | 2         | 16        | Random 7      |
| 8     | XX45          | 0         | 5         | Random 8      |
| 9     | XX45          | 0         | 7         | Random 9      |
| 10    | XX45          | 1         | 5         | Random 10     |
| 11    | XX45          | 1         | 7         | Random 11     |


[Companies]

| Id    | Code      |
|----   |-----------|
| 1     | 'ABC'     |
| 2     | '456'     |
| 5     | 'XYZ'     |
| 7     | 'XYZ'     |
| 16    | '456'     |

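The Version column is the version number; a higher number means a newer version. The Company column is a foreign key referencing the Id column of the Companies table. There is also another table called ProductData, whose ProductId column references Products.Id.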

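Now I need to find duplicates based on DisplayId and the corresponding Companies.Code. The ProductData table should be joined to show the title (ProductData.Title), and only the latest version of each product should be included in the result. So the expected result is: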

| Id    | DisplayId     | Version   | Company   | Description   | ProductData.Title |
|----   |-----------    |---------- |-----------|-------------  |------------------ |
| 5     | 12345         | 1         | 2         | Random 5      | Title 5           |
| 7     | 12345         | 2         | 16        | Random 7      | Title 7           |
| 10    | XX45          | 1         | 5         | Random 10     | Title 10          |
| 11    | XX45          | 1         | 7         | Random 11     | Title 11          |
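
  • XX45 is included because it has 2 "entries": one with company 5 and one with company 7, and both companies share the same code.
  • 12345 is included because it has 2 "entries": one with company 2 and one with company 16, and both companies share the same code. Note that the latest version differs between the two (version 2 for company 16's entry and version 1 for company 2's entry).
  • AB123 should not be included, because its 2 entries have different company codes.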

I'm eager to hear your insights...

sql-server duplicates
4 Answers
0
votes

Try this:

SELECT b.Id, b.DisplayId, b.Version, b.Company, pd.Title
FROM (
    SELECT a.*, COUNT(*) OVER (PARTITION BY a.DisplayId, a.Code) AS cnt
    FROM (
        -- rn = 1 marks the latest version per DisplayId and Company
        SELECT p.Id, p.DisplayId, p.Version, p.Company, c.Code,
               ROW_NUMBER() OVER (PARTITION BY p.DisplayId, p.Company ORDER BY p.Version DESC) AS rn
        FROM Products AS p INNER JOIN Companies AS c ON p.Company = c.Id
    ) AS a
    WHERE a.rn = 1
) AS b
INNER JOIN ProductData AS pd ON pd.ProductId = b.Id
WHERE b.cnt > 1;

1
vote

Based on your sample data, you only need to JOIN the tables:

  SELECT 
    p.Id, p.DisplayId, p.Version, p.Company, d.Title
  FROM Products AS p
  INNER JOIN Companies AS c ON p.Company = c.Id
  INNER JOIN ProductData AS d ON d.ProductId = p.Id;

But if you want the latest version, you can use ROW_NUMBER():

WITH CTE
AS
(
  SELECT 
    p.Id, p.DisplayId, p.Version, p.Company, d.Title,
    ROW_NUMBER() OVER(PARTITION BY p.DisplayId,p.Company ORDER BY p.Version DESC) AS RN
  FROM Products AS p
  INNER JOIN Companies AS c ON p.Company = c.Id
  INNER JOIN ProductData AS d ON d.ProductId = p.Id
)
SELECT * 
FROM CTE
WHERE RN = 1;

sample fiddle

| Id | DisplayId | Version | Company |    Title |
|----|-----------|---------|---------|----------|
|  5 |     12345 |       1 |       2 |  Title 5 |
|  7 |     12345 |       2 |      16 |  Title 7 |
| 10 |      XX45 |       1 |       5 | Title 10 |
| 11 |      XX45 |       1 |       7 | Title 11 |
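
The query above keeps the latest row per DisplayId and Company, but it does not by itself restrict the result to DisplayIds shared by companies with the same Code. Below is a minimal sketch of one way to add that filter. It runs through Python's sqlite3 module purely so the logic can be checked against the question's sample data. SQLite stands in for SQL Server, and version 3.25 or later is assumed for window functions. The ProductData rows and their "Title N" values are also an assumption, since the question does not list them.

```python
import sqlite3

# In-memory copy of the question's sample data. SQLite is used only to
# check the logic; it is not SQL Server and needs 3.25+ for window functions.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Companies (Id INTEGER PRIMARY KEY, Code TEXT);
CREATE TABLE Products (Id INTEGER PRIMARY KEY, DisplayId TEXT, Version INTEGER,
                       Company INTEGER REFERENCES Companies(Id), Description TEXT);
CREATE TABLE ProductData (ProductId INTEGER REFERENCES Products(Id), Title TEXT);
""")
con.executemany("INSERT INTO Companies VALUES (?, ?)",
                [(1, "ABC"), (2, "456"), (5, "XYZ"), (7, "XYZ"), (16, "456")])
products = [
    (1, "12345", 0, 16), (2, "12345", 0, 2), (3, "AB123", 0, 1),
    (4, "12345", 1, 16), (5, "12345", 1, 2), (6, "AB123", 0, 5),
    (7, "12345", 2, 16), (8, "XX45", 0, 5), (9, "XX45", 0, 7),
    (10, "XX45", 1, 5), (11, "XX45", 1, 7),
]
con.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?, ?)",
    [(i, d, v, c, "Random" if i == 1 else f"Random {i}") for i, d, v, c in products],
)
# Assumption: one ProductData row per product, titled "Title <Id>".
con.executemany("INSERT INTO ProductData VALUES (?, ?)",
                [(i, f"Title {i}") for i, *_ in products])

QUERY = """
WITH Latest AS (
    SELECT p.Id, p.DisplayId, p.Version, p.Company, c.Code,
           ROW_NUMBER() OVER (PARTITION BY p.DisplayId, p.Company
                              ORDER BY p.Version DESC) AS RN
    FROM Products AS p
    INNER JOIN Companies AS c ON p.Company = c.Id
),
Counted AS (
    -- Among the latest rows only, count companies sharing a Code per DisplayId.
    SELECT Id, DisplayId, Version, Company,
           COUNT(*) OVER (PARTITION BY DisplayId, Code) AS Cnt
    FROM Latest
    WHERE RN = 1
)
SELECT k.Id, k.DisplayId, k.Version, k.Company, d.Title
FROM Counted AS k
INNER JOIN ProductData AS d ON d.ProductId = k.Id
WHERE k.Cnt > 1
ORDER BY k.Id;
"""
rows = con.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On this data the sketch returns products 5, 7, 10 and 11 and leaves out both AB123 rows, because companies 1 and 5 have different codes. The two CTEs should carry over to SQL Server largely unchanged, though that has not been verified here.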

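1
vote

If I understand correctly, you can use a CTE to find all the duplicated rows in the table, and then SELECT from the CTE or apply further operations to it.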

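WITH CTE AS(
   SELECT p.Id, p.DisplayId, p.Version, p.Company, p.Description, d.Title,
       RN = ROW_NUMBER() OVER(PARTITION BY p.DisplayId, p.Company ORDER BY p.Id DESC)
   FROM dbo.Products AS p  -- assumes the question's tables live in the dbo schema
   INNER JOIN dbo.ProductData AS d ON d.ProductId = p.Id
)

SELECT *
FROM CTE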

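0
votes

You first have to get the current versions, and then see how many times each DisplayID + Code combination appears. Based on that, you select only the combinations with a count greater than 1. Then, in the final query, you can INNER JOIN ProductData to get the title.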

WITH
MaxVersion AS --Get the current versions
(
    SELECT
        MAX(Version) AS Version,
        DisplayID,
        Company
    FROM
        #TmpProducts
    GROUP BY
        DisplayID,
        Company
)
,CTE AS
(
    SELECT
        p.DisplayID,
        c.Code,
        COUNT(*) AS RowCounter
    FROM
        #TmpProducts p
    INNER JOIN
        #TmpCompanies c
        ON
            c.ID = p.Company
    INNER JOIN
        MaxVersion mv
        ON
            mv.DisplayID = p.DisplayID
        AND mv.Version = p.Version
        AND mv.Company = p.Company
    GROUP BY
        p.DisplayID,
        c.Code
)

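SELECT 
    p.*,
    pd.Title
FROM
    #TmpProducts p
INNER JOIN
    #TmpCompanies co
    ON
        co.ID = p.Company
INNER JOIN
    CTE c
    ON
        c.DisplayID = p.DisplayID
    AND c.Code = co.Code --match on Code too, since the count is per DisplayID + Code
INNER JOIN
    MaxVersion mv
    ON
        mv.DisplayID = p.DisplayID
    AND mv.Company = p.Company
    AND mv.Version = p.Version
INNER JOIN
    ProductData pd --table name as given in the question
    ON
        pd.ProductId = p.ID
WHERE
    c.RowCounter > 1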
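
One detail worth checking with this approach: RowCounter is computed per DisplayID and Code, so the join back to the products should match on both. Below is a minimal sketch that compares the two join conditions, using Python's sqlite3 module on the question's sample data. SQLite stands in for SQL Server, ordinary tables replace the temp tables, and the CTE is renamed Dup. Product 12 is a hypothetical extra row, not in the question, that reuses DisplayId 12345 under company 1 (code 'ABC').

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Companies (Id INTEGER PRIMARY KEY, Code TEXT);
CREATE TABLE Products (Id INTEGER PRIMARY KEY, DisplayId TEXT,
                       Version INTEGER, Company INTEGER);
""")
con.executemany("INSERT INTO Companies VALUES (?, ?)",
                [(1, "ABC"), (2, "456"), (5, "XYZ"), (7, "XYZ"), (16, "456")])
con.executemany("INSERT INTO Products VALUES (?, ?, ?, ?)", [
    (1, "12345", 0, 16), (2, "12345", 0, 2), (3, "AB123", 0, 1),
    (4, "12345", 1, 16), (5, "12345", 1, 2), (6, "AB123", 0, 5),
    (7, "12345", 2, 16), (8, "XX45", 0, 5), (9, "XX45", 0, 7),
    (10, "XX45", 1, 5), (11, "XX45", 1, 7),
    (12, "12345", 0, 1),  # hypothetical row, not in the question
])

# {code_match} is filled in below to compare the two join conditions.
TEMPLATE = """
WITH MaxVersion AS (
    SELECT DisplayId, Company, MAX(Version) AS Version
    FROM Products
    GROUP BY DisplayId, Company
),
Dup AS (
    SELECT p.DisplayId, c.Code, COUNT(*) AS RowCounter
    FROM Products AS p
    INNER JOIN Companies AS c ON c.Id = p.Company
    INNER JOIN MaxVersion AS mv
        ON mv.DisplayId = p.DisplayId AND mv.Company = p.Company
       AND mv.Version = p.Version
    GROUP BY p.DisplayId, c.Code
)
SELECT p.Id
FROM Products AS p
INNER JOIN Companies AS co ON co.Id = p.Company
INNER JOIN MaxVersion AS mv
    ON mv.DisplayId = p.DisplayId AND mv.Company = p.Company
   AND mv.Version = p.Version
INNER JOIN Dup AS d ON d.DisplayId = p.DisplayId {code_match}
WHERE d.RowCounter > 1
ORDER BY p.Id;
"""

def latest_duplicate_ids(code_match):
    """Run the query with the given extra join condition and return the Ids."""
    sql = TEMPLATE.format(code_match=code_match)
    return [row[0] for row in con.execute(sql)]

by_display_id_only = latest_duplicate_ids("")
by_display_id_and_code = latest_duplicate_ids("AND d.Code = co.Code")
print(by_display_id_only)
print(by_display_id_and_code)
```

On this data the DisplayId-only join also returns the hypothetical product 12, even though its company code matches no other company for that DisplayId. The join that also matches Code returns only products 5, 7, 10 and 11.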