The following query returns the results shown below:
SELECT
ProjectID, newID.value
FROM
[dbo].[Data] WITH(NOLOCK)
CROSS APPLY
STRING_SPLIT([bID],';') AS newID
WHERE
newID.value IN ('O95833', 'Q96NY7-2')
Result:
ProjectID value
---------------------
2 Q96NY7-2
2 O95833
2 O95833
2 Q96NY7-2
2 O95833
2 Q96NY7-2
4 Q96NY7-2
4 Q96NY7-2
Using the newly added STRING_AGG function (in SQL Server 2017), as in the following query, I can get the result set below.
SELECT
ProjectID,
STRING_AGG( newID.value, ',') WITHIN GROUP (ORDER BY newID.value) AS
NewField
FROM
[dbo].[Data] WITH(NOLOCK)
CROSS APPLY
STRING_SPLIT([bID],';') AS newID
WHERE
newID.value IN ('O95833', 'Q96NY7-2')
GROUP BY
ProjectID
ORDER BY
ProjectID
Result:
ProjectID NewField
-------------------------------------------------------------
2 O95833,O95833,O95833,Q96NY7-2,Q96NY7-2,Q96NY7-2
4 Q96NY7-2,Q96NY7-2
I would like my final output to contain only the unique elements, like this:
ProjectID NewField
-------------------------------
2 O95833, Q96NY7-2
4 Q96NY7-2
Any suggestions on how to get this result? Feel free to refine or redesign my query from scratch if needed.
Use the DISTINCT keyword in a subquery to remove duplicates before combining the results:
SELECT
ProjectID
,STRING_AGG(value, ',') WITHIN GROUP (ORDER BY value) AS
NewField
from (
select distinct ProjectId, newId.value
FROM [dbo].[Data] WITH(NOLOCK)
CROSS APPLY STRING_SPLIT([bID],';') AS newID
WHERE newID.value IN ( 'O95833' , 'Q96NY7-2' )
) x
GROUP BY ProjectID
ORDER BY ProjectID
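To try the subquery approach without access to the real table, a minimal sketch along these lines may help. The table variable @Data, its bID strings, and the extra ID 'P12345' are made-up assumptions (the question does not show the contents of [dbo].[Data]); only STRING_SPLIT and STRING_AGG are relied upon, so SQL Server 2017 or later is assumed.

```sql
-- Hypothetical sample data: the real [dbo].[Data] rows are not shown in the question.
DECLARE @Data TABLE (ProjectID INT, bID NVARCHAR(MAX));

INSERT INTO @Data (ProjectID, bID)
VALUES (2, N'Q96NY7-2;O95833'),
       (2, N'O95833;Q96NY7-2;P12345'),   -- 'P12345' is an invented ID that the filter drops
       (2, N'O95833;Q96NY7-2'),
       (4, N'Q96NY7-2'),
       (4, N'Q96NY7-2;P12345');

-- Deduplicate (ProjectID, value) pairs first, then aggregate once per project.
SELECT x.ProjectID,
       STRING_AGG(x.value, ',') WITHIN GROUP (ORDER BY x.value) AS NewField
FROM (
    SELECT DISTINCT d.ProjectID, s.value
    FROM @Data AS d
    CROSS APPLY STRING_SPLIT(d.bID, ';') AS s
    WHERE s.value IN ('O95833', 'Q96NY7-2')
) AS x
GROUP BY x.ProjectID
ORDER BY x.ProjectID;
```

With these rows the query should return O95833,Q96NY7-2 for project 2 and Q96NY7-2 for project 4.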
Here is a function I wrote that answers the OP's title. Improvements welcome!
CREATE OR ALTER FUNCTION [dbo].[fn_DistinctWords]
(
@String NVARCHAR(MAX)
)
RETURNS NVARCHAR(MAX)
WITH SCHEMABINDING
AS
BEGIN
DECLARE @Result NVARCHAR(MAX);
WITH MY_CTE AS ( SELECT Distinct(value) FROM STRING_SPLIT(@String, ' ') )
SELECT @Result = STRING_AGG(value, ' ') FROM MY_CTE
RETURN @Result
END
GO
Use it like this:
SELECT dbo.fn_DistinctWords('One Two Three Two One');
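One possible refinement: STRING_AGG without WITHIN GROUP does not guarantee the order of the concatenated values, so the order of the words in the result may vary. The sketch below shows the same split, DISTINCT, aggregate idea with an explicit sort. It is written as a standalone query rather than as a change to the function, and the variable and alias names are illustrative.

```sql
-- Standalone variant of the split / DISTINCT / aggregate idea with a defined order.
DECLARE @String NVARCHAR(MAX) = N'One Two Three Two One';

SELECT STRING_AGG(w.value, ' ') WITHIN GROUP (ORDER BY w.value) AS DistinctWords
FROM (
    SELECT DISTINCT value
    FROM STRING_SPLIT(@String, ' ')
) AS w;
```

For this input the words come back sorted alphabetically: One Three Two.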
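You can use distinct in the subquery used for the apply. Note that this removes duplicates only within each row's bID list, so a value that appears in several rows with the same ProjectID can still show up more than once: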
SELECT d.ProjectID,
STRING_AGG( newID.value, ',') WITHIN GROUP (ORDER BY newID.value) AS
NewField
FROM [dbo].[Data] d CROSS APPLY
(select distinct value
from STRING_SPLIT(d.[bID], ';') AS newID
) newID
WHERE newID.value IN ( 'O95833' , 'Q96NY7-2' )
group by projectid;
Here is my improvement on @ttugates's function that makes it more generic:
CREATE OR ALTER FUNCTION [dbo].[fn_DistinctList]
(
@String NVARCHAR(MAX),
@Delimiter char(1)
)
RETURNS NVARCHAR(MAX)
WITH SCHEMABINDING
AS
BEGIN
DECLARE @Result NVARCHAR(MAX);
WITH MY_CTE AS ( SELECT Distinct(value) FROM STRING_SPLIT(@String,
@Delimiter) )
SELECT @Result = STRING_AGG(value, @Delimiter) FROM MY_CTE
RETURN @Result
END
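A usage sketch for the generic version, assuming dbo.fn_DistinctList has been created exactly as shown above. The input string is made up for illustration.

```sql
-- Assumes dbo.fn_DistinctList from the answer above already exists in the database.
-- Each distinct value appears once; the order is not guaranteed,
-- because the function calls STRING_AGG without WITHIN GROUP.
SELECT dbo.fn_DistinctList('O95833;Q96NY7-2;O95833', ';') AS DistinctList;
```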
As @SeanLange pointed out in the comments, this is a terrible way of pulling out the data, but if you have to, just split it into two separate queries, like this:
SELECT
ProjectID
,STRING_AGG( val, ',') WITHIN GROUP (ORDER BY val) AS NewField
FROM
(
SELECT DISTINCT
ProjectID
,newID.value AS val
FROM
[dbo].[Data] WITH(NOLOCK)
CROSS APPLY STRING_SPLIT([bID],';') AS newID
WHERE
newID.value IN ('O95833' , 'Q96NY7-2')
) t
GROUP BY
ProjectID
That should do it.
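Another possibility for getting unique strings from STRING_AGG is to perform these three steps after fetching the comma-separated string:
1. Split the string (STRING_SPLIT)
2. Select DISTINCT from the split values
3. Apply STRING_AGG again, to a select grouped on a single key
Example: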
(select STRING_AGG(CAST(value as VARCHAR(MAX)), ',')
from (SELECT distinct 1 single_key, value
FROM STRING_SPLIT(STRING_AGG(CAST(customer_division as VARCHAR(MAX)), ','), ','))
q group by single_key) as customer_division
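The three steps can also be sketched as a self-contained query. The table variable @Orders, its columns, and its rows are assumptions made up for illustration; the CTE and CROSS APPLY layout is one way to arrange the steps, not necessarily how the fragment above was embedded in its original query.

```sql
-- Hypothetical sample data for illustration only.
DECLARE @Orders TABLE (customer_id INT, customer_division VARCHAR(50));

INSERT INTO @Orders (customer_id, customer_division)
VALUES (1, 'North'), (1, 'South'), (1, 'North'),
       (2, 'East'),  (2, 'East');

WITH agg AS (
    -- Initial aggregation: the comma-separated string that still contains duplicates.
    SELECT customer_id,
           STRING_AGG(CAST(customer_division AS VARCHAR(MAX)), ',') AS divisions
    FROM @Orders
    GROUP BY customer_id
)
SELECT a.customer_id,
       u.customer_division
FROM agg AS a
CROSS APPLY (
    -- Steps 1 to 3: split, keep DISTINCT values, aggregate again.
    SELECT STRING_AGG(d.value, ',') WITHIN GROUP (ORDER BY d.value) AS customer_division
    FROM (
        SELECT DISTINCT value
        FROM STRING_SPLIT(a.divisions, ',')
    ) AS d
) AS u
ORDER BY a.customer_id;
```

This only works when the individual values never contain the separator themselves; a division name with a comma in it would be split into two entries.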
You can create a distinct view of the table that holds the values you want to aggregate, which is even simpler:
Create Table Test (field1 varchar(1), field2 varchar(1));
go
Create View DistinctTest as (Select distinct field1, field2 from test group by field1,field2);
go
insert into Test Select 'A', '1';
insert into Test Select 'A', '2';
insert into Test Select 'A', '2';
insert into Test Select 'A', '2';
insert into Test Select 'D', '1';
insert into Test Select 'D', '1';
select string_agg(field1, ',') from Test where field2 = '1'; /* duplicates: A,D,D */;
select string_agg(field1, ',') from DistinctTest where field2 = '1'; /* no duplicates: A,D */;
If you want to include other aggregates in the query, you can do something like this:
DROP TABLE IF EXISTS #data
CREATE TABLE #data (row_id INT IDENTITY(1,1), projectID INT, value NVARCHAR(40), cost FLOAT)
INSERT INTO #data(projectID, value, cost )
VALUES
(2,'Q96NY7-2',100)
,(2,'O95833' ,100)
,(2,'O95833' ,100)
,(2,'Q96NY7-2',100)
,(2,'O95833' ,100)
,(2,'Q96NY7-2',100)
,(4,'Q96NY7-2',100)
,(4,'Q96NY7-2',100)
SELECT projectID = d.projectID
, value = REPLACE(STRING_AGG(IIF(x.row_id = d.row_id, x.value, '(x)'),',') WITHIN GROUP (ORDER BY IIF(x.row_id = d.row_id, x.value, '(x)')), '(x),','')
, Cost = SUM(d.COST)
FROM #data d
JOIN ( SELECT DISTINCT projectid, value, row_id = MIN(row_id)
FROM #data
GROUP BY projectid, value
) x ON x.projectid = d.projectid AND x.value = d.value
GROUP BY d.projectID
projectID | value | Cost |
---|---|---|
2 | O95833,Q96NY7-2 | 600 |
4 | Q96NY7-2 | 200 |
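A different technique that avoids the '(x)' placeholder and the REPLACE call relies on the documented behaviour that STRING_AGG ignores NULL values. The sketch below numbers the rows within each (projectID, value) pair and passes only the first occurrence to STRING_AGG, while SUM still sees every row. It swaps in ROW_NUMBER for the self-join used above, and it uses a table variable with the same rows as #data so that it runs on its own.

```sql
-- Same rows as the #data example above, in a table variable so the sketch is self-contained.
DECLARE @data TABLE (projectID INT, value NVARCHAR(40), cost FLOAT);

INSERT INTO @data (projectID, value, cost)
VALUES (2, 'Q96NY7-2', 100), (2, 'O95833', 100), (2, 'O95833', 100),
       (2, 'Q96NY7-2', 100), (2, 'O95833', 100), (2, 'Q96NY7-2', 100),
       (4, 'Q96NY7-2', 100), (4, 'Q96NY7-2', 100);

SELECT t.projectID,
       -- Only the first row of each (projectID, value) pair is non-NULL,
       -- and STRING_AGG skips NULLs, so each value is listed once.
       STRING_AGG(CASE WHEN t.rn = 1 THEN t.value END, ',')
           WITHIN GROUP (ORDER BY t.value) AS value,
       SUM(t.cost) AS Cost               -- still sums every row
FROM (
    SELECT projectID, value, cost,
           ROW_NUMBER() OVER (PARTITION BY projectID, value
                              ORDER BY (SELECT NULL)) AS rn
    FROM @data
) AS t
GROUP BY t.projectID
ORDER BY t.projectID;
```

For the sample rows this should give the same result as the table above: O95833,Q96NY7-2 with a cost of 600 for project 2, and Q96NY7-2 with a cost of 200 for project 4.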
Oracle (since version 19c) supports listagg (DISTINCT ..., but Microsoft SQL Server's STRING_AGG, as far as I know, does not accept DISTINCT.