Find repeated words in SQL table entries [closed]

Question   Votes: 1   Answers: 2

I have a table with two columns, "EntityName" and "entityid".

 Entityid       EntityName
    1234        ABC inch EFG inch
    3456        inch* aaa inch vvv

Can anyone give me a query that finds rows containing repeated words like these?

sql sql-server tsql
2 Answers
2
votes

You can try something like this:

DECLARE @DataSource TABLE
(   
    [EntityID] INT
   ,[Situation] VARCHAR(MAX)
);

INSERT INTO @DataSource ([EntityID], [Situation])
VALUES (1234, 'ABC inch EFG inch')
      ,(3456, 'inch aaa inch vvv')
      ,(1, 'only one inch');

DECLARE @Search VARCHAR(12) = 'inch';

-- Keep rows where @Search occurs, and still occurs after its first match is removed.
SELECT *
FROM @DataSource
WHERE CHARINDEX(@Search, [Situation]) > 0
    AND CHARINDEX(@Search, STUFF([Situation], CHARINDEX(@Search, [Situation]), LEN(@Search), '')) > 0;

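The idea is to check whether your word matches, then remove that first match with STUFF and check whether there is still another match.

Of course, this is very simple matching: it is a plain substring search, so 'inch' would also match inside a longer word such as 'inches'. If you implement SQL CLR functions to get regular-expression support in T-SQL, you can add more sophisticated conditions.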
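As a variation on the same idea (a sketch, not part of the answer's original query): instead of removing the first match and searching again, you can count the matches by comparing the string's length before and after deleting every occurrence with REPLACE. The `Occurrences` alias is made up for illustration, and the sketch assumes a single-byte VARCHAR collation so that DATALENGTH equals the character count. Like the query above, it is plain substring matching.

```sql
DECLARE @DataSource TABLE
(
    [EntityID] INT
   ,[Situation] VARCHAR(MAX)
);

INSERT INTO @DataSource ([EntityID], [Situation])
VALUES (1234, 'ABC inch EFG inch')
      ,(3456, 'inch aaa inch vvv')
      ,(1, 'only one inch');

DECLARE @Search VARCHAR(12) = 'inch';

-- Occurrences = (bytes removed by deleting every match) / (bytes in one match).
-- DATALENGTH is used instead of LEN because LEN ignores trailing spaces.
SELECT [EntityID], [Situation], c.Occurrences
FROM @DataSource
CROSS APPLY
(
    SELECT (DATALENGTH([Situation]) - DATALENGTH(REPLACE([Situation], @Search, '')))
           / DATALENGTH(@Search) AS Occurrences
) AS c
WHERE c.Occurrences > 1;
```

With the sample rows, entities 1234 and 3456 should come back with a count of 2, and the 'only one inch' row is filtered out. @Search must not be empty, otherwise the division fails.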


2
votes

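If you are using SQL Server 2017, you can try the following query with STRING_SPLIT (the function itself is available from SQL Server 2016, at database compatibility level 130 or higher):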

CREATE TABLE #TestData(Entityid int,Situation varchar(100))

INSERT #TestData(Entityid,Situation)VALUES
(1234,'ABC inch EFG inch'),
(3456,'inch aaa inch vvv'),
(7890,'BBBB aaa inch vvv')

SELECT *
FROM #TestData d
WHERE EXISTS(SELECT value FROM STRING_SPLIT(d.Situation,' ') WHERE value<>N'' GROUP BY value HAVING COUNT(*)>1)

DROP TABLE #TestData

You can also show the counts (STRING_AGG requires SQL Server 2017 or later):

CREATE TABLE #TestData(Entityid int,Situation varchar(100))

INSERT #TestData(Entityid,Situation)VALUES
(1234,'ABC inch EFG inch'),
(3456,'inch aaa inch vvv aaa aaa'),
(7890,'BBBB aaa inch vvv')

SELECT
  *,
  (
    SELECT STRING_AGG(CONCAT(value,'*',cnt),', ')
    FROM
      (
        SELECT value,COUNT(*) cnt FROM STRING_SPLIT(d.Situation,' ') WHERE value<>N'' GROUP BY value HAVING COUNT(*)>1
      ) q
  ) DuplicatedWords
FROM #TestData d
WHERE EXISTS(SELECT value FROM STRING_SPLIT(d.Situation,' ') WHERE value<>N'' GROUP BY value HAVING COUNT(*)>1)

DROP TABLE #TestData

Result:

Entityid    Situation                    DuplicatedWords
1234        ABC inch EFG inch            inch*2
3456        inch aaa inch vvv aaa aaa    aaa*3, inch*2
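
One thing to watch for: the sample data in the question contains 'inch*', and STRING_SPLIT treats 'inch*' and 'inch' as different words. A rough sketch of one way around that, assuming '*' is the only stray character you need to drop, is to strip it with REPLACE before splitting. The `RepeatedWord` and `cnt` column names are just illustrative, and this version returns one row per repeated word instead of an aggregated list.

```sql
CREATE TABLE #TestData(Entityid int, Situation varchar(100));

INSERT #TestData(Entityid, Situation) VALUES
(1234, 'ABC inch EFG inch'),
(3456, 'inch* aaa inch vvv'),
(7890, 'BBBB aaa inch vvv');

-- Strip '*' first so that 'inch*' and 'inch' are counted as the same word,
-- then split on spaces and keep only the words that appear more than once per row.
SELECT d.Entityid, d.Situation, w.value AS RepeatedWord, COUNT(*) AS cnt
FROM #TestData d
CROSS APPLY STRING_SPLIT(REPLACE(d.Situation, '*', ''), ' ') w
WHERE w.value <> ''
GROUP BY d.Entityid, d.Situation, w.value
HAVING COUNT(*) > 1;

DROP TABLE #TestData;
```

Whether 'Inch' and 'inch' count as the same word depends on the collation in effect, so check that against your own data.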