How to parse a string with multiple underscores and dashes

Question (votes: 0, answers: 2)

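I need some guidance and am looking for a better way to parse strings.

Example 1:

String: Automation_LOY_Loyalty_PC_CampaignName3-Abandoners-Email1_NoPromo_USA

CSV: Automation,LOY,Loyalty,PC,CampaignName3,Abandoners,Email1,NoPromo,USA

Example 2:

String: 20200601_LOY_Functional_PC_CampaignName1_NoPromo_CAN

CSV: 20200601,LOY,Functional,PC,CampaignName1,,,NoPromo,CAN

As you can see, some strings do not contain every field, so some of the CSV fields need to be left blank.

At the moment I am using the code below, and it is very messy. Is there a better way to handle this than chaining CHARINDEX calls like this?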

SELECT [EmailName]
      -- Deployment Type
      ,CASE LEFT([EmailName], (CHARINDEX('_', [EmailName])) - 1)
        WHEN 'Automation' THEN 'Automation' 
        ELSE 'AdHoc' END AS [Deployment]
      -- Type
      ,SUBSTRING([EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1, CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1) + 1) - CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1) - 1) AS [Type]
      -- Customer_Type
      ,SUBSTRING([EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1) + 1) + 1, CHARINDEX('_', [EmailName],CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1)) - CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) - 1) AS [Customer_Type]
      -- Campaign_Name
      ,CASE WHEN (CHARINDEX('-', [EmailName])) = 0 THEN SUBSTRING([EmailName], (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1))) + 1, (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1)) + 1))) - (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1))) - 1)
            WHEN (CHARINDEX('-', [EmailName])) > 0 THEN SUBSTRING([EmailName], (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1))) + 1, ((CHARINDEX('-', [EmailName])) - (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1)))) - 1)
            ELSE NULL END AS [Campaign_Name]
      ,CASE WHEN (CHARINDEX('-', [EmailName])) = 0 THEN 1
            WHEN (CHARINDEX('-', [EmailName])) > 0 THEN REPLACE(SUBSTRING([EmailName], ((CHARINDEX('-', [EmailName])) + 1), (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1)) + 1))) - (CHARINDEX('-', [EmailName])) - 1),'Email','')
            ELSE NULL END AS [Email_Num]
FROM TableName
sql tsql
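2 Answers
1
vote

I won't answer the question completely, but I can give you some pointers anyway. In SQL Server you can split a string with the STRING_SPLIT function (available from SQL Server 2016), but it accepts only a single delimiter. Since you seem to want to split on both hyphens (-) and underscores (_), what you can do is:

  • Replace all hyphens with underscores
  • Then split the string

Example: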

declare @str varchar(500)
set @str = 'Automation_LOY_Loyalty_PC_CampaignName3-Abandoners-Email1_NoPromo_USA'
set @str = REPLACE(@str, '-', '_')

SELECT 
    value  
FROM 
    STRING_SPLIT(@str,'_')

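Result (9 rows):

Automation
LOY
Loyalty
PC
CampaignName3
Abandoners
Email1
NoPromo
USA

The problem is the second string: 20200601_LOY_Functional_PC_CampaignName1_NoPromo_CAN

Result (7 rows):

20200601
LOY
Functional
PC
CampaignName1
NoPromo
CAN

There is no easy way to tell which fields are missing. If every blank field had a "placeholder" underscore we could count them, but that is not the case here. Maybe you can devise a rule, but I'm not sure SQL is the answer. Perhaps you should try a scripting language such as PowerShell to split the strings, look at the pattern, and work out which fields are present or missing.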
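One rule that seems to fit the two sample strings (it is an assumption, not something the question confirms): underscores always separate seven top-level segments, and the optional parts only ever appear inside the fifth segment, separated by hyphens. Under that assumption, a minimal sketch is to turn each string into a JSON array and read the segments by index, so a missing part simply comes back as NULL. The table variable @Names and the column names Pos1 to Pos7 are hypothetical, and STRING_ESCAPE and JSON_VALUE require SQL Server 2016 or later.

```sql
-- Sketch only. Assumes SQL Server 2016+ and that every name has exactly
-- seven underscore-separated segments, as in the two sample strings.
DECLARE @Names TABLE (EmailName varchar(150));   -- hypothetical stand-in for TableName
INSERT INTO @Names VALUES
 ('Automation_LOY_Loyalty_PC_CampaignName3-Abandoners-Email1_NoPromo_USA')
,('20200601_LOY_Functional_PC_CampaignName1_NoPromo_CAN');

SELECT N.EmailName
      ,Pos1  = JSON_VALUE(P.Parts, '$[0]')
      ,Pos2  = JSON_VALUE(P.Parts, '$[1]')
      ,Pos3  = JSON_VALUE(P.Parts, '$[2]')
      ,Pos4  = JSON_VALUE(P.Parts, '$[3]')
      ,Pos5a = JSON_VALUE(C.Sub,   '$[0]')   -- first hyphen-separated part of segment 5
      ,Pos5b = JSON_VALUE(C.Sub,   '$[1]')   -- NULL when segment 5 has no hyphen
      ,Pos5c = JSON_VALUE(C.Sub,   '$[2]')   -- NULL when there is no third part
      ,Pos6  = JSON_VALUE(P.Parts, '$[5]')
      ,Pos7  = JSON_VALUE(P.Parts, '$[6]')
FROM @Names N
-- Turn 'a_b_c' into the JSON array ["a","b","c"] so segments can be read by index
CROSS APPLY (VALUES ('["' + REPLACE(STRING_ESCAPE(N.EmailName, 'json'), '_', '","') + '"]')) P(Parts)
-- Do the same for the fifth segment, this time splitting on hyphens
CROSS APPLY (VALUES ('["' + REPLACE(STRING_ESCAPE(JSON_VALUE(P.Parts, '$[4]'), 'json'), '-', '","') + '"]')) C(Sub);
```

Reading by index avoids relying on the row order of STRING_SPLIT, which is not guaranteed. If real data contains names with a different number of underscores, the rule above no longer holds and the positions will shift.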


0
votes

Another option is a little XML.

It is not clear whether you want a comma-delimited string or separate columns.

Example

Declare @YourTable Table ([ID] varchar(50),[SomeCol] varchar(150))
Insert Into @YourTable Values 
 (1,'Automation_LOY_Loyalty_PC_CampaignName3-Abandoners-Email1_NoPromo_USA')
,(2,'20200601_LOY_Functional_PC_CampaignName1_NoPromo_CAN')

Select A.ID 
      ,B.Pos1
      ,B.Pos2
      ,B.Pos3
      ,B.Pos4
      ,Pos5a = ltrim(rtrim(xmlData.value('/x[1]','varchar(max)')))
      ,Pos5b = ltrim(rtrim(xmlData.value('/x[2]','varchar(max)')))
      ,Pos5c = ltrim(rtrim(xmlData.value('/x[3]','varchar(max)')))
      ,B.Pos6
      ,B.Pos7
 From  @YourTable A
 Cross Apply ( 
                Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
                      ,Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
                      ,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
                      ,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
                      ,Pos5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
                      ,Pos6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
                      ,Pos7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
                From  ( values (cast('<x>' + replace((Select replace(SomeCol,'_','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml)))  A(xDim)
             ) B
 Cross Apply ( values (cast('<x>' + replace((Select replace(Pos5,'-','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) ) ) C(xmlData)

Returns

[Image: query result]
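If a single comma-delimited string is wanted rather than separate columns, the same XML split can feed CONCAT, which treats NULL as an empty string, so missing parts become empty CSV fields. This is only a sketch built on the query above; the Csv column name is an assumption, and CONCAT needs SQL Server 2012 or later.

```sql
Declare @YourTable Table ([ID] varchar(50),[SomeCol] varchar(150))
Insert Into @YourTable Values 
 (1,'Automation_LOY_Loyalty_PC_CampaignName3-Abandoners-Email1_NoPromo_USA')
,(2,'20200601_LOY_Functional_PC_CampaignName1_NoPromo_CAN')

-- Same two-level split as above; only the select list changes.
-- concat() renders NULL as '', so absent sub-fields leave empty CSV positions.
Select A.ID 
      ,Csv = concat( B.Pos1,',',B.Pos2,',',B.Pos3,',',B.Pos4,','
                    ,ltrim(rtrim(xmlData.value('/x[1]','varchar(max)'))),','
                    ,ltrim(rtrim(xmlData.value('/x[2]','varchar(max)'))),','
                    ,ltrim(rtrim(xmlData.value('/x[3]','varchar(max)'))),','
                    ,B.Pos6,',',B.Pos7 )
 From  @YourTable A
 Cross Apply ( 
                Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
                      ,Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
                      ,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
                      ,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
                      ,Pos5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
                      ,Pos6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
                      ,Pos7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
                From  ( values (cast('<x>' + replace((Select replace(SomeCol,'_','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml)))  A(xDim)
             ) B
 Cross Apply ( values (cast('<x>' + replace((Select replace(Pos5,'-','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) ) ) C(xmlData)
```

As with the column version, this assumes the optional parts only ever appear inside the fifth underscore-separated segment.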
