I need some guidance and am looking for a better way to parse strings.
Example 1:
String:Automation_LOY_Loyalty_PC_CampaignName3-Abandoners-Email1_NoPromo_USA
CSV:Automation,LOY,Loyalty,PC,CampaignName3,Abandoners,Email1,NoPromo,USA
Example 2:
String:20200601_LOY_Functional_PC_CampaignName1_NoPromo_CAN
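CSV:20200601,LOY,Functional,PC,CampaignName1,,,NoPromo,CAN
As you can see, some strings do not contain every field, so the corresponding CSV fields need to be left empty.
At the moment I am using the code below, and it is very messy. Is there a better way to handle this than nesting CHARINDEX calls like this?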
SELECT [EmailName]
-- Deployment Type
,CASE LEFT([EmailName], (CHARINDEX('_', [EmailName])) - 1)
WHEN 'Automation' THEN 'Automation'
ELSE 'AdHoc' END AS [Deployment]
-- Type
,SUBSTRING([EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1, CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1) + 1) - CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1) - 1) AS [Type]
-- Customer_Type
,SUBSTRING([EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1) + 1) + 1, CHARINDEX('_', [EmailName],CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1)) - CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) - 1) AS [Customer_Type]
-- Campaign_Name
,CASE WHEN (CHARINDEX('-', [EmailName])) = 0 THEN SUBSTRING([EmailName], (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1))) + 1, (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1)) + 1))) - (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1))) - 1)
WHEN (CHARINDEX('-', [EmailName])) > 0 THEN SUBSTRING([EmailName], (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1))) + 1, ((CHARINDEX('-', [EmailName])) - (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1)))) - 1)
ELSE NULL END AS [Campaign_Name]
,CASE WHEN (CHARINDEX('-', [EmailName])) = 0 THEN 1
WHEN (CHARINDEX('-', [EmailName])) > 0 THEN REPLACE(SUBSTRING([EmailName], ((CHARINDEX('-', [EmailName])) + 1), (CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName], (CHARINDEX('_', [EmailName])) + 1)) + 1) + 1)) + 1))) - (CHARINDEX('-', [EmailName])) - 1),'Email','')
ELSE NULL END AS [Email_Num]
FROM TableName
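I won't answer this completely, but I can give you some pointers. In SQL Server (2016 and later) you can split a string with the STRING_SPLIT function, but it accepts only a single delimiter. Since you appear to want to split on both hyphens (-) and underscores (_), you can replace one with the other first and then split.
Example: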
declare @str varchar(500)
set @str = 'Automation_LOY_Loyalty_PC_CampaignName3-Abandoners-Email1_NoPromo_USA'
set @str = REPLACE(@str, '-', '_')
SELECT
value
FROM
STRING_SPLIT(@str,'_')
Result (9 rows):
Automation
LOY
Loyalty
PC
CampaignName3
Abandoners
Email1
NoPromo
USA
The problem is the second string: 20200601_LOY_Functional_PC_CampaignName1_NoPromo_CAN.
Result (7 rows):
20200601
LOY
Functional
PC
CampaignName1
NoPromo
CAN
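Once everything is split on a single delimiter, there is no simple way to tell which fields are missing. If every empty field had a "placeholder" underscore we could count them, but that is not the case here. Perhaps you can devise a rule, but I am not sure SQL is the answer. Maybe you should try a scripting language such as PowerShell to split the string, examine the pattern, and work out which fields are present or missing.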
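To make the "devise a rule" idea concrete, here is a minimal sketch in Python (standing in for PowerShell as the scripting language). It assumes, based only on the two sample strings, that splitting on underscores always yields exactly seven parts and that only the fifth part may carry up to two optional hyphen-separated pieces. The function name `split_email_name` is made up for illustration, and the rule will need adjusting if the real data breaks these assumptions.

```python
def split_email_name(name):
    """Split an email name into nine fields, leaving missing ones empty.

    Assumed rule (inferred from the two samples, not confirmed by the asker):
    7 underscore-separated parts; part 5 is Campaign[-Segment[-Email]].
    """
    parts = name.split("_")
    if len(parts) != 7:
        raise ValueError(
            "expected 7 underscore-separated parts, got %d: %r" % (len(parts), name)
        )

    # Only the campaign block is variable: pad it to exactly three pieces.
    campaign = parts[4].split("-")
    if len(campaign) > 3:
        raise ValueError("too many hyphen-separated pieces in %r" % parts[4])
    campaign += [""] * (3 - len(campaign))

    return parts[:4] + campaign + parts[5:]


samples = [
    "Automation_LOY_Loyalty_PC_CampaignName3-Abandoners-Email1_NoPromo_USA",
    "20200601_LOY_Functional_PC_CampaignName1_NoPromo_CAN",
]
for s in samples:
    print(",".join(split_email_name(s)))
# Automation,LOY,Loyalty,PC,CampaignName3,Abandoners,Email1,NoPromo,USA
# 20200601,LOY,Functional,PC,CampaignName1,,,NoPromo,CAN
```

Splitting on the underscore first and only then on the hyphen is what keeps the positions stable: the optional pieces can only go missing inside the fifth part, so they never shift the fields after it. Strings that do not fit the assumed shape raise an error instead of being silently misaligned.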
Another option is a little XML.
It is not clear whether you want a comma-delimited string or separate columns.
Example
Declare @YourTable Table ([ID] varchar(50),[SomeCol] varchar(150))
Insert Into @YourTable Values
(1,'Automation_LOY_Loyalty_PC_CampaignName3-Abandoners-Email1_NoPromo_USA')
,(2,'20200601_LOY_Functional_PC_CampaignName1_NoPromo_CAN')
Select A.ID
,B.Pos1
,B.Pos2
,B.Pos3
,B.Pos4
,Pos5a = ltrim(rtrim(xmlData.value('/x[1]','varchar(max)')))
,Pos5b = ltrim(rtrim(xmlData.value('/x[2]','varchar(max)')))
,Pos5c = ltrim(rtrim(xmlData.value('/x[3]','varchar(max)')))
,B.Pos6
,B.Pos7
From @YourTable A
Cross Apply (
Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
,Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
,Pos5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
,Pos6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
,Pos7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
From ( values (cast('<x>' + replace((Select replace(SomeCol,'_','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml))) A(xDim)
) B
Cross Apply ( values (cast('<x>' + replace((Select replace(Pos5,'-','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) ) ) C(xmlData)
Returns