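I'm importing error-log flat files into my SQL Server and need to parse a tab-delimited column into multiple columns. Borrowing heavily from this question (SQL Split Tab Delimited Column), and in particular from @Lobo's answer, I want to accomplish a couple of things:
I can live without the first goal (a dynamic number of columns) for now; it's the second goal that I'm having trouble with.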
DECLARE @SAMPLE_TABLE table(
    [Column 0]      nvarchar(4000),
    [Filename]      nvarchar(260),
    FileExtention   varchar(255),
    DateTimeStamp   datetime,
    CustomerNumber  varchar(255),
    FileType        varchar(255),
    ImportSetNumber varchar(255)
)
Some sample data for the table:
[Column 0] | [Filename] | FileExtention | DateTimeStamp | CustomerNumber | FileType | ImportSetNumber
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1<tab>Import Set No (A): 03300001: Contact ID (G): Invalid contact ID for this customer and company....Taker (I): Invalid taker.<tab><tab>| E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001
1<tab>Import Set No (A): 03300001: General Error: This Record and its related Records failed validation.<tab>0<tab>218 | E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001
1<tab>Import Set No (A): 04040186: General Error: This Record and its related Records failed validation.<tab>0<tab>17 | E:\path\to\files\Errors\SO_OHF_18120_20230404084926_04040186.err | err | 2023-04-04 08:49:26.000 | 18120 | OHF | 04040186
Let the games begin:
;WITH
CTE_Columns AS(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) 'MyRowID',
           [Filename],
           FileExtention,
           DateTimeStamp,
           CustomerNumber,
           FileType,
           ImportSetNumber,
           A.ColID 'ColumnNumber',
           A.Cols 'ColumnValue'
    FROM @SAMPLE_TABLE
    CROSS APPLY (
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS ColID,
               value [Cols]
        FROM STRING_SPLIT([Column 0], CHAR(9)) -- split by tab character
    )A
)
SELECT MyRowID,
       [Filename],
       FileExtention,
       DateTimeStamp,
       CustomerNumber,
       FileType,
       ImportSetNumber,
       NULLIF(TRIM([1]), '') 'FirstColumn',
       NULLIF(TRIM([2]), '') 'SecondColumn',
       NULLIF(TRIM([3]), '') 'ThirdColumn',
       NULLIF(TRIM([4]), '') 'FourthColumn'
FROM (
    SELECT MyRowID,
           [Filename],
           FileExtention,
           DateTimeStamp,
           CustomerNumber,
           FileType,
           ImportSetNumber,
           ColumnNumber,
           ColumnValue
    FROM CTE_Columns
)Q
PIVOT(MAX(Q.ColumnValue) FOR ColumnNumber IN([1], [2], [3], [4])) PIV
ORDER BY CustomerNumber,
         ImportSetNumber
This query produces the following result set:
MyRowID | Filename | FileExtention | DateTimeStamp | CustomerNumber | FileType | ImportSetNumber | FirstColumn | SecondColumn | ThirdColumn | FourthColumn
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | 1 | NULL | NULL | NULL
2 | E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | NULL | Import Set No (A): 03300001: Contact ID (G): Invalid contact ID for this customer and company....Taker (I): Invalid taker. | NULL | NULL
3 | E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | NULL | NULL | NULL | NULL
4 | E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | NULL | NULL | NULL | NULL
5 | E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | 1 | NULL | NULL | NULL
6 | E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | NULL | Import Set No (A): 03300001: General Error: This Record and its related Records failed validation. | NULL | NULL
7 | E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | NULL | NULL | 0 | NULL
8 | E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | NULL | NULL | NULL | 218
9 | E:\path\to\files\Errors\SO_OHF_18120_20230404084926_04040186.err | err | 2023-04-04 08:49:26.000 | 18120 | OHF | 04040186 | 1 | NULL | NULL | NULL
10 | E:\path\to\files\Errors\SO_OHF_18120_20230404084926_04040186.err | err | 2023-04-04 08:49:26.000 | 18120 | OHF | 04040186 | NULL | Import Set No (A): 04040186: General Error: This Record and its related Records failed validation. | NULL | NULL
11 | E:\path\to\files\Errors\SO_OHF_18120_20230404084926_04040186.err | err | 2023-04-04 08:49:26.000 | 18120 | OHF | 04040186 | NULL | NULL | 0 | NULL
12 | E:\path\to\files\Errors\SO_OHF_18120_20230404084926_04040186.err | err | 2023-04-04 08:49:26.000 | 18120 | OHF | 04040186 | NULL | NULL | NULL | 17
Based on the result set above, rows 1-4 should be one record, rows 5-8 should be a second record, and rows 9-12 should be a third, which gives the end state I want:
Filename | FileExtention | DateTimeStamp | CustomerNumber | FileType | ImportSetNumber | FirstColumn | SecondColumn | ThirdColumn | FourthColumn
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | 1 | Import Set No (A): 03300001: Contact ID (G): Invalid contact ID for this customer and company....Taker (I): Invalid taker. | NULL | NULL
E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err | err | 2023-03-30 11:36:36.000 | 10047 | OHF | 03300001 | 1 | Import Set No (A): 03300001: General Error: This Record and its related Records failed validation. | 0 | 218
E:\path\to\files\Errors\SO_OHF_18120_20230404084926_04040186.err | err | 2023-04-04 08:49:26.000 | 18120 | OHF | 04040186 | 1 | Import Set No (A): 04040186: General Error: This Record and its related Records failed validation. | 0 | 17
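I believe this is just a simple matter of grouping correctly, but I'm not sure what to group on, or where the grouping belongs (whether in a PARTITION BY or in a standard GROUP BY clause somewhere).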
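For reference, here is a minimal sketch of one possible direction, not a definitive fix. It assumes the extra rows come from `MyRowID` being numbered after the `CROSS APPLY`, so every split piece gets its own ID, and `PIVOT` (which implicitly groups by every column it is not pivoting or aggregating) then keeps the pieces on separate rows. Numbering the source rows in their own CTE before the split gives all pieces of one row the same ID. `CTE_Source` and `SourceRowID` are names made up for this sketch, and the `INSERT` only reproduces the three sample rows, with `NCHAR(9)` standing in for `<tab>`.

```sql
DECLARE @SAMPLE_TABLE table(
    [Column 0]      nvarchar(4000),
    [Filename]      nvarchar(260),
    FileExtention   varchar(255),
    DateTimeStamp   datetime,
    CustomerNumber  varchar(255),
    FileType        varchar(255),
    ImportSetNumber varchar(255)
);

-- The three sample rows from the question; NCHAR(9) is the tab character.
INSERT INTO @SAMPLE_TABLE
VALUES
(N'1' + NCHAR(9) + N'Import Set No (A): 03300001: Contact ID (G): Invalid contact ID for this customer and company....Taker (I): Invalid taker.' + NCHAR(9) + NCHAR(9),
 N'E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err', 'err', '2023-03-30T11:36:36', '10047', 'OHF', '03300001'),
(N'1' + NCHAR(9) + N'Import Set No (A): 03300001: General Error: This Record and its related Records failed validation.' + NCHAR(9) + N'0' + NCHAR(9) + N'218',
 N'E:\path\to\files\Errors\SO_OHF_10047_20230330113636_03300001.err', 'err', '2023-03-30T11:36:36', '10047', 'OHF', '03300001'),
(N'1' + NCHAR(9) + N'Import Set No (A): 04040186: General Error: This Record and its related Records failed validation.' + NCHAR(9) + N'0' + NCHAR(9) + N'17',
 N'E:\path\to\files\Errors\SO_OHF_18120_20230404084926_04040186.err', 'err', '2023-04-04T08:49:26', '18120', 'OHF', '04040186');

;WITH
CTE_Source AS (
    -- Hypothetical helper CTE: number each source row BEFORE splitting,
    -- so every piece of the same row carries the same SourceRowID.
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS SourceRowID,
           [Column 0],
           [Filename],
           FileExtention,
           DateTimeStamp,
           CustomerNumber,
           FileType,
           ImportSetNumber
    FROM @SAMPLE_TABLE
),
CTE_Columns AS (
    SELECT S.SourceRowID,
           S.[Filename],
           S.FileExtention,
           S.DateTimeStamp,
           S.CustomerNumber,
           S.FileType,
           S.ImportSetNumber,
           A.ColID AS ColumnNumber,
           A.Cols  AS ColumnValue
    FROM CTE_Source AS S
    CROSS APPLY (
        -- Same numbering trick as in the question. Caveat: the order in which
        -- STRING_SPLIT returns rows is not documented as guaranteed.
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS ColID,
               value AS Cols
        FROM STRING_SPLIT(S.[Column 0], CHAR(9)) -- split by tab character
    ) AS A
)
SELECT [Filename],
       FileExtention,
       DateTimeStamp,
       CustomerNumber,
       FileType,
       ImportSetNumber,
       NULLIF(TRIM([1]), '') AS FirstColumn,
       NULLIF(TRIM([2]), '') AS SecondColumn,
       NULLIF(TRIM([3]), '') AS ThirdColumn,
       NULLIF(TRIM([4]), '') AS FourthColumn
FROM CTE_Columns
-- PIVOT groups by every remaining column, including SourceRowID,
-- so each source row collapses to exactly one output row.
PIVOT (MAX(ColumnValue) FOR ColumnNumber IN ([1], [2], [3], [4])) AS PIV
ORDER BY CustomerNumber,
         ImportSetNumber,
         SourceRowID;
```

Keeping `SourceRowID` in the pivot input matters here because the first two sample rows share every other column value; grouping on those columns alone would merge them. If the server is SQL Server 2022 or Azure SQL Database, the optional third argument of `STRING_SPLIT` (`enable_ordinal`) exposes an `ordinal` column that could replace the inner `ROW_NUMBER()` and remove the reliance on undocumented ordering.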