Replacing double quotes (inside the text qualifiers) in a CSV for SSIS import

Question  Votes: 2  Answers: 4

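I have an SSIS package that imports data from a .csv file. Every entry in this file uses double quotes (") as the text qualifier, but double quotes also occur inside the values. The column delimiter is a comma (,). I can't share the original data I'm working with, but here is a sample of how my data arrives in the Flat File Source: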

"ID-1","A "B"", C, D, E","Today"
"ID-2","A, B, C, D, E,F","Yesterday"
"ID-3","A and nothing else","Today"

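As you can see, the second column can contain quotes (and commas). The quotes break my SSIS import, and the error points to such a row. I'm not really familiar with regular expressions, but I've heard they might help in this case.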

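As I see it, I need to replace every double quote (") with a single quote ('), except when...

  • ...the quote is at the beginning of a line
  • ...the quote is at the end of a line
  • ...the quote is part of ","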

Could someone help me with this? That would be great!

Thanks in advance!

regex csv import ssis quote
4 Answers
1
votes

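To replace the double quotes with single quotes according to your specification, use this simple regular expression. It allows whitespace at the beginning and/or end of a line.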

'

Here is an explanation of the pattern:

","

0
votes

You can split the columns using a regular expression match pattern.

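using System.Text.RegularExpressions; // provides Regex and RegexOptions
// Replace every double quote that is not acting as a field qualifier with a single quote.
string pattern = @"(?<!^\s*|,)""(?!,""|\s*$)";
string resultString = Regex.Replace(subjectString, pattern, "'", RegexOptions.Multiline);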

// (?<!^\s*|,)"(?!,"|\s*$)
//
// Options: ^ and $ match at line breaks
//
// Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) «(?<!^\s*|,)»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «^\s*»
//       Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
//       Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*»
//          Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «,»
//       Match the character “,” literally «,»
// Match the character “"” literally «"»
// Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!,"|\s*$)»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «,"»
//       Match the characters “,"” literally «,"»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «\s*$»
//       Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*»
//          Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
//       Assert position at the end of a line (at the end of the string or before a line break character) «$»
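To see what the replacement does to the sample rows from the question, here is a minimal Python sketch. It is an adaptation, not the .NET code itself: Python's `re` module only accepts fixed-width lookbehinds, so the `^\s*` alternative is reduced to `^` (leading whitespace before the first qualifier is assumed not to occur). The name `soften_inner_quotes` is made up for illustration.

```python
import re

# Adapted from the .NET pattern (?<!^\s*|,)"(?!,"|\s*$).
# Assumption: no whitespace precedes the opening qualifier of a line,
# because Python's re cannot express the variable-width lookbehind ^\s*.
INNER_QUOTE = re.compile(r'(?<!^)(?<!,)"(?!,"|\s*$)', re.MULTILINE)


def soften_inner_quotes(text):
    """Replace double quotes that are not field qualifiers with single quotes."""
    return INNER_QUOTE.sub("'", text)


sample = "\n".join([
    '"ID-1","A "B"", C, D, E","Today"',
    '"ID-2","A, B, C, D, E,F","Yesterday"',
    '"ID-3","A and nothing else","Today"',
])

cleaned = soften_inner_quotes(sample)
print(cleaned)
```

Only the first row changes: the three quotes inside its second field become single quotes. One limitation worth knowing: a quote that directly follows a comma inside a field (for example `A,"B"`) still looks like a qualifier to this pattern and is left untouched.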


0
votes

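When loading a CSV that contains both double quotes and commas, one limitation is that extra double quotes get added and the data itself ends up wrapped in double quotes, as you can see in the preview of the source file. So add a Derived Column task with the following expression: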

(REPLACE(REPLACE(RIGHT((((((((((((((((((( “),”@“,”“)

The part in bold removes the double quotes that enclose the data.

Try this and let me know whether it helps.


0
votes

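Use the text qualifier /(?:(?<=^")|(?<=",")).*?(?:(?="\s*$)|(?=","))/g for the CSV destination, and add a derived column expression before inserting the values into the CSV destination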

demo

This will keep the " in your text field.
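The lookaround pattern quoted in this answer can be tried out as a field extractor. The sketch below is Python rather than an SSIS component, and it assumes the pattern is applied one line at a time with the `/.../g` delimiters removed. The helper name `split_fields` is made up for illustration.

```python
import re

# Pattern from the answer, without the /.../g delimiters.
# It matches the text between an opening qualifier (line start or ",")
# and the next closing qualifier ("," or the quote that ends the line).
FIELD = re.compile(r'(?:(?<=^")|(?<=",")).*?(?:(?="\s*$)|(?=","))')


def split_fields(line):
    """Return the field values of one line, leaving inner quotes untouched."""
    return FIELD.findall(line)


rows = [
    '"ID-1","A "B"", C, D, E","Today"',
    '"ID-2","A, B, C, D, E,F","Yesterday"',
    '"ID-3","A and nothing else","Today"',
]

for row in rows:
    print(split_fields(row))
```

Each sample row yields three values, and the second value of the first row still contains its inner double quotes. This works only as long as the sequence `","` never occurs inside a value.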
