I need to import data into a .csv file using a comma (,) as the delimiter.
I am using the following Sqoop options.
--optionally-enclosed-by '\"' --escaped-by '\\'
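Below are the input data and the output I want.
Input: "foo
Output I want: ""foo
But this is what I get: input "foo gives output "foo
Another example:
Input: foo"
Output I want: foo""
But this is what I get: input foo" gives output foo"
How can I get the desired output?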
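Please refer to the Sqoop Guide, 7.2.11. Large Objects, for a better understanding of --enclosed-by, --escaped-by and --optionally-enclosed-by, with samples.
Based on the question, here is what I understand.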
--fields-terminated-by , : needed because you want a file with a comma as the delimiter.
--optionally-enclosed-by '\"' : encloses only the fields whose data contains the delimiter (a comma here).
--escaped-by \\ : escapes the enclosing character (a double quote here) when it appears in the data of a field that requires enclosing.
Example:
Input: Suppose the source table holds the data below in its columns. For illustration, I use a pipe (|) as the column separator.
Some string, with a comma.|1|2|3...
Another "string with quotes"|4|5|6...
Output: sqoop import --fields-terminated-by , --enclosed-by '\"' --escaped-by \\ ...
"Some string, with a comma.","1","2","3"...
"Another \"string with quotes\"","4","5","6"...
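Explanation: Every field is terminated by a comma, and every field is enclosed in double quotes. If a field's data contains double quotes, those quotes are escaped with a backslash (\), as shown in the second row.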
Output: sqoop import --fields-terminated-by , --optionally-enclosed-by '\"' --escaped-by \\ ...
"Some string, with a comma.",1,2,3...
"Another \"string with quotes\"",4,5,6...
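Explanation: Every field is terminated by a comma, and only the fields that contain a comma are enclosed in double quotes. If a field's data contains double quotes, those quotes are escaped with a backslash (\), as shown in the second row, and that column is enclosed as well.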
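To make the two behaviors concrete, here is a minimal Python sketch that emulates the formatting shown in the outputs above. It is not Sqoop's code: format_record and its parameters are hypothetical names, and the rule that a field is enclosed when it contains either the delimiter or the enclosing character is an assumption taken from the sample outputs in this answer. Only the first four columns of each sample row are used, since the rest are elided.

```python
def format_record(fields, delimiter=",", enclose='"', escape="\\", optional=False):
    """Emulate the output formatting described above (hypothetical helper, not Sqoop's API)."""
    out = []
    for value in fields:
        text = str(value)
        # Assumption from the sample outputs: with optional enclosing, a field is
        # enclosed when it contains the delimiter or the enclosing character.
        needs_enclosing = (not optional) or delimiter in text or enclose in text
        if needs_enclosing:
            # Escape the escape character first, then the enclosing character.
            text = text.replace(escape, escape + escape)
            text = text.replace(enclose, escape + enclose)
            text = enclose + text + enclose
        out.append(text)
    return delimiter.join(out)


rows = [
    ["Some string, with a comma.", 1, 2, 3],
    ['Another "string with quotes"', 4, 5, 6],
]

for row in rows:
    print(format_record(row))                 # like --enclosed-by
for row in rows:
    print(format_record(row, optional=True))  # like --optionally-enclosed-by
# Prints:
# "Some string, with a comma.","1","2","3"
# "Another \"string with quotes\"","4","5","6"
# "Some string, with a comma.",1,2,3
# "Another \"string with quotes\"",4,5,6
```

Running a few sample values through a helper like this before a full import is a cheap way to check whether a given combination of options can produce the output you want.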
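For your case:
Input: Suppose the source table holds the data below in the corresponding columns. For illustration, I use a pipe (|) as the column separator.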
"foo|bar"|1|2
foo"|3|4|"bar
Possible output: sqoop import --fields-terminated-by , --enclosed-by '\"' --escaped-by \\ ...
"\"foo","bar\"","1","2"
"foo\"","3","4","\"bar"
Possible output: sqoop import --fields-terminated-by , --optionally-enclosed-by '\"' --escaped-by \\ ...
"\"foo","bar\"",1,2
"foo\"",3,4,"\"bar"