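I have a CSV file on HDFS and I am trying to create an Impala table from it. The table is created, but every value comes out wrapped in double quotes (" ").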
CREATE external TABLE abc.def
(
name STRING,
title STRING,
last STRING,
pno STRING
)
row format delimited fields terminated by ','
location 'hdfs:pathlocation'
tblproperties ("skip.header.line.count"="1") ;
The output is:
name title last pno
"abc" "mr" "xyz" "1234"
"rew" "ms" "pre" "654"
I just want to create a table from the CSV file without the quotes. Please point out where I am going wrong. Regards, R
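One approach is to create a staging table, load the file into it with the quotes still in place, and then build the final table with CTAS (CREATE TABLE AS SELECT), cleaning each field with the replace function.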
CREATE TABLE quote_stage(
id STRING,
name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
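The load step is not shown here. A minimal sketch of it, assuming the quoted CSV sits at a placeholder HDFS path (the path below is hypothetical, not taken from the question), might look like this. Note that LOAD DATA moves the file into the table's directory rather than copying it.

```sql
-- Hypothetical path: replace with the real location of the quoted CSV file.
LOAD DATA INPATH '/user/example/quoted.csv' INTO TABLE quote_stage;

-- Inspect the staging table; the quotes are still part of each value.
SELECT * FROM quote_stage;
```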
+-----+----------+
| id | name |
+-----+----------+
| "1" | "pepe" |
| "2" | "ana" |
| "3" | "maria" |
| "4" | "ramon" |
| "5" | "lucia" |
| "6" | "carmen" |
| "7" | "alicia" |
| "8" | "pedro" |
+-----+----------+
CREATE TABLE t_quote
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
AS SELECT replace(id,'"','') AS id, replace(name,'"','') AS name FROM quote_stage;
+----+--------+
| id | name |
+----+--------+
| 1 | pepe |
| 2 | ana |
| 3 | maria |
| 4 | ramon |
| 5 | lucia |
| 6 | carmen |
| 7 | alicia |
| 8 | pedro |
+----+--------+
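Applied to the table in the question, the same idea could be sketched as follows. The name abc.def_clean is hypothetical, and the existing external table abc.def is used directly as the staging table. This only works when no quoted field contains a comma, because the delimiter splits the row before replace ever runs.

```sql
-- Sketch only: abc.def_clean is a hypothetical name for the cleaned table.
-- abc.def is the external table from the question, still holding quoted values.
CREATE TABLE abc.def_clean
STORED AS TEXTFILE
AS SELECT
  replace(name, '"', '')   AS name,
  replace(title, '"', '')  AS title,
  replace(`last`, '"', '') AS `last`,  -- backticks in case last is treated as a reserved word
  replace(pno, '"', '')    AS pno
FROM abc.def;
```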
Hope this helps.