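I have a requirement to load data from a data file into a table using SQL*Loader. However, for similar channels the data arrives on a single line. ROW_ID is the primary key and is generated from a sequence. IDENTIFY is the internal channel identifier. IDENTIFY cannot be the primary key because that would violate first normal form. Here is the data file: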
IDENTIFY|CHANNEL_NAME|CHANNEL_PARTNERS |LOAD_DATE
1-ED |WEBSITE |"redbus","abhibus","amazon travel","irctc" |02-FEB-2022
1-LP |WALKIN |"physical reservation","printed reservation","current reservation"|04-FEB-2022
However, the data should end up in the database like this, loaded with SQLLDR:
IDENTIFY CHANNEL_NAME CHANNEL_PARTNERS LOAD_DATE
1-ED WEBSITE redbus 02-FEB-2022
1-ED WEBSITE abhibus 02-FEB-2022
1-ED WEBSITE amazon travel 02-FEB-2022
1-ED WEBSITE irctc 02-FEB-2022
1-LP WALKIN physical reservation 04-FEB-2022
1-LP WALKIN printed reservation 04-FEB-2022
1-LP WALKIN current reservation 04-FEB-2022
Below is the CTL file.
load data
infile 'mchannel.txt'
append into table master_channel
fields terminated by "|"
(
row_id "chan_seq.nextval",
identify,
channel_name,
channel_partners,
load_date
)
How can I achieve this with SQLLDR?
You can first create an auxiliary (staging) table
CREATE TABLE master_channel_
(
identify VARCHAR2(15),
channel_names VARCHAR2(25),
channel_partners VARCHAR2(500),
load_date VARCHAR2(25)
);
and load the data into it by running
$ . mchannel.sh
whose content is
sqlldr userid="un/pwd"@thedb control=mchannel.ctl log=mchannel.log bad=mchannel.bad errors=99999999 direct=y
and the .ctl file is
OPTIONS(skip=1)
LOAD DATA
INFILE 'mchannel.txt' "str '\n'"
TRUNCATE INTO TABLE master_channel_
FIELDS TERMINATED BY '|'
(
identify CHAR(15) "TRIM(:identify)",
channel_names CHAR(25) "TRIM(:channel_names)",
channel_partners CHAR(500) "TRIM(:channel_partners)",
load_date CHAR(25) "TRIM(:load_date)"
)
Then use the following query
INSERT INTO master_channel(identify,channel_names,channel_partners,load_date)
SELECT m.identify,
m.channel_names,
TRIM( BOTH '"' FROM REGEXP_SUBSTR(m.channel_partners,'[^,]+',1,level)),
TO_DATE(m.load_date,'dd-MON-yyyy')
FROM master_channel_ m
CONNECT BY level <= REGEXP_COUNT(m.channel_partners,',')+1
AND prior sys_guid() IS NOT NULL
AND prior m.identify = m.identify;
to insert into the main table while applying the needed transformations, assuming the table was created as
CREATE TABLE master_channel
(
row_id NUMBER generated always as identity,
identify VARCHAR2(15),
channel_names VARCHAR2(25),
channel_partners VARCHAR2(500),
load_date DATE
);
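The CONNECT BY clause in the query above generates REGEXP_COUNT(m.channel_partners,',')+1 rows per staged line, so the number of rows the INSERT should produce can be predicted from the data file itself. A minimal sketch of that check, assuming partner names never contain a comma or a pipe and that the first line of the file is a header; the function name expected_rows is hypothetical:

```shell
# Hypothetical helper: predict how many rows the split should produce.
# Reads the pipe-delimited data file on stdin.
expected_rows() {
  awk -F '|' '
    NR == 1 { next }                       # skip the header line
    { total += gsub(/,/, ",", $3) + 1 }    # commas in field 3, plus one
    END { print total + 0 }'
}

# Example run on the sample file from the question.
expected_rows <<'EOF'
IDENTIFY|CHANNEL_NAME|CHANNEL_PARTNERS |LOAD_DATE
1-ED |WEBSITE |"redbus","abhibus","amazon travel","irctc" |02-FEB-2022
1-LP |WALKIN |"physical reservation","printed reservation","current reservation"|04-FEB-2022
EOF
```

For the sample file this prints 7 (4 + 3), which should match the row count of master_channel after the INSERT, assuming the table was empty beforehand.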
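As Paul W suggested, a simple AWK can do the job of preparing the file, and you can take the opportunity to remove the double quotes (and/or change the delimiter as needed). Note that for (s in a) visits the array elements in an unspecified order, so the partners may come out in a different order than in the file, as in the output below: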
awk -F '|' '{split($3,a,/,/)}; { for (s in a) { print $1 "|" $2 "|" substr( a[s], 2, length(a[s])-2 ) "|" $4 } }'
1-ED|WEBSITE|abhibus|02-FEB-2022
1-ED|WEBSITE|amazon travel|02-FEB-2022
1-ED|WEBSITE|irctc|02-FEB-2022
1-ED|WEBSITE|redbus|02-FEB-2022
1-LP|WALKIN|printed reservation|04-FEB-2022
1-LP|WALKIN|current reservation|04-FEB-2022
1-LP|WALKIN|physical reservation|04-FEB-2022
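The one-liner above relies on unpadded fields and a file without a header line, and its for (s in a) loop does not guarantee the original order. A slightly more defensive sketch, assuming the first line is a header, fields may be padded with spaces or tabs, and partner names contain neither commas nor pipes; the names explode_partners and trim are hypothetical:

```shell
# Hypothetical helper: write one output line per partner.
# Reads the pipe-delimited file on stdin, writes pipe-delimited rows to stdout.
explode_partners() {
  awk -F '|' '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    NR == 1 { next }                   # skip the header line
    {
      n = split($3, parts, ",")        # n = number of partners on this line
      for (i = 1; i <= n; i++) {       # index loop keeps the file order
        p = trim(parts[i])
        gsub(/^"|"$/, "", p)           # drop the surrounding double quotes
        print trim($1) "|" trim($2) "|" p "|" trim($4)
      }
    }'
}

# Example run on the sample file from the question.
explode_partners <<'EOF'
IDENTIFY|CHANNEL_NAME|CHANNEL_PARTNERS |LOAD_DATE
1-ED |WEBSITE |"redbus","abhibus","amazon travel","irctc" |02-FEB-2022
1-LP |WALKIN |"physical reservation","printed reservation","current reservation"|04-FEB-2022
EOF
```

In practice the function would be fed the real file, for example explode_partners < mchannel.txt > mchannel_flat.txt (the output file name is an assumption), and the flattened file could then be loaded with a plain FIELDS TERMINATED BY '|' control file.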