我对键值对的复杂sqlldr解析的搜索很薄。所以发布一个适合我的需求的例子,你可以适应。
问题:数百万行Tomcat访问日志,例如:
time='[01/Jan/2001:00:00:03 +0000]' srcip='192.168.0.1' localip='10.0.0.1' referer='-' url='/limsM/SamplesGet-SampleMaster?samplefilters=%5B%22parent_sample%20%3D%208504571%22%2C%22status%20%3D%20'D'%22%5D&depthfilters=%5B%22scale_id%20%3D%2011311%22%5D' servername='yo.yo.dyne.org' rspms='218' rspbytes='2198'
将被分析到此Oracle表中,以便于分析所选参数。
create table transfer.loganal (
time date
, timestr varchar2(30)
, srcip varchar2(75)
, localip varchar2(15)
, referer clob
, uri clob
, servername varchar2(50)
, rspms number
, rspbytes number
, logsource varchar2(50)
);
sqlldr控制脚本看起来会是什么样的呢?
这是我的第一个工作解决方案。随时欢迎改进,建议和改进。
给定Tomcat访问目录中的日志,例如
yoyotomcat/
combined.20010101
combined.20010102
...
这个文件保存为combined.ctl作为yoyotomcat的兄弟
-- Load an Apache common log format
-- essentially key-value pairs
-- example line of source data
-- time='[01/Jan/2001:00:00:03 +0000]' srcip='192.168.0.1' localip='10.0.0.1' referer='-' url='/limsM/SamplesGet-SampleMaster?samplefilters=%5B%22parent_sample%20%3D%208504571%22%2C%22status%20%3D%20'D'%22%5D&depthfilters=%5B%22scale_id%20%3D%2011311%22%5D' servername='yo.yo.dyne.org' rspms='218' rspbytes='2198'
--
LOAD DATA
INFILE 'yoyodyne/combined.2001*' "STR '\n'"
TRUNCATE INTO TABLE transfer.loganal
TRAILING NULLCOLS
(
time enclosed by "time='[" and "+0000]' " "to_date(:time, 'dd/Mon/yyyy:hh24:mi:ss')"
, srcip enclosed by "srcip='" and "' "
, localip enclosed by "localip='" and "' "
, referer char(10000) enclosed by "referer='" and "' "
, uri char(10000) enclosed by "url='" and "' "
, servername enclosed by "servername='" and "' "
, rspms enclosed by "rspms='" and "' " "decode(:rspms, '-', null, to_number(:rspms))"
, rspbytes enclosed by "rspbytes='" and "'" "decode(:rspbytes, '-', null, to_number(:rspbytes))"
, logsource "'munchausen'"
)
通过从命令提示符运行来加载假设的示例内容
sqlldr userid=buckaroo@banzai direct=true control=combined.ctl
你的旅费可能会改变。我在Oracle 12上。这里使用的功能可能相对较新。不确定。
照明
干杯。