How to extract the backtick-quoted, comma-separated column names between the parentheses of a PARTITIONED BY clause

Question (votes: -2, answers: 1)

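I have a shell script that extracts the CREATE TABLE statement for every table in a database. I loop over the statements one at a time, and inside the loop each statement is available as the variable $DATA. I need to extract the column names listed in the PARTITIONED BY clause of each CREATE TABLE statement.

For example, with $DATA as the loop variable:

Input to the loop, iteration 1: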

DATA="CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int, `permi` varchar(100)) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')"

Output for iteration 1: dataoutput = depth,permi

Input to the loop, iteration 2:

DATA="CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')"

Output for iteration 2: dataoutput = depth

Input to the loop, iteration 3:

DATA="CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int, `permi` varchar(100), `www` int) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')"

Output for iteration 3: dataoutput = depth,permi,www

shell perl awk sed
1 answer
0 votes

Try this:

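use strict;
use warnings;

# One comma-separated list of partition columns per CREATE TABLE statement.
my @bcktik;
while (my $line = <DATA>)
{
    # Capture the text inside PARTITIONED BY ( ... ), allowing one level of
    # nested parentheses such as varchar(100).
    if ($line =~ m/PARTITIONED\s+BY\s*\(((?:[^()]|\([^()]*\))*)\)/i)
    {
        my $clause = $1;
        # Column names are the backtick-quoted tokens inside the clause.
        push @bcktik, join(",", $clause =~ m/`([^`]*)`/g);
    }
}
print "$_\n" for @bcktik;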

__DATA__
CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int, `permi` varchar(100)) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')

CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')

CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint, `usrname` varchar(100)) PARTITIONED BY ( `depth` int, `permi` varchar(100), `www` int) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'xxx' TBLPROPERTIES ( 'transient_lastDdlTime'='1519784177')
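
Since the question is about a shell loop over $DATA, the same idea can be sketched directly in the shell. This is only a rough sketch under a few assumptions: `partition_columns` is a hypothetical helper name, PARTITIONED BY is assumed to be in upper case, and at most one level of nested parentheses (as in varchar(100)) is assumed inside the clause. It uses sed to isolate the clause and grep to pick out the backtick-quoted names.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): print the partition
# columns of one CREATE TABLE statement as a comma-separated list.
partition_columns() {
    local ddl=$1 clause
    # Keep only the text inside PARTITIONED BY ( ... ); one level of nested
    # parentheses, e.g. varchar(100), is allowed. Assumes upper-case keywords.
    clause=$(printf '%s\n' "$ddl" |
        sed -nE 's/.*PARTITIONED[[:space:]]+BY[[:space:]]*\(((\([^()]*\)|[^()])*)\).*/\1/p')
    # Pull out the backtick-quoted names, strip the backticks, join with commas.
    printf '%s\n' "$clause" | grep -o '`[^`]*`' | tr -d '`' | paste -sd, -
}

# A quoted heredoc delimiter keeps the backticks literal; inside double quotes
# the shell would run `xxx` as a command substitution.
IFS= read -r DATA <<'EOF'
CREATE TABLE `xxx`( `path` varchar(200), `fsize` bigint) PARTITIONED BY ( `depth` int, `permi` varchar(100)) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
EOF

dataoutput=$(partition_columns "$DATA")
echo "dataoutput = $dataoutput"   # prints: dataoutput = depth,permi
```

Inside the real loop, only the last two lines would be needed for each statement. When a statement has no PARTITIONED BY clause, the captured result is an empty string.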