Filtering data from vertical to horizontal in Linux, using a field as the unique key


I have data in which each record begins with <SUBBEGIN and ends with <SUBEND:

<SUBBEGIN
SUBSCRIBERIDENTIFIER=803838478;
PAIDTYPE=0;
SUBSCRIPTION=TOOMUCH&73337E0380B4B30F&1&AAA&BBB&CCC&1&1&FFFFFFFFFFFFFFFF&255&1&255&256&FFFFFFFFFFFFFFFF&0&0&128&1&255&255&FFFFFFFFFFFFFF&FFFFFFFFFFFFFF&0&0&0&1&0&0&1;
SUBSCRIPTION=TASKS&E7CC601262AB3535&1&DDD&EEE&FFF&2&1&FFFFFFFFFFFFFFFF&255&0&255&256&FFFFFFFFFFFFFFFF&0&0&128&1&255&255&FFFFFFFFFFFFFF&FFFFFFFFFFFFFF&0&21&0&1&0&0&1;
<SUBEND
<SUBBEGIN
SUBSCRIBERIDENTIFIER=705959905;
PAIDTYPE=254;
SUBSCRIPTION=REALLY&73337E0380B4B30F&1&GGG&HHH&LLL&1&1&FFFFFFFFFFFFFFFF&255&1&255&256&FFFFFFFFFFFFFFFF&0&0&128&1&255&255&FFFFFFFFFFFFFF&FFFFFFFFFFFFFF&0&0&0&1&0&0&1;
SUBSCRIPTION=TIRED&E7CC601262AB3535&1&MMM&NNN&PPP&2&1&FFFFFFFFFFFFFFFF&255&0&255&256&FFFFFFFFFFFFFFFF&0&0&128&1&255&255&FFFFFFFFFFFFFF&FFFFFFFFFFFFFF&0&21&0&1&0&0&1;
<SUBEND    

I want to turn it into a horizontal layout using only some of the fields. The planned header for the result looks like this:

SUBSCRIBERIDENTIFIER,,,PAIDTYPE,,1,255,SERVICENAME,SUBSCRIBEDATETIME,VALIDFROMDATETIME,EXPIREDDATETIME,,,,,

Based on that data:

SUBSCRIBERIDENTIFIER sample is 803838478 (we can see it in SUBSCRIBERIDENTIFIER)
PAIDTYPE sample is 0 (we can see it in PAIDTYPE)
SERVICENAME sample is TOOMUCH (we can see it in SUBSCRIPTION)
SUBSCRIBEDATETIME sample is AAA (we can see it in SUBSCRIPTION)
VALIDFROMDATETIME sample is BBB (we can see it in SUBSCRIPTION)
EXPIREDDATETIME sample is CCC (we can see it in SUBSCRIPTION)
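
To check where these values sit inside a SUBSCRIPTION line, here is a minimal sketch that splits one line on `=`, `;` and `&` and prints the first seven fields with their positions. The helper name `show_fields` is made up for illustration, and the sample line is a shortened copy of the first SUBSCRIPTION line.

```shell
# Hypothetical helper: print the first seven fields of one line, one per row.
show_fields() {
    printf '%s\n' "$1" | awk -F'[=;&]' '{ for (i = 1; i <= 7; i++) print i ": " $i }'
}

# Shortened copy of the first SUBSCRIPTION line (trailing fields omitted).
show_fields 'SUBSCRIPTION=TOOMUCH&73337E0380B4B30F&1&AAA&BBB&CCC&1&1;'
```

With this separator, the service name lands in field 2 and the three date values in fields 5, 6 and 7.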

So the expected result is:

803838478,,,0,,1,255,TOOMUCH,AAA,BBB,CCC,,,,,
803838478,,,0,,1,255,TASKS,DDD,EEE,FFF,,,,,
705959905,,,254,,1,255,REALLY,GGG,HHH,LLL,,,,,
705959905,,,254,,1,255,TIRED,MMM,NNN,PPP,,,,,

I tried this script:

awk -F"&" '/^<SUBBEGIN$/{a=1} a && /^[[:blank:]]+(SUBSCRIBERIDENTIFIER|PAIDTYPE|SUBSCRIPTION)/{l=l OFS $1} a && /^<SUBEND$/ {print l; a=l=""}' sample.txt

But the result is not what I expected:

SUBSCRIBERIDENTIFIER=803838478;         PAIDTYPE=0;         SUBSCRIPTION=TOOMUCH         SUBSCRIPTION=TASKS
SUBSCRIBERIDENTIFIER=705959905;         PAIDTYPE=254;         SUBSCRIPTION=REALLY         SUBSCRIPTION=TIRED

I would appreciate your advice. Thank you.

linux filter
1 Answer
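awk -F'[=;&]' -v OFS=',' '
    # Only look at lines from <SUBBEGIN through <SUBEND (inclusive)
    /<SUBBEGIN/,/<SUBEND/{
        if($1 == "SUBSCRIPTION"){
            # Each SUBSCRIPTION line becomes one output row:
            # $2 is the service name, $5..$7 are the three date fields
            i++
            a["SUBSCRIPTIONS"]=i
            a["SERVICENAME"i]=$2
            a["SUBSCRIBEDATETIME"i]=$5
            a["VALIDFROMDATETIME"i]=$6
            a["EXPIREDDATETIME"i]=$7
        }else{
            # KEY=value; lines such as SUBSCRIBERIDENTIFIER and PAIDTYPE
            a[$1]=$2
        }
    }
    # End of record: print one CSV row per stored subscription
    /<SUBEND/{
        for(i=1; i<=a["SUBSCRIPTIONS"]; i++){
            print ( \
                    a["SUBSCRIBERIDENTIFIER"],
                    "","",
                    a["PAIDTYPE"],
                    "",1,255,
                    a["SERVICENAME"i],
                    a["SUBSCRIBEDATETIME"i],
                    a["VALIDFROMDATETIME"i],
                    a["EXPIREDDATETIME"i],
                    "","","","","" \
                )
        }
        # Reset the counters so a record without SUBSCRIPTION lines prints nothing
        i=0
        a["SUBSCRIPTIONS"]=0
     }
' file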

803838478,,,0,,1,255,TOOMUCH,AAA,BBB,CCC,,,,,
803838478,,,0,,1,255,TASKS,DDD,EEE,FFF,,,,,
705959905,,,254,,1,255,REALLY,GGG,HHH,LLL,,,,,
705959905,,,254,,1,255,TIRED,MMM,NNN,PPP,,,,,
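
A possible variant, sketched under a few assumptions: SUBSCRIBERIDENTIFIER and PAIDTYPE always come before the SUBSCRIPTION lines of their record (as in the sample), and the real file may have leading blanks before each field line (the `[[:blank:]]+` in the question's attempt suggests this). In that case each row can be printed as soon as its SUBSCRIPTION line is read, without storing anything in an array. The function name `subs_to_csv` is made up for illustration.

```shell
# Hypothetical helper: reads the file(s) named as arguments, or stdin if none.
subs_to_csv() {
    awk -F'[=;&]' -v OFS=',' '
        # Drop optional leading blanks; changing $0 makes awk re-split the fields
        { sub(/^[[:blank:]]+/, "") }
        # New record: clear the values carried over from the previous one
        /^<SUBBEGIN/                 { id = paid = "" }
        $1 == "SUBSCRIBERIDENTIFIER" { id = $2 }
        $1 == "PAIDTYPE"             { paid = $2 }
        # One CSV row per SUBSCRIPTION line, in the 16-column layout of the header
        $1 == "SUBSCRIPTION" {
            print id, "", "", paid, "", 1, 255, $2, $5, $6, $7, "", "", "", "", ""
        }
    ' "$@"
}
```

It would be called as `subs_to_csv sample.txt`. The trade-off is that it relies on field order within a record, whereas the array-based version above does not.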