文本文件转csv脚本

问题描述 投票:0回答:1

我们正在使用一个脚本来接收一些数据并将其转换为 csv 文件,我们使用的脚本如下:

!/bin/bash

  read -p "Enter /DIR/PATH/FILENAME where you wish to copy the data: " FILENAME


 cat test.txt |
    awk '
    BEGIN {
        OFS = ","
        numTags = split("insert_job job_type box_name command machine owner date_conditions condition run_calendar exclude_calendar days_of_week run_window start_times start_mins resources profile term_run_time watch_file watch_interval",tags)
        for ( tagNr=1; tagNr<=numTags; tagNr++ ) {
            tag = tags[tagNr]
            printf "\"%s\"%s", tag, (tagNr<numTags ? OFS : ORS)
        }
    }

    !NF || /^\/\*/ { next }
    { gsub(/^[[:space:]]+|[[:space:]]+$/,"") }

    match($0,/[[:space:]]job_type:/) {
        if ( jobNr++ ) {
            prt()
            delete tag2val
        }

        # save "insert_job" value
        tag = substr($1,1,length($1)-1)
        val = substr($0,length($1)+1,RSTART-(length($1)+2))
        gsub(/^[[:space:]]+|[[:space:]]+$/,"",val)
        tag2val[tag] = val

        # update $0 to start with "job_type" to look like all other input
        $0 = substr($0,RSTART+1)
    }

    {
        tag = val = $0
        sub(/:.*/,"",tag)
        sub(/[^:]+:[[:space:]]*/,"",val)
        tag2val[tag] = val
    }

    END { prt() }

    function prt(    tagNr,tag,val) {
        for ( tagNr=1; tagNr<=numTags; tagNr++ ) {
            tag = tags[tagNr]
            val = tag2val[tag]
            printf "\"%s\"%s", val, (tagNr<numTags ? OFS : ORS)
        }
    }
' > $FILENAME.csv

以下是测试数据:


$ cat test.txt
/* ----------------- Test_A ----------------- */

insert_job: Test_A  job_type: CMD
command: sleep 3000
machine: machine1
owner: user1
permission:
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: “06:00,08:00,10:00,12:00,14:00,16:00”
term_run_time: 1
alarm_if_fail: 1
alarm_if_terminated: 1


/* ----------------- Test_B ----------------- */

insert_job: Test_B    job_type: CMD
command: echo
machine: machine2
owner: user2
permission:
date_conditions: 0
description: "Test"
std_out_file: "/tmp/$AUTO_JOB_NAME.$AUTORUN.out"
std_err_file: "/tmp/$AUTO_JOB_NAME.$AUTORUN.err"
max_run_alarm: 1
alarm_if_fail: 0
alarm_if_terminated: 0
send_notification: 1



/* ----------------- Test_c ----------------- */

insert_job: Test_c   job_type: CMD
command: sleep 10
machine: machine3
owner: user3
permission:
date_conditions: 0
alarm_if_fail: 0
alarm_if_terminated: 0


/* ----------------- Test_d ----------------- */

insert_job: Test_d   job_type: CMD
command: ls
machine: machine4
owner: user4
permission:
date_conditions: 0
alarm_if_fail: 1
alarm_if_terminated: 1



我正在执行脚本,得到以下结果:


"insert_job","job_type","box_name","command","machine","owner","date_conditions","condition","run_calendar","exclude_calendar","days_of_week","run_window","start_times","start_mins","resources","profile","term_run_time","watch_file","watch_interval"
"Test_A","CMD","","sleep 3000","machine1","user1","1","","","","mo,tu,we,th,fr","",""06:00,08:00,10:00,12:00,14:00,16:00"","","","","1","",""
"Test_B","CMD","","echo","machine2","user2","0","","","","","","","","","","","",""
"Test_c","CMD","","sleep 10","machine3","user3","0","","","","","","","","","","","",""
"Test_d","CMD","","ls","machine4","user4","0","","","","","","","","","","","",""

如果您看到输出,我会在

start_times
列下得到额外的“”。

我正在寻找的输出是:


"insert_job","job_type","box_name","command","machine","owner","date_conditions","condition","run_calendar","exclude_calendar","days_of_week","run_window","start_times","start_mins","resources","profile","term_run_time","watch_file","watch_interval"
"Test_A","CMD","","sleep 3000","machine1","user1","1","","","","mo,tu,we,th,fr","","06:00,08:00,10:00,12:00,14:00,16:00","","","","1","",""
"Test_B","CMD","","echo","machine2","user2","0","","","","","","","","","","","",""
"Test_c","CMD","","sleep 10","machine3","user3","0","","","","","","","","","","","",""
"Test_d","CMD","","ls","machine4","user4","0","","","","","","","","","","","",""

我需要对脚本进行哪些更改才能补偿输入数据中的额外“”

linux bash shell awk scripting
1个回答
0
投票

我需要对脚本进行哪些更改才能补偿 输入数据中额外的“”

你在做

printf "\"%s\"%s", val, (tagNr<numTags ? OFS : ORS)

如果

val
已经有前导和尾随
"
,那么您将额外添加
"
,从而获得
""06:00,08:00,10:00,12:00,14:00,16:00""
,以避免您可能会子串
val
(如果它以
"
开头并以
"
结尾)。设
file.txt
内容为

something_without_quotes
"something_in_quotes"
doubled""quote

然后

awk '{val=$1;if(val~/^".*"$/){val=substr(val,2,length(val)-2)};printf "\"%s\"\n",val}' file.txt

提供输出

"something_without_quotes"
"something_in_quotes"
"doubled""quote"

说明:如果

val
中的值以
"
开头并以
"
结尾,我使用
substr
字符串函数
通过从第二个字符开始并获取长度等于 的长度的子字符串来删除第一个和最后一个字符原始字符串减去 2。对于每一行,我将
printf
值括在
"
中。

(在 GNU Awk 5.1.0 中测试)

© www.soinside.com 2019 - 2024. All rights reserved.