Convert questions in a text file to JSON format using a shell script


I have a text file with the following content:

//input.txt
1) What is the first month of the year?
a) March
b) February
c) January
d) December
Answer: c) January

2) What is the last month of the year?
a) July
b) December
c) August
d) May
Answer: b) December

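I want to write a shell script that loops through this file, input.txt (which contains more entries in the same format), and produces output similar to the following JSON: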

[
 {
  "question": "What is the first month of the year?",
  "a": "March",
  "b": "February",
  "c": "January",
  "d": "December",
  "answer": "January",
 },
 {
  "question": "What is the last month of the year?",
  "a": "July",
  "b": "December",
  "c": "August",
  "d": "May",
  "answer": "December",
 },
]

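I started by trying to write a bash script that loops through the file, puts each group of lines separated by a blank line inside curly braces, wraps each item in quotes, and separates the items with commas, but it isn't working: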

#!/bin/bash

output=""

while read line; do
  if [ -z "$line" ]; then
    output+="}\n"
  else
    output+="\"${line}\","
    if [ $(echo "$output" | tail -n 1) == "" ]; then
      output+="{"
    fi
  fi
done < input.txt

output+="}"

echo "$output" > output.txt

json bash
2 Answers
1 vote

You will wear yourself out trying to generate correct JSON with Bash.

First, your example JSON output is not valid JSON. Arrays and maps do not support a trailing comma (,). So your example needs to be:

[{
        "question": "What is the first month of the year?",
        "a": "March",
        "b": "February",
        "c": "January",
        "d": "December",
        "answer": "January"
    },
    {
        "question": "What is the last month of the year?",
        "a": "July",
        "b": "December",
        "c": "August",
        "d": "May",
        "answer": "December"
    }
]

(Note that there is no , after each "answer" entry or after the last }. You can check for valid JSON with a tool such as jsonlint.)
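The trailing-comma point can also be checked locally. The following is a minimal sketch, assuming Ruby with its bundled json library and its default parser options; the helper name valid_json? is made up for this illustration and is not part of any library.

```ruby
require "json"

# Hypothetical helper: returns true when the text parses as JSON, false otherwise.
def valid_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

# Same shape as the question's output, with a trailing comma after "answer".
with_trailing_comma = '[{"question": "Q?", "answer": "January",}]'
# The corrected form without the trailing comma.
without_trailing_comma = '[{"question": "Q?", "answer": "January"}]'

puts valid_json?(with_trailing_comma)
puts valid_json?(without_trailing_comma)
```

With the default parser settings, only the second string should be accepted.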

There are many JSON generator tools that can produce this from your input. For me, the easiest is Ruby:

ruby -00 -r json -ne '
BEGIN{out=[]}
sub(/\A\d+\)\s+/,"question)")
out << $_.split(/\R/).map{|l| l.split(/[):]\s*/,2)}.to_h
END{puts JSON.pretty_generate(out)}' file 

Prints:

[
  {
    "question": "What is the first month of the year?",
    "a": "March",
    "b": "February",
    "c": "January",
    "d": "December",
    "Answer": "c) January"
  },
  {
    "question": "What is the last month of the year?",
    "a": "July",
    "b": "December",
    "c": "August",
    "d": "May",
    "Answer": "b) December"
  }
]
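The output above keeps the key as "Answer" and the value as "c) January", while the question asks for a lowercase "answer" key holding only the answer text. The following is a sketch of one way to close that gap as a standalone script rather than a one-liner. It is not the answerer's code: the method name parse_quiz is hypothetical, and it assumes single-letter option labels and blocks separated by blank lines, as in input.txt.

```ruby
require "json"

# Hypothetical helper: turns quiz text (blocks separated by blank lines)
# into an array of hashes, one hash per question.
def parse_quiz(text)
  text.strip.split(/\n\s*\n/).map do |block|
    block.lines(chomp: true).to_h do |line|
      case line
      when /\A\d+\)\s*(.*)\z/
        ["question", $1]               # "1) What is ...?" -> question text
      when /\AAnswer:\s*(?:[a-z]\)\s*)?(.*)\z/i
        ["answer", $1]                 # "Answer: c) January" -> "January"
      when /\A([a-z])\)\s*(.*)\z/
        [$1, $2]                       # "a) March" -> key "a", value "March"
      else
        raise ArgumentError, "unrecognized line: #{line.inspect}"
      end
    end
  end
end

input = <<~TEXT
  1) What is the first month of the year?
  a) March
  b) February
  c) January
  d) December
  Answer: c) January

  2) What is the last month of the year?
  a) July
  b) December
  c) August
  d) May
  Answer: b) December
TEXT

puts JSON.pretty_generate(parse_quiz(input))
```

To run it against the real file, the heredoc could be replaced with File.read("input.txt"). Because the JSON text comes from JSON.pretty_generate rather than string concatenation, quoting and comma placement are handled by the library.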

1 vote

Here is one approach using jq, run with the -R and -s flags (raw input, slurped into a single string):

sub("\n*$"; "") |
split("\n\n") | map(
  split("\n") | map(split(") ")) | [
    {question: .[0][1]},
    (.[1:-1][] | {(.[0]): .[1]}),
    {answer: .[-1][1]}
  ] | add
)

