I'm using Python to convert a text input file to JSON.
My code:
import json
import re

filename = "text.txt"
text = {}
pattern = re.compile(r'\s*([^=\t]+)\s*=\s*(.*)')
with open(filename, encoding='utf8') as file:
    for line in file:
        match = pattern.match(line.strip())
        if match:
            key, value = match.groups()
            text[key] = value
        else:
            key_value = line.strip().rsplit(maxsplit=1)
            if len(key_value) == 2:
                key, value = key_value
                text[key] = value
with open("output.json", "w", encoding='utf-8') as output_file:
    json.dump(text, output_file, indent=4, ensure_ascii=False, sort_keys=False)
I'm using a regular expression to do this. I give the following as input:
I_KNO_DR=456
I_ff_DD=567
hello 23
hello world 34
Y=hi /// rtz 77
The current output is:
{
    "I_KNO_DR": "456",
    "I_ff_DD": "567",
    "hello": "23",
    "hello world": "34",
    "Y": "hi /// rtz 77"
}
But the expected output is:
{
    "I_KNO_DR": "456",
    "I_ff_DD": "567",
    "hello": "23",
    "hello world": "34",
    "Y=hi /// rtz": "77"
}
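The last line of the input produces the wrong output. How can I get the correct output, and what mistake did I make in my current code? Please also suggest any improvements I should make.
Thanks.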
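I changed the regex to:
(.+)(?:[ \t]*=[ \t]*|[ \t]+)(.+)
This way you match the last part after = or whitespace (Regex101). Because the first (.+) is greedy, the split happens at the last = or run of spaces/tabs on the line rather than the first.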
import re
input_string = """\
I_KNO_DR=456
I_ff_DD=567
hello 23
hello world 34
Y=hi /// rtz 77"""
out = dict(re.findall(r"(.+)(?:[ \t]*=[ \t]*|[ \t]+)(.+)", input_string))
print(out)
Prints:
{'I_KNO_DR': '456', 'I_ff_DD': '567', 'hello': '23', 'hello world': '34', 'Y=hi /// rtz': '77'}
Try this:
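import json
import re

filename = "text.txt"
text = {}
# Lazy key, optional "=" separator, then the trailing digits as the value.
# Note: this only handles lines whose value is numeric.
pattern = re.compile(r'^(.*?)\s*=?\s*(\d+)$')
with open(filename, encoding='utf8') as file:
    for line in file:
        match = pattern.match(line.strip())
        if match:  # skip lines that do not end in digits
            key, value = match.groups()
            text[key] = value
with open("output.json", "w", encoding='utf-8') as output_file:
    json.dump(text, output_file, indent=4, ensure_ascii=False, sort_keys=False)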
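If you would rather avoid a regex altogether, the fallback branch in the question's code (`rsplit(maxsplit=1)`) already does most of the work. A rough sketch, assuming there are no spaces around `=` as in the sample input, and using a hypothetical helper name `split_line`: try the last whitespace first, and only fall back to the last `=` when the line contains no whitespace.

```python
def split_line(line):
    """Hypothetical helper: split at the last whitespace, else at the last "="."""
    stripped = line.strip()
    parts = stripped.rsplit(maxsplit=1)
    if len(parts) == 2:
        return parts[0], parts[1]
    # No whitespace on the line: fall back to the last "=".
    key, sep, value = stripped.rpartition("=")
    if sep and key and value:
        return key, value
    return None  # no recognisable separator


lines = [
    "I_KNO_DR=456",
    "I_ff_DD=567",
    "hello 23",
    "hello world 34",
    "Y=hi /// rtz 77",
]
pairs = dict(p for p in map(split_line, lines) if p is not None)
print(pairs)
```

Unlike the `\d+` pattern above, this does not require the value to be numeric, but it does treat any trailing word as the value.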