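I have a CSV file that I am trying to convert to XML using Python. The current logic builds one XML block for each row in the CSV. I would like to nest certain elements in the XML instead. For example, my CSV looks like: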
id  addr_type  addr_line1  addr_line2  city  country
1   Home       123 Street  GR Avenue   NJ    USA
1   Office     431 Street  PO Towers   NY    USA
The XML currently being generated:
<address>
  <id>1</id>
  <addr_type>Home</addr_type>
  <addr_line1>123 Street</addr_line1>
  <addr_line2>GR Avenue</addr_line2>
  <city>NJ</city>
  <country>USA</country>
</address>
<address>
  <id>1</id>
  <addr_type>Office</addr_type>
  <addr_line1>431 Street</addr_line1>
  <addr_line2>PO Towers</addr_line2>
  <city>NY</city>
  <country>USA</country>
</address>
Expected output:
<id>1</id>
<address>
  <addr>
    <addr_type>Home</addr_type>
    <addr_line1>123 Street</addr_line1>
    <addr_line2>GR Avenue</addr_line2>
    <city>NJ</city>
    <country>USA</country>
  </addr>
  <addr>
    <addr_type>Office</addr_type>
    <addr_line1>431 Street</addr_line1>
    <addr_line2>PO Towers</addr_line2>
    <city>NY</city>
    <country>USA</country>
  </addr>
</address>
My current Python code:
import csv

filename = 'addr.csv'
output_file = 'output.xml'

def convert_row(headers, row):
    # Build one <row> block with a child element per CSV column.
    s = '<row>\n'
    for header, item in zip(headers, row):
        s += f'    <{header}>{item}</{header}>\n'
    return s + '</row>'

with open(filename, 'r') as f:
    r = csv.reader(f)
    headers = next(r)  # first line holds the column names
    xml = '<Address>\n'
    for row in r:
        xml += convert_row(headers, row) + '\n'
    xml += '</Address>'
    print(xml)
Note: this needs to be done without the Pandas library.
Please help.
Here is an example of how to generate the desired string using only the csv and itertools modules:
import csv
from io import StringIO
from itertools import groupby

csv_text = """\
id,addr_type,addr_line1,addr_line2,city,country
1,Home,123 Street,GR Avenue,NJ,USA
1,Office,431 Street,PO Towers,NY,USA"""

data = csv.reader(StringIO(csv_text))
header = next(data)

# groupby only merges adjacent rows, so sort by id first
data = sorted(list(data), key=lambda row: row[0])

out = []
for _id, g in groupby(data, key=lambda k: k[0]):
    out.append(f"<id>{_id}</id>")
    out.append("<address>")
    for addr in g:
        out.append("<addr>")
        for h, i in zip(header, addr):
            # the id is emitted once per group, not once per address
            if h == "id":
                continue
            out.append(f"\t<{h}>{i}</{h}>")
        out.append("</addr>")
    out.append("</address>")

print("\n".join(out))
Prints:
<id>1</id>
<address>
<addr>
	<addr_type>Home</addr_type>
	<addr_line1>123 Street</addr_line1>
	<addr_line2>GR Avenue</addr_line2>
	<city>NJ</city>
	<country>USA</country>
</addr>
<addr>
	<addr_type>Office</addr_type>
	<addr_line1>431 Street</addr_line1>
	<addr_line2>PO Towers</addr_line2>
	<city>NY</city>
	<country>USA</country>
</addr>
</address>
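Building the XML by string concatenation does not escape characters such as `&` or `<` in the CSV values. As a possible alternative, still without Pandas, the same grouping could be done with the standard library's `xml.etree.ElementTree`, which escapes text when serializing. This is only a sketch under a few assumptions: the `csv_to_xml` helper and the wrapping `<root>` element are hypothetical names (a well-formed XML document needs a single root, which the expected output above does not have), the sample data was changed to include an `&` to show the escaping, and `ET.indent` requires Python 3.9 or later.

```python
import csv
import xml.etree.ElementTree as ET
from io import StringIO
from itertools import groupby

# Same sample as above, except "& Co" was added to show escaping.
csv_text = """\
id,addr_type,addr_line1,addr_line2,city,country
1,Home,123 Street,GR Avenue,NJ,USA
1,Office,431 Street,PO Towers & Co,NY,USA"""


def csv_to_xml(text, root_tag="root"):
    """Group CSV rows by id and nest each row under <address><addr>."""
    rows = list(csv.DictReader(StringIO(text)))
    rows.sort(key=lambda row: row["id"])  # groupby needs sorted input
    root = ET.Element(root_tag)  # hypothetical wrapper element
    for _id, group in groupby(rows, key=lambda row: row["id"]):
        ET.SubElement(root, "id").text = _id
        address = ET.SubElement(root, "address")
        for row in group:
            addr = ET.SubElement(address, "addr")
            for name, value in row.items():
                if name != "id":  # id is written once per group
                    ET.SubElement(addr, name).text = value
    return root


root = csv_to_xml(csv_text)
ET.indent(root)  # pretty-printing only; Python 3.9+
xml_string = ET.tostring(root, encoding="unicode")
print(xml_string)
```

To read from a file instead, the contents of `addr.csv` could be passed in as text, or the function could be adapted to accept an open file object in place of `StringIO(text)`.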