Nested XML format from CSV using Python


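I have a file in CSV format that I am trying to convert to XML using Python. The current logic builds one XML block for each row in the CSV. I would like some of the elements to be nested in the XML. For example, my CSV looks like: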

id  addr_type  addr_line1   addr_line2   city   country
1   Home       123 Street   GR Avenue    NJ     USA
1   Office     431 Street   PO Towers    NY     USA

The XML currently being generated:

<address>
<id>1</id>
<addr_type>Home</addr_type>
<addr_line1>123 Street</addr_line1>
<addr_line2>GR Avenue</addr_line2>
<city>NJ</city>
<country>USA</country>
</address>

<address>
<id>1</id>
<addr_type>Office</addr_type>
<addr_line1>431 Street</addr_line1>
<addr_line2>PO Towers</addr_line2>
<city>NY</city>
<country>USA</country>
</address>

Expected output:

<id>1</id>
<address>
<addr>
   <addr_type>Home</addr_type>
   <addr_line1>123 Street</addr_line1>
   <addr_line2>GR Avenue</addr_line2>
   <city>NJ</city>
   <country>USA</country>
</addr>
<addr>
   <addr_type>Office</addr_type>
   <addr_line1>431 Street</addr_line1>
   <addr_line2>PO Towers</addr_line2>
   <city>NY</city>
   <country>USA</country>
</addr>
</address>

My current Python code:

import csv

filename = 'addr.csv'
output_file = 'output.xml'

def convert_row(headers, row):
    # Build one <row> block with a child element per CSV column
    s = '<row>\n'
    for header, item in zip(headers, row):
        s += f'    <{header}>{item}</{header}>\n'
    # Return only after every column has been added
    return s + '</row>'

with open(filename, 'r', newline='') as f:
    r = csv.reader(f)
    headers = next(r)
    xml = '<Address>\n'
    # Read the rows while the file is still open
    for row in r:
        xml += convert_row(headers, row) + '\n'

xml += '</Address>'
print(xml)

Note: this needs to be done without the Pandas library.

Please help.

python xml csv xml-parsing
1 Answer

Here is an example of how you can generate the required string using only the csv / itertools modules:

import csv
from io import StringIO
from itertools import groupby

csv_text = """\
id,addr_type,addr_line1,addr_line2,city,country
1,Home,123 Street,GR Avenue,NJ,USA
1,Office,431 Street,PO Towers,NY,USA"""

data = csv.reader(StringIO(csv_text))
header = next(data)
data = sorted(list(data), key=lambda row: row[0])

out = []
for _id, g in groupby(data, key=lambda k: k[0]):
    out.append(f"<id>{_id}</id>")
    out.append("<address>")
    for addr in g:
        out.append("<addr>")
        for h, i in zip(header, addr):
            if h == "id":
                continue
            out.append(f"\t<{h}>{i}</{h}>")
        out.append("</addr>")
    out.append("</address>")

print("\n".join(out))

Prints:

<id>1</id>
<address>
<addr>
        <addr_type>Home</addr_type>
        <addr_line1>123 Street</addr_line1>
        <addr_line2>GR Avenue</addr_line2>
        <city>NJ</city>
        <country>USA</country>
</addr>
<addr>
        <addr_type>Office</addr_type>
        <addr_line1>431 Street</addr_line1>
        <addr_line2>PO Towers</addr_line2>
        <city>NY</city>
        <country>USA</country>
</addr>
</address>
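
The string-building approach above does not escape characters such as & or < in the CSV values. As a rough sketch of one way to handle that without Pandas, the same grouping can be done with the standard library's xml.etree.ElementTree, which escapes text on serialization. The function name csv_to_nested_xml is hypothetical, the CSV is assumed to be comma-separated with an "id" column, and ET.indent needs Python 3.9 or later:

```python
import csv
import xml.etree.ElementTree as ET
from io import StringIO
from itertools import groupby


def csv_to_nested_xml(csv_text):
    """Group CSV rows by their "id" column and return nested XML as a string.

    Hypothetical helper: assumes comma-separated input with an "id" header.
    """
    rows = list(csv.DictReader(StringIO(csv_text)))
    # groupby only merges adjacent rows, so sort by id first.
    # Ids are compared as strings here, as in the answer above.
    rows.sort(key=lambda row: row["id"])

    chunks = []
    for _id, group in groupby(rows, key=lambda row: row["id"]):
        id_el = ET.Element("id")
        id_el.text = _id
        address_el = ET.Element("address")
        for row in group:
            addr_el = ET.SubElement(address_el, "addr")
            for name, value in row.items():
                if name == "id":
                    continue  # the id is emitted once per group, not per addr
                ET.SubElement(addr_el, name).text = value
        ET.indent(address_el, space="   ")  # pretty-print; Python 3.9+
        chunks.append(ET.tostring(id_el, encoding="unicode"))
        chunks.append(ET.tostring(address_el, encoding="unicode"))
    return "\n".join(chunks)


csv_text = """\
id,addr_type,addr_line1,addr_line2,city,country
1,Home,123 Street,GR Avenue,NJ,USA
1,Office,431 Street,PO Towers,NY,USA"""

print(csv_to_nested_xml(csv_text))
```

To read from a file such as addr.csv instead, the file's contents could be passed in, for example csv_to_nested_xml(open('addr.csv', newline='').read()). Note that, like the expected output in the question, the result has no single root element, so it is an XML fragment rather than a complete XML document.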