Parsing XML with Python

XML file

<?xml version="1.0"?>
 <productListing title="Python Products">
  <product id="1">
   <name>Python Hoodie</name>
   <description>This is a Hoodie</description>
   <cost>$49.99</cost>
   <shipping>$2.00</shipping>
  </product>
  <product id="2">
   <name>Python shirt</name>
   <description>This is a shirt</description>
   <cost>$79.99</cost>
   <shipping>$4.00</shipping>
  </product> 
  <product id="3">
   <name>Python cap</name>
   <description>This is a cap</description>
   <cost>$99.99</cost>
   <shipping>$3.00</shipping>
  </product> 
</productListing>

import xml.etree.ElementTree as et
import pandas as pd
import numpy as np

First I import all the libraries.

tree = et.parse("documents/pythonstore.xml")

The XML file is stored in the documents directory.

root = tree.getroot()
for a in range(3):
    for b in range(4):
        new = root[a][b].text
        print(new)

This prints the text of every child element of each product in the XML.

df=pd.DataFrame(columns=['name','description','cost','shipping'])

I created an empty DataFrame to store the values of all the child elements.

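My questions:

  • How do I collect the values assigned to the variable new into a list? I tried append and list(), but both failed.
  • How do I use a for loop to load the child elements into the DataFrame?

Can anyone help? Thanks a lot!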

python xml
1 Answer

This may help.

# -*- coding: utf-8 -*-
s = """<?xml version="1.0"?>
 <productListing title="Python Products">
  <product id="1">
   <name>Python Hoodie</name>
   <description>This is a Hoodie</description>
   <cost>$49.99</cost>
   <shipping>$2.00</shipping>
  </product>
  <product id="2">
   <name>Python shirt</name>
   <description>This is a shirt</description>
   <cost>$79.99</cost>
   <shipping>$4.00</shipping>
  </product> 
  <product id="3">
   <name>Python cap</name>
   <description>This is a cap</description>
   <cost>$99.99</cost>
   <shipping>$3.00</shipping>
  </product> 
</productListing>"""

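import xml.etree.ElementTree as et
import pandas as pd  # needed for pd.DataFrame below

# fromstring() returns the root element directly, so no getroot() call is needed
root = et.fromstring(s)
res = []
for a in range(3):        # one pass per <product>
    r = []
    for b in range(4):    # name, description, cost, shipping
        new = root[a][b].text
        r.append(new)     # collect the four values of this product
    res.append(r)         # one inner list per product

print(res)
df = pd.DataFrame(res, columns=['name', 'description', 'cost', 'shipping'])
print(df)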

Output:

[['Python Hoodie', 'This is a Hoodie', '$49.99', '$2.00'], ['Python shirt', 'This is a shirt', '$79.99', '$4.00'], ['Python cap', 'This is a cap', '$99.99', '$3.00']]

            name       description    cost shipping
0  Python Hoodie  This is a Hoodie  $49.99    $2.00
1   Python shirt   This is a shirt  $79.99    $4.00
2     Python cap     This is a cap  $99.99    $3.00
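
The loops above hard-code three products with four children each. As a hedged variation, the sketch below walks whatever product elements are present and builds one dict per product, keyed by child tag. The names XML_TEXT and products_to_rows are made up for this illustration, and the XML is a shortened copy of the sample from the question.

```python
import xml.etree.ElementTree as et

# Shortened copy of the XML from the question, so the sketch is self-contained.
XML_TEXT = """<?xml version="1.0"?>
<productListing title="Python Products">
  <product id="1">
    <name>Python Hoodie</name>
    <description>This is a Hoodie</description>
    <cost>$49.99</cost>
    <shipping>$2.00</shipping>
  </product>
  <product id="2">
    <name>Python shirt</name>
    <description>This is a shirt</description>
    <cost>$79.99</cost>
    <shipping>$4.00</shipping>
  </product>
</productListing>"""


def products_to_rows(root):
    """Return one dict per <product>, keyed by child tag (hypothetical helper)."""
    rows = []
    for product in root.findall("product"):
        # Each child element becomes one entry: tag name -> text content.
        rows.append({child.tag: child.text for child in product})
    return rows


rows = products_to_rows(et.fromstring(XML_TEXT))
for row in rows:
    print(row)
```

Since pandas accepts a list of dicts, passing rows to pd.DataFrame(rows) should give the same name, description, cost and shipping columns as the table above, without fixing the number of products in advance.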