So I have parsed an XML file with lxml.
from lxml import etree
In [5]: root = etree.parse(fileXml)
In [6]: root
Out[6]: <lxml.etree._ElementTree at 0x7f2fa63ae388>
As you can see, the object contains 115139 records... or at least that is my understanding:
In [21]: len(root.getroot())
Out[21]: 115139
If I take the first one, I do see some of the fields I expect:
In [11]: root.getroot()[0].getchildren()
Out[11]:
[<Element {xmlapi_1.0}receivedOctetsPeriodic at 0x7f2fa815df88>,
<Element {xmlapi_1.0}transmittedOctetsPeriodic at 0x7f2fa5b99508>,
<Element {xmlapi_1.0}inputSpeed at 0x7f2fa5b994c8>,
<Element {xmlapi_1.0}outputSpeed at 0x7f2fa5b99408>,
<Element {xmlapi_1.0}timeCaptured at 0x7f2fa5b99348>,
<Element {xmlapi_1.0}periodicTime at 0x7f2fa5b99148>,
<Element {xmlapi_1.0}displayedName at 0x7f2fa5b99088>,
<Element {xmlapi_1.0}monitoredObjectSiteName at 0x7f2fa5b94f48>]
How do I retrieve a field such as displayedName?
I tried attrib.get, for example, but without success:
In [35]: root.getroot()[0].attrib.get('displayedName').text
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-35-653606973e5d> in <module>
----> 1 root.getroot()[0].attrib.get('displayedName').text
AttributeError: 'NoneType' object has no attribute 'text'
Here is an excerpt of the file I am trying to parse:
<findToFileResponse xmlns="xmlapi_1.0">
<equipment.SystemCpuMonStats>
<tmnxSysCpuMonCpuIdle>78.38</tmnxSysCpuMonCpuIdle>
<tmnxSysCpuMonBusyCoreUtil>21.61</tmnxSysCpuMonBusyCoreUtil>
<timeCaptured>1587078916040</timeCaptured>
<children-Set/>
</equipment.SystemCpuMonStats>
Thanks!
For example, using this XML: https://www.w3schools.com/xml/simple.xml
# Assumes etree is already imported, as in the question.
root = etree.parse("path")
for food in root.findall('food'):
    name = food.find('name')
    price = food.find('price')
    calories = food.find('calories')
    # calories is stored as the Element itself; use calories.text for its string value
    table = {'name': name.text, 'price': price.text, 'calories': calories}
    # Here we change the name in the XML to "hi" (table already holds the original text)
    name.text = "hi"
    print(table)
# Here we save the new XML
root.write("path.xml")
As output, we get:
{'name': 'Belgian Waffles', 'price': '$5.95', 'calories': <Element 'calories' at 0x000002704243A368>}
{'name': 'Strawberry Belgian Waffles', 'price': '$7.95', 'calories': <Element 'calories' at 0x000002704243A548>}
{'name': 'Berry-Berry Belgian Waffles', 'price': '$8.95', 'calories': <Element 'calories' at 0x000002704243A728>}
{'name': 'French Toast', 'price': '$4.50', 'calories': <Element 'calories' at 0x000002704243A8B8>}
{'name': 'Homestyle Breakfast', 'price': '$6.95', 'calories': <Element 'calories' at 0x000002704243AA98>}
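One thing to watch for with the file in the question: displayedName shows up there as a child element, not an attribute, which is why attrib.get returns None. The tags are also in the xmlapi_1.0 namespace (they print as {xmlapi_1.0}displayedName), so a plain find('displayedName') would probably come back empty as well. Below is a minimal sketch of a namespace-aware lookup. The sample XML is made up for illustration (the record tag sample.Record and the value port-1 are placeholders, not taken from your file), and the prefix "x" is an arbitrary choice.

```python
# Prefer lxml (used in the question); fall back to the standard library,
# whose ElementTree API supports the same find()/findtext() calls.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Made-up sample shaped like the question's data; the record tag
# "sample.Record" and the values are placeholders, not real data.
SAMPLE = """<findToFileResponse xmlns="xmlapi_1.0">
  <sample.Record>
    <timeCaptured>1587078916040</timeCaptured>
    <displayedName>port-1</displayedName>
  </sample.Record>
  <sample.Record>
    <timeCaptured>1587078916041</timeCaptured>
  </sample.Record>
</findToFileResponse>"""

# The prefix "x" is arbitrary; the URI is the xmlns value from the file.
NS = {"x": "xmlapi_1.0"}

root = etree.fromstring(SAMPLE)
first = root[0]

# displayedName is a child element, not an attribute, so this is None.
attr_value = first.attrib.get("displayedName")

# Look the child up by its namespace-qualified name instead.
name_prefixed = first.findtext("x:displayedName", namespaces=NS)
name_clark = first.findtext("{xmlapi_1.0}displayedName")

# Collect the field for every record; records without it yield None.
all_names = [rec.findtext("x:displayedName", namespaces=NS) for rec in root]

print(attr_value, name_prefixed, name_clark, all_names)
```

With a tree loaded through etree.parse, as in the question, the same calls should work on root.getroot()[0]. findtext returns None when the child is missing, so it avoids the AttributeError you would get from calling .text on a failed find.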
I hope this helps!