I have an XML file, area.xml:
<area>
<controls>
<internal>yes</internal>
</controls>
<schools>
<school id="001"/>
<time>2020-05-18T14:21:00Z</time>
<venture index="5">
<venture>
<basicData type="class">
<wage numberOfDollars="13" Correction="4.61">
<tax>70</tax>
</wage>
</basicData>
</venture>
</venture>
<venture index="9">
<venture>
<basicData type="class">
<wage numberOfDollars="13" Correction="5.61">
<tax>70</tax>
</wage>
</basicData>
</venture>
</venture>
<school id="056"/>
<time>2020-05-18T14:21:00Z</time>
<venture index="5">
<venture>
<basicData type="class">
<wage numberOfDollars="13">
<tax>70</tax>
</wage>
</basicData>
</venture>
</venture>
<venture index="9">
<venture>
<basicData type="class">
<wage numberOfDollars="13">
<tax>70</tax>
</wage>
</basicData>
</venture>
</venture>
</schools>
</area>
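What I am trying to achieve with Python: a school node is followed by multiple wage nodes (leaves). If one or more of those wage nodes has an attribute named "Correction", I want to get the id attribute value of the school node.
So the result of my script should be: 001, because this is the only school with a Correction attribute on one of its wage nodes.
First I tried with ElementTree: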
import xml.etree.ElementTree as ET

data_file = 'area.xml'
tree = ET.parse(data_file)
root = tree.getroot()
t1 = "school"
t2 = "wage"
for e1, e2 in zip(root.iter(t1), root.iter(t2)):
    if hasattr(e2, 'Correction'):
        e2.Correction
        print(e1.attrib['id'])
But that didn't work. Now I am trying to reach my goal with minidom, but I find it difficult.
This is my code so far:
from xml.dom import minidom

doc = minidom.parse("area.xml")
staffs = doc.getElementsByTagName("wage")
for wage in staffs:
    sid = wage.getAttribute("Correction")
    print("wage:%s" % (sid))
The output lists the value of the Correction attribute for every wage node, including the ones that do not have it:
wage:4.61
wage:5.61
wage:
wage:
That is obviously not what I want.
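A note on the empty values: minidom's `getAttribute` returns an empty string when the attribute is absent, so it cannot tell "missing" from "present", whereas `hasAttribute` can. The sketch below shows the difference on a two-element inline document; the `<wages>` wrapper is an assumption made only so the snippet is self-contained.

```python
from xml.dom import minidom

# Tiny inline document (hypothetical wrapper element, not part of area.xml).
doc = minidom.parseString(
    '<wages>'
    '<wage numberOfDollars="13" Correction="4.61"/>'
    '<wage numberOfDollars="13"/>'
    '</wages>'
)

wages = doc.getElementsByTagName("wage")

# hasAttribute reports presence; getAttribute yields "" when the attribute is missing.
flags = [w.hasAttribute("Correction") for w in wages]
values = [w.getAttribute("Correction") for w in wages]

print(flags)   # [True, False]
print(values)  # ['4.61', '']
```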
I could use some help to get going in the right direction.
I am using Python 3.
Thanks in advance.
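One possible approach with ElementTree, offered as a sketch rather than a definitive answer. Two things appear to go wrong in the first attempt: `hasattr(e2, 'Correction')` tests for a Python attribute on the Element object, while XML attributes live in `e2.attrib`; and `zip` pairs the n-th school with the n-th wage, which does not reflect which wage belongs to which school. Because each `<school/>` is an empty element and its ventures are following siblings, the sketch walks the children of `<schools>` in document order and remembers the most recent school. `SAMPLE` is a shortened stand-in for area.xml, and the function name `schools_with_correction` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for area.xml (assumption: same sibling layout as the question).
SAMPLE = """
<area>
  <schools>
    <school id="001"/>
    <venture index="5">
      <venture><basicData type="class">
        <wage numberOfDollars="13" Correction="4.61"><tax>70</tax></wage>
      </basicData></venture>
    </venture>
    <school id="056"/>
    <venture index="5">
      <venture><basicData type="class">
        <wage numberOfDollars="13"><tax>70</tax></wage>
      </basicData></venture>
    </venture>
  </schools>
</area>
"""


def schools_with_correction(root):
    """Return ids of schools followed by at least one wage with a Correction attribute."""
    found = []
    current_id = None
    for child in root.find("schools"):
        if child.tag == "school":
            # Siblings after this element belong to this school.
            current_id = child.get("id")
        elif current_id is not None and current_id not in found:
            # iter() also visits nested elements, so deeply nested wages are checked.
            if any("Correction" in wage.attrib for wage in child.iter("wage")):
                found.append(current_id)
    return found


root = ET.fromstring(SAMPLE)  # for the real file: ET.parse("area.xml").getroot()
print(schools_with_correction(root))  # ['001']
```

Tracking the current school while iterating avoids any assumption about how many ventures or wages follow each school.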
from simplified_scrapy import SimplifiedDoc, req, utils

html = utils.getFileContent("area.xml")
doc = SimplifiedDoc(html)
schools = doc.selects('school')  # Get all schools
n = len(schools)
i = 0
while i < n - 1:
    school = schools[i]
    school1 = schools[i + 1]
    h = doc.html[school._end:school1._start]  # Get data between two schools
    staffs = doc.getElementsByReg(' Correction="', tag='wage', html=h)
    if staffs:
        print(school.id, staffs.Correction)
    i += 1
# The last school has no following school, so take everything after it
last = schools[n - 1]
h = doc.html[last._end:]
staffs = doc.getElementsByReg(' Correction="', tag='wage', html=h)
if staffs:
    print(last.id, staffs.Correction)
Result: