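I want to extract the value of the name attribute from each inter tag and append the group names to that name when a groups attribute is present. I tried doing this with xml.etree.ElementTree, but my code does not produce the expected output.
Input XML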
<abtshop>
  <dDirectory>dub</dDirectory>
  <S>statusd</S>
  <work>worklogs</work>
  <custs>
    <cust>nim-us</cust>
  </custs>
  <mileage>999</mileage>
  <defaults>
    <default type="mercley">
      <user>dairy</user>
      <exec>slm.sh</exec>
      <env>
        <var name="SAN_HOME">youyou-11</var>
      </env>
    </default>
  </defaults>
  <inters>
    <inter name="nim_turk" first-day="20230301" historical="20220103" market="multi">
      <works>
        <work kind="obopay" run="jbs">
          <args>
            <arg name="distance">180000</arg>
          </args>
        </work>
        <work kind="silkb" run="jbs">
          <args>
            <arg name="distance">180000</arg>
          </args>
        </work>
      </works>
    </inter>
    <inter name="nim_us_m" first-day="20230301" historical="20220103" market="lone">
      <works>
        <work kind="obopay" run="jbs" groups="groupA,groupB">
          <args>
            <arg name="distance">120000</arg>
            <arg name="jbsopt">xmas_size=1200000</arg>
            <arg name="jbsopt">of_obopaying_threads=2</arg>
          </args>
        </work>
        <work kind="silkb" run="jbs" groups="groupA,groupB">
          <args>
            <arg name="distance">120000</arg>
            <arg name="jbsopt">xmas_size=1200000</arg>
          </args>
        </work>
      </works>
    </inter>
  </inters>
</abtshop>
I tried the code below to extract the values.
Code tried
tree=ET.parse('test.xml')
root = tree.getroot()
xm_subs=[]
for subn in root.findall(".//inter/works/work[@run='jbs'][@kind='obopay']/../.."):
    sname=subn.attrib["name"]
    for subg in root.findall(".//inter[@name='%s']/jobs/job[@run='jbs'][@kind='obopay'][@groups]" % sname):
        groups=subg.attrib['groups']
        for gname in groups.split(","):
            sub_name=subn.attrib["name"] + "-" + gname
            xm_subs.append(sub_name)
    else:
        print sname
        print subn.attrib["name"]
        xm_subs.append(subn.attrib["name"])
return xm_subs
Desired output
['nim_turk','nim_turk-groupA','nim_turk-groupB']
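The second findall can be run from the element that was already found instead of from the root. In addition, the [@groups] predicate is added to the first path, so only inter elements whose obopay work carries a groups attribute are selected.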
import xml.etree.ElementTree as ET

tree = ET.parse('tmp2')
root = tree.getroot()
xm_subs = []
# Select each <inter> that has a jbs/obopay work with a groups attribute.
for subn in root.findall(".//inter/works/work[@run='jbs'][@kind='obopay'][@groups]/../.."):
    sname = subn.attrib["name"]
    # Search relative to the matched <inter>, not from the root.
    for subg in subn.findall("./works/work[@run='jbs'][@kind='obopay']"):
        groups = subg.attrib['groups']
        for gname in groups.split(","):
            sub_name = subn.attrib["name"] + "-" + gname
            xm_subs.append(sub_name)
    else:
        # No break in the loop above, so this branch always runs.
        print(subn.attrib["name"])
        xm_subs.append(subn.attrib["name"])
print(xm_subs)
Result
['nim_us_m-groupA', 'nim_us_m-groupB', 'nim_us_m']
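The trailing bare 'nim_us_m' in this result comes from the else clause of a for loop that never hits a break, so it runs every time. If the goal is instead one entry per group, and the bare name only when no groups attribute is set, a minimal sketch could look like the following. This rule is an assumption about the intended output, not something the question confirms. The names XML and collect_names are hypothetical, and the XML is a trimmed-down inline copy of the input so the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down copy of the input XML (args and defaults omitted).
XML = """
<abtshop>
  <inters>
    <inter name="nim_turk" market="multi">
      <works>
        <work kind="obopay" run="jbs"/>
        <work kind="silkb" run="jbs"/>
      </works>
    </inter>
    <inter name="nim_us_m" market="lone">
      <works>
        <work kind="obopay" run="jbs" groups="groupA,groupB"/>
        <work kind="silkb" run="jbs" groups="groupA,groupB"/>
      </works>
    </inter>
  </inters>
</abtshop>
"""


def collect_names(root):
    """Return 'name-group' per group, or the bare name when no groups are set.

    Assumption: only the first jbs/obopay work of each <inter> matters.
    """
    names = []
    for inter in root.iter("inter"):
        # Search relative to the current <inter>, not from the root.
        work = inter.find("./works/work[@run='jbs'][@kind='obopay']")
        if work is None:
            continue  # this <inter> has no matching work element
        name = inter.get("name")
        groups = work.get("groups")
        if groups:
            names.extend(f"{name}-{g.strip()}" for g in groups.split(","))
        else:
            names.append(name)
    return names


print(collect_names(ET.fromstring(XML)))
```

Using a plain if/else on the groups attribute avoids the for/else construct, whose else branch is easy to misread as "the loop found nothing".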