ElementTree: extracting attribute values with findall

Question

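I want to extract the value of the name attribute from each inter tag and, where a groups attribute is present, append each group name to that name. I tried to do this with xml.etree.ElementTree, but my code does not produce the expected output.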

Input XML

<abtshop>
    <dDirectory>dub</dDirectory>
    <S>statusd</S>
    <work>worklogs</work>
    <custs>
        <cust>nim-us</cust>
    </custs>

    <mileage>999</mileage>

    <defaults>
        <default type="mercley">
            <user>dairy</user>
            <exec>slm.sh</exec>
            <env>
                <var name="SAN_HOME">youyou-11</var>
            </env>
        </default>
    </defaults>
    <inters>
        <inter name="nim_turk" first-day="20230301" historical="20220103" market="multi">
            <works>
                <work kind="obopay" run="jbs">
                    <args>
                        <arg name="distance">180000</arg>
                    </args>
                </work>
                <work kind="silkb" run="jbs">
                    <args>
                        <arg name="distance">180000</arg>
                    </args>
                </work>
            </works>
        </inter>
        <inter name="nim_us_m" first-day="20230301" historical="20220103" market="lone">
            <works>
                <work kind="obopay" run="jbs" groups="groupA,groupB">
                    <args>
                        <arg name="distance">120000</arg>
                        <arg name="jbsopt">xmas_size=1200000</arg>
                        <arg name="jbsopt">of_obopaying_threads=2</arg>
                    </args>
                </work>
                <work kind="silkb" run="jbs" groups="groupA,groupB">
                    <args>
                        <arg name="distance">120000</arg>
                        <arg name="jbsopt">xmas_size=1200000</arg>
                    </args>
                </work>
            </works>
        </inter>
    </inters>
</abtshop>

This is the code I tried in order to extract the values:

tree=ET.parse('test.xml')
root = tree.getroot()
xm_subs=[]
for subn in root.findall(".//inter/works/work[@run='jbs'][@kind='obopay']/../.."):
        sname=subn.attrib["name"]
        for subg in root.findall(".//inter[@name='%s']/jobs/job[@run='jbs'][@kind='obopay'][@groups]" % sname):
                        groups=subg.attrib['groups']
                        for gname in groups.split(","):
                                sub_name=subn.attrib["name"] + "-" + gname
                                xm_subs.append(sub_name)

        else:
                print sname
                print subn.attrib["name"]
                xm_subs.append(subn.attrib["name"])
return xm_subs

Desired output

['nim_turk','nim_turk-groupA','nim_turk-groupB']
Answer

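The second findall can be run from the element that was already found (subn) instead of from the root. In addition, the [@groups] predicate has been moved into the first path expression, so only inter elements whose matching work carries a groups attribute are selected.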
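To see the relative search in isolation, here is a minimal sketch. It assumes a cut-down inline copy of the question's XML (not the full file) and compares a findall started at the matched inter element with the equivalent absolute search from the root. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for the question's XML (an assumption, not the full file).
XML = """
<abtshop>
  <inters>
    <inter name="nim_turk">
      <works>
        <work kind="obopay" run="jbs"/>
      </works>
    </inter>
    <inter name="nim_us_m">
      <works>
        <work kind="obopay" run="jbs" groups="groupA,groupB"/>
        <work kind="silkb" run="jbs" groups="groupA,groupB"/>
      </works>
    </inter>
  </inters>
</abtshop>
"""

root = ET.fromstring(XML)

# Step up from each matching <work> to its <inter> grandparent.
inters = root.findall(".//inter/works/work[@kind='obopay'][@groups]/../..")
names = [inter.get("name") for inter in inters]

# Relative search: the path starts at the <inter> element itself,
# so no [@name='...'] predicate is needed to locate it again from the root.
inter = inters[0]
relative = inter.findall("./works/work[@kind='obopay']")

# Equivalent absolute search from the root, for comparison.
absolute = root.findall(
    ".//inter[@name='%s']/works/work[@kind='obopay']" % inter.get("name")
)

print(names)
print([w.get("groups") for w in relative])
print(relative == absolute)
```

Both searches return the same work element. The relative form simply avoids rebuilding a path that depends on the name attribute.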

import xml.etree.ElementTree as ET

tree = ET.parse('tmp2')
root = tree.getroot()
xm_subs = []
# Select each <inter> that contains a jbs/obopay <work> carrying a groups attribute.
for subn in root.findall(".//inter/works/work[@run='jbs'][@kind='obopay'][@groups]/../.."):
    sname = subn.attrib["name"]
    # Search relative to the matched <inter> instead of starting from the root again.
    for subg in subn.findall("./works/work[@run='jbs'][@kind='obopay']"):
        groups = subg.attrib['groups']
        for gname in groups.split(","):
            xm_subs.append(sname + "-" + gname)
    else:
        # A for/else branch runs whenever the loop ends without break,
        # so the bare name is always appended after the group names.
        print(sname)
        xm_subs.append(sname)
print(xm_subs)

Result (the final list)

['nim_us_m-groupA', 'nim_us_m-groupB', 'nim_us_m']
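
In Python, the else branch of a for loop runs whenever the loop finishes without break, which is why the bare name nim_us_m appears at the end of the list above even though groups were found. If the intent is instead "name-group when groups exist, bare name otherwise", a plain if/else inside the loop expresses that directly. The following is only a sketch under that assumed reading of the question. The helper name collect_names and the cut-down inline XML are hypothetical, not part of the original post.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for the question's XML (an assumption, not the full file).
XML = """
<abtshop>
  <inters>
    <inter name="nim_turk">
      <works>
        <work kind="obopay" run="jbs"/>
        <work kind="silkb" run="jbs"/>
      </works>
    </inter>
    <inter name="nim_us_m">
      <works>
        <work kind="obopay" run="jbs" groups="groupA,groupB"/>
        <work kind="silkb" run="jbs" groups="groupA,groupB"/>
      </works>
    </inter>
  </inters>
</abtshop>
"""


def collect_names(root, kind="obopay", run="jbs"):
    """Hypothetical helper: one 'name-group' entry per group, or the bare
    inter name when the matching <work> has no groups attribute."""
    names = []
    path = "./works/work[@run='%s'][@kind='%s']" % (run, kind)
    for inter in root.iter("inter"):
        name = inter.get("name")
        for work in inter.findall(path):
            groups = work.get("groups")  # None when the attribute is absent
            if groups:
                names.extend("%s-%s" % (name, g.strip()) for g in groups.split(","))
            else:
                names.append(name)
    return names


root = ET.fromstring(XML)
result = collect_names(root)
print(result)  # ['nim_turk', 'nim_us_m-groupA', 'nim_us_m-groupB']
```

Using work.get("groups") rather than work.attrib['groups'] avoids a KeyError for work elements that lack the attribute. If an inter contained several matching work elements, this sketch would add one entry (or set of entries) per element.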