Python xml: removing grandchildren or great-grandchildren

Question (votes: 0, answers: 3)

I have been searching Google for a way to remove grandchildren from an XML file, but I have not found a solution that fits. Here is my case:

<tree>
    <category title="Item 1">item 1 text
        <subitem title="subitem1">subitem1 text</subitem>
        <subitem title="subitem2">subitem2 text</subitem>
    </category>

    <category title="Item 2">item 2 text
        <subitem title="subitem21">subitem21 text</subitem>
        <subitem title="subitem22">subitem22 text</subitem>
            <subsubitem title="subsubitem211">subsubitem211 text</subsubitem>
    </category>
</tree>

In some cases I want to remove subitem; in other cases I want to remove subsubitem. For the content given above, I know I can do it like this:

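import xml.etree.ElementTree as ET

# given_content holds the XML string shown above
root = ET.fromstring(given_content)

# case 1: remove the grandchildren of root (the subitem level)
for item in root:
    # iterate over a copy, because removing from a live element skips siblings
    for subitem in list(item):
        item.remove(subitem)

# case 2: remove the great-grandchildren of root (the subsubitem level)
for item in root:
    for subitem in item:
        for subsubitem in list(subitem):
            subitem.remove(subsubitem)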

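I can only write it this way when I know the depth of the target node. If I only know the tag name of the node to delete, how should I implement it? Pseudocode:

import xml.etree.ElementTree as ET

for item in root.iter():
    if item.tag == 'subsubitem' or item.tag == 'subitem':
        pass  # remove item here (this is the part I do not know how to write)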

If I call root.remove(item), it raises an error, because item is not a direct child of root.

Edit: I cannot install any third-party library, so I have to solve this with the standard xml package.

    

python xml removechild
3 Answers

3 votes

I got this done using only the xml lib.


def recursive_xml(root):
    # list(root) copies the children, so removing inside the loop is safe.
    # (Element.getchildren() was removed in Python 3.9.)
    for child in list(root):
        if child.tag == 'subitem' or child.tag == 'subsubitem':
            root.remove(child)
        else:
            recursive_xml(child)

With this, the function walks every node in the tree and removes my target nodes wherever they appear.

import xml.etree.ElementTree as ET

test_xml = r'''
<test>
    <test1>
        <test2>
            <test3>
            </test3>
            <subsubitem>
            </subsubitem>
        </test2>
        <subitem>
        </subitem>
        <nothing_matters>
        </nothing_matters>
    </test1>
</test>
'''
root = ET.fromstring(test_xml)
recursive_xml(root)

Hope this helps anyone who is working under the same restrictions as I am.
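The recursive approach above hard-codes the two tag names. A minimal sketch of the same idea with the tags passed in as a parameter follows; the function name remove_by_tag, its tags argument, and the returned count are illustrative assumptions, not part of the original answer.

```python
import xml.etree.ElementTree as ET


def remove_by_tag(parent, tags):
    """Remove descendants of `parent` whose tag is in `tags`.

    Returns the number of elements detached. Targets nested inside an
    already-removed element go away with it and are not counted again.
    """
    removed = 0
    # list(parent) is a snapshot, so removal does not disturb the loop.
    for child in list(parent):
        if child.tag in tags:
            parent.remove(child)
            removed += 1
        else:
            removed += remove_by_tag(child, tags)
    return removed


sample = (
    "<test><test1><test2><test3/><subsubitem/></test2>"
    "<subitem/><nothing_matters/></test1></test>"
)
root = ET.fromstring(sample)
count = remove_by_tag(root, {"subitem", "subsubitem"})
remaining = [el.tag for el in root.iter()]
```

Passing a set keeps the membership test cheap and lets the same function serve both the "remove subitem" and the "remove subsubitem" cases from the question.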


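1 vote

To remove instances of subsubitem or subitem regardless of their depth, consider the following example (with the caveat that it uses lxml.etree rather than upstream ElementTree):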
import lxml.etree as etree

el = etree.fromstring('<root><item><subitem><subsubitem/></subitem></item></root>')
# The XPath union selects both tags at any depth; getparent() is lxml-specific.
for child in el.xpath('.//subsubitem | .//subitem'):
    child.getparent().remove(child)
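The lxml example relies on getparent(), which the standard-library ElementTree does not provide. Since the question rules out third-party packages, here is a minimal stdlib-only sketch that gets the same effect by building a child-to-parent map first. The names parent_of and targets are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><item><subitem><subsubitem/></subitem></item></root>"
)

# Map every element to its parent before changing anything.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Collect the targets into a list first, so the tree is not modified
# while root.iter() is still walking it.
targets = [el for el in root.iter() if el.tag in {"subitem", "subsubitem"}]

for el in targets:
    parent_of[el].remove(el)

remaining = [el.tag for el in root.iter()]
```

Both steps finish reading the tree before the first removal happens, which sidesteps the modify-while-iterating problem.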



0 votes
https://docs.python.org/3/library/xml.etree.elementtree.html#element-objects

https://docs.python.org/3/library/xml.etree.elementtree.html#modifying-an-xml-file

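Note that modifying the tree while iterating over it can lead to problems, just as it does when you iterate over and modify Python lists or dicts.

This script runs on Python 3.10: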
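To make that warning concrete, here is a small sketch that compares removing children while looping over the live element with looping over a copy. The sample XML and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

xml_text = "<category><subitem/><subitem/><subitem/><subitem/></category>"

# Unsafe: the element shrinks while the loop advances its index,
# so some children are skipped and survive.
unsafe = ET.fromstring(xml_text)
for child in unsafe:
    unsafe.remove(child)
left_unsafe = len(unsafe)

# Safe: list(...) takes a snapshot of the children before removal starts.
safe = ET.fromstring(xml_text)
for child in list(safe):
    safe.remove(child)
left_safe = len(safe)

print(left_unsafe, left_safe)
```

The first loop leaves some subitem elements behind, while the second removes all of them.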

#!/usr/bin/python
import xml.etree.ElementTree as ET


def print_xmltree(root):
    # Serialize to bytes, then decode for printing.
    xmlstr = ET.tostring(root, encoding="utf-8", method="xml")
    print(xmlstr.decode("utf-8"))


def recursive_xml(parent, depth):
    # print(depth * " ", parent.findall('./'))
    # findall() returns a list, so removing children inside the loop is safe.
    for child in parent.findall('./'):
        if child.tag == 'subitem' or child.tag == 'subsubitem':
            parent.remove(child)
        else:
            recursive_xml(child, depth + 1)


xml_data = """<?xml version="1.0" encoding="UTF-8"?>
<tree>
    <category title="Item 1">item 1 text
        <subitem title="subitem1">subitem11 text</subitem>
        <subitem title="subitem2">subitem12 text</subitem>
        <sibetum title="subitem3">subitem13 text</sibetum>
        <subsubitem title="subsubitem1">subsubitem211 text</subsubitem>
    </category>
    <category title="Item 2">item 2 text
        <subitem title="subitem1">subitem21 text</subitem>
        <subitem title="subitem2">subitem22 text</subitem>
        <subsubitem title="subsubitem1">subsubitem211 text</subsubitem>
        <sobsobitem title="subsubitem2">wrong tag</sobsobitem>
    </category>
    <category title="Item 3">item 3 text
    </category>
</tree>"""

# root = ET.parse('test.xml').getroot()  # from file
root = ET.fromstring(xml_data)  # from variable

recursive_xml(root, 0)
print_xmltree(root)

# Note that sobsobitem was forced up in hierarchy and that the parent tag
# for subsubitem did not matter (sibetum).
wait = input("Press Enter to Exit.")

This outputs:

<tree>
    <category title="Item 1">item 1 text
        <sibetum title="subitem3">subitem13 text</sibetum>
    </category>
    <category title="Item 2">item 2 text
        <sobsobitem title="subsubitem2">wrong tag</sobsobitem>
    </category>
    <category title="Item 3">item 3 text
    </category>
</tree>
