xml.etree.ElementTree not parsing one specific attribute value

Question

The data file testfile.xml looks like this:

<?xml version="1.0" encoding="utf-8"?>
<body>
  <body.head>
    <hedline>
      <hl1 style="header">All the things we lost that summer</hl1>
      <hl2 style="standfirst">It was the promise of seals that sold Virginia on this mission.</hl2>
      <hl2 style="dropcap-large"><em class="dropcap">W</em>e are always calling each other names.</hl2>
    </hedline>
  </body.head>
</body>

The script that parses the file looks like this:

import xml.etree.ElementTree as ET
tree = ET.parse('testfile.xml')
root = tree.getroot()
if root.find('body.head') is not None:
    if root.find('body.head').find('hedline') is not None:
        for child1 in root.find('body.head').find('hedline'):
            print("Tag    level 1:" + child1.tag)
            print("Attrib level 1:" + str(child1.attrib))
            print("Text   level 1:" + str(child1.text) + "\n")
            for child2 in child1:
                print("Tag    level 2:" + child2.tag)
                print("Attrib level 2:" + str(child2.attrib))
                print("Text   level 2:" + str(child2.text))

This is the result:

Tag    level 1:hl1
Attrib level 1:{'style': 'header'}
Text   level 1:All the things we lost that summer

Tag    level 1:hl2
Attrib level 1:{'style': 'standfirst'}
Text   level 1:It was the promise of seals that sold Virginia on this mission.

Tag    level 1:hl2
Attrib level 1:{'style': 'dropcap-large'}
Text   level 1:None  <-- THIS IS THE PROBLEM

Tag    level 2:em
Attrib level 2:{'class': 'dropcap'}
Text   level 2:W

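I would expect the third "Text   level 1:" line to report the value "e are always calling each other names." from the data file, but the script fails to pick it up, so it ends up as None. How can this be parsed correctly? This is Python 3.12 on Windows.

Thanks, Martin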

Answer 1

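That is because in ElementTree (and lxml) this text is the .tail of the em element. An element's .text attribute holds only the text that comes before its first child element. Here hl2 has no text before em, so hl2.text is None, and the words that follow the closing </em> tag are stored on em.tail instead.

See the ElementTree documentation on tail for more information.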
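As a minimal sketch of this behaviour, the snippet below parses only the third headline from the question, inlined as a string so it runs without testfile.xml. The variable names (xml_source, hl2, em, full_text) are chosen here for illustration. It reads the trailing text from em.tail and, as one possible way to get the whole headline, joins the pieces returned by itertext().

```python
import xml.etree.ElementTree as ET

# Inline copy of the third headline from testfile.xml (assumption: used here
# only so the sketch is self-contained).
xml_source = (
    '<hl2 style="dropcap-large">'
    '<em class="dropcap">W</em>e are always calling each other names.'
    '</hl2>'
)

hl2 = ET.fromstring(xml_source)
em = hl2.find('em')

# hl2 has no text before its first child, so .text is None.
print("hl2.text:", hl2.text)

# The text after </em> belongs to the em element as its tail.
print("em.text: ", em.text)
print("em.tail: ", em.tail)

# itertext() walks the element and its descendants in document order,
# yielding text and tail strings, so joining them gives the full headline.
full_text = "".join(hl2.itertext())
print("full:    ", full_text)
```

In the original script, printing child2.tail inside the inner loop would show the missing text, and "".join(child1.itertext()) would give the complete headline for each hl1 or hl2 element.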
