How to get the XML declaration string using lxml

Question (votes: 1, answers: 2)

I use lxml to parse an XML document. How can I get the declaration string?

 <?xml version="1.0" encoding="utf-8" ?> 

I want to check whether it is present, which encoding it declares, and which XML version it specifies.

python xml lxml
2 Answers
2
votes

When you parse a document, the resulting ElementTree object should have a DocInfo object that holds information about the parsed XML or HTML document.

For XML, you are probably interested in the xml_version and encoding attributes of DocInfo:

>>> from lxml import etree
>>> tree = etree.parse('input.xml')
>>> tree.docinfo
<lxml.etree.DocInfo object at 0x7f8111f9ecc0>
>>> tree.docinfo.xml_version
'1.0'
>>> tree.docinfo.encoding
'UTF-8'
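
The same attributes can be read from in-memory data rather than a file. The following is a minimal sketch, not a definitive recipe: the helper name docinfo_summary is hypothetical, and the comment about standalone reflects my reading of the lxml documentation, so treat it as an assumption to verify against your lxml version. Note that xml_version and encoding may fall back to parser defaults when the declaration is missing, so they may not be enough on their own to prove that a declaration exists.

```python
import io

from lxml import etree


def docinfo_summary(data):
    """Parse XML bytes and return the declaration-related DocInfo fields.

    Hypothetical helper for illustration; `data` must be bytes so that an
    encoding declaration inside the document is accepted by the parser.
    """
    tree = etree.parse(io.BytesIO(data))
    info = tree.docinfo
    return {
        "xml_version": info.xml_version,
        "encoding": info.encoding,
        # Assumption: lxml documents `standalone` as None when no
        # declaration was specified; check this for your version.
        "standalone": info.standalone,
    }


if __name__ == "__main__":
    sample = b'<?xml version="1.0" encoding="utf-8" ?><root/>'
    print(docinfo_summary(sample))
```

The encoding is reported as written in the declaration, so compare it case-insensitively if you need to test for a specific value.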

0
votes

Perhaps you could check whether a line matching the declaration string can be found in the XML file:

    def matchLine(path, line_number, text):
        """
        path = path of the file to check
        line_number = 1-based number of the line to check
        text = string the line must match (trailing whitespace is ignored)
        """
        # Iterate over the file instead of looping until an empty string,
        # so that blank lines do not end the scan early.
        with open(path) as file:
            for line_no, line_file in enumerate(file, start=1):
                if line_no == line_number:
                    return line_file.rstrip() == text
        # The file has fewer than line_number lines.
        return False
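
Comparing a whole line against a fixed string is brittle, because the declaration may use different quotes or spacing. As a rough alternative sketch that is not part of the original answer, the start of the file can be matched with a regular expression that also extracts the version and encoding. The names _DECL_RE and read_declaration are hypothetical, and the sketch assumes an ASCII-compatible file encoding (it will not recognise UTF-16 files, for example).

```python
import re

# Hypothetical pattern: an XML declaration at the very start of the data.
# Groups 1, 3 and 5 capture the quote character so both ' and " work.
_DECL_RE = re.compile(
    rb'^<\?xml\s+version\s*=\s*(["\'])(?P<version>[0-9.]+)\1'
    rb'(?:\s+encoding\s*=\s*(["\'])(?P<encoding>[A-Za-z][A-Za-z0-9._-]*)\3)?'
    rb'(?:\s+standalone\s*=\s*(["\'])(?P<standalone>yes|no)\5)?'
    rb'\s*\?>'
)


def read_declaration(path):
    """Return (version, encoding) from the XML declaration, or None if absent.

    The encoding is None when the declaration does not specify one.
    """
    with open(path, "rb") as f:
        head = f.read(200)  # the declaration must come first in the file
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]  # skip a UTF-8 byte order mark
    match = _DECL_RE.match(head)
    if match is None:
        return None
    encoding = match.group("encoding")
    return (
        match.group("version").decode("ascii"),
        encoding.decode("ascii") if encoding else None,
    )
```

A result of None means no declaration was found at the start of the file. Any other result is the declared version together with the declared encoding, if any.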