I am using lxml to parse an XML document. How can I get the declaration string?
<?xml version="1.0" encoding="utf-8" ?>
I want to check whether it is present, which encoding it specifies, and which XML version.
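When you parse the document, the resulting ElementTree object should have a DocInfo object that contains information about the parsed XML or HTML document.
For XML, you are probably interested in the xml_version and encoding attributes of this DocInfo: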
>>> from lxml import etree
>>> tree = etree.parse('input.xml')
>>> tree.docinfo
<lxml.etree.DocInfo object at 0x7f8111f9ecc0>
>>> tree.docinfo.xml_version
'1.0'
>>> tree.docinfo.encoding
'UTF-8'
Perhaps you should check whether a line with that declaration value can be found in the XML file:
def matchLine(path, line_number, text):
    """
    path = used for defining the file to be checked
    line_number = used to identify the line that will be checked (1-based)
    text = string containing the text to match
    """
    # Use a context manager so the file is closed even on an early return.
    with open(path) as file:
        for line_no, line_file in enumerate(file, start=1):
            if line_no == line_number:
                # Compare without the trailing newline and whitespace.
                return line_file.rstrip() == text
    # The file has fewer than line_number lines.
    return False
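
An exact line comparison breaks when the declaration differs in spacing or quote style, so a regular expression may be more robust. The following is only a minimal sketch using the standard library, not part of lxml. The names parse_xml_declaration and read_xml_declaration are hypothetical, and the sketch assumes the file uses an ASCII-compatible encoding (it would not recognise a UTF-16 document).

```python
import re

# Matches an XML declaration at the very start of the data, for example
# <?xml version="1.0" encoding="utf-8" ?>
# Groups 1, 3 and 5 capture the opening quote so the closing quote must match.
_DECL_RE = re.compile(
    rb'^<\?xml\s+version\s*=\s*(["\'])(?P<version>[^"\']+)\1'
    rb'(?:\s+encoding\s*=\s*(["\'])(?P<encoding>[^"\']+)\3)?'
    rb'(?:\s+standalone\s*=\s*(["\'])(?:yes|no)\5)?'
    rb'\s*\?>'
)


def parse_xml_declaration(head):
    """Return (version, encoding) from leading bytes, or None if no declaration.

    encoding is None when the declaration does not specify one.
    """
    # Skip a UTF-8 byte order mark, if present.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    match = _DECL_RE.match(head)
    if match is None:
        return None
    version = match.group("version").decode("ascii")
    raw_encoding = match.group("encoding")
    encoding = raw_encoding.decode("ascii") if raw_encoding else None
    return version, encoding


def read_xml_declaration(path, max_bytes=1024):
    """Read the start of the file at path and parse its XML declaration."""
    # Binary mode: the encoding is not known before the declaration is read.
    with open(path, "rb") as f:
        return parse_xml_declaration(f.read(max_bytes))
```

Calling read_xml_declaration('input.xml') returns None when the file has no declaration, which answers the "is it present" part of the question directly. The value is taken from the raw text, so the encoding name is returned exactly as written in the file.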