Python SAX parser: resolveEntity

Question

I'm having trouble figuring out how to bind my own ResolveEntityHandler to a SAX parser. There is this answer on SO, but unfortunately I can't reproduce its result here.

When I run the following code, which is essentially copied from that answer and only updated for Python 3,

import io
import xml.sax
from xml.sax.handler import ContentHandler

# Inheriting from EntityResolver and DTDHandler is not necessary
class TestHandler(ContentHandler):

    # This method is only called for external entities. Must return a value.
    def resolveEntity(self, publicID, systemID):
        print("TestHandler.resolveEntity(): %s %s" % (publicID, systemID))
        return systemID

    def skippedEntity(self, name):
        print("TestHandler.skippedEntity(): %s" % name)

    def unparsedEntityDecl(self, name, publicID, systemID, ndata):
        print("TestHandler.unparsedEntityDecl(): %s %s" % (publicID, systemID))

    def startElement(self, name, attrs):
        summary = attrs.get('summary', '')
        print('TestHandler.startElement():', summary)

def main(xml_string):
    try:
        parser = xml.sax.make_parser()
        curHandler = TestHandler()
        parser.setContentHandler(curHandler)
        parser.setEntityResolver(curHandler)
        parser.setDTDHandler(curHandler)

        stream = io.StringIO(xml_string)
        parser.parse(stream)
        stream.close()
    except xml.sax.SAXParseException as e:
        print("ERROR %s" % e)

XML = """<!DOCTYPE test SYSTEM "test.dtd">
<test summary='step: &num;'>Entity: &not;</test>
"""

main(XML)

together with the external test.dtd

<!ENTITY num "FOO">
<!ENTITY pic SYSTEM 'bar.gif' NDATA gif>

the output I get is

TestHandler.startElement(): step: 
TestHandler.skippedEntity(): not

Process finished with exit code 0

So my questions are:

  1. Why is resolveEntity never called?
  2. How do I bind a ResolveEntityHandler to the parser?
python xml sax entityresolver
1 Answer

What you are seeing is the result of a change in Python 3.7.1:

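Changed in version 3.7.1: The SAX parser no longer processes general external entities by default to increase security. Before, the parser created network connections to fetch remote files or loaded local files from the file system for DTD and entities. The feature can be enabled again with method setFeature() on the parser object and argument feature_external_ges.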
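
The default described in that note can be read back from the parser with getFeature(). This is a minimal sketch, assuming the default expat-based parser returned by xml.sax.make_parser() on Python 3.7.1 or later; the variable names are only illustrative.

```python
import xml.sax
from xml.sax.handler import feature_external_ges

parser = xml.sax.make_parser()

# Assumption: on Python 3.7.1+ the feature is off unless enabled explicitly.
default_state = parser.getFeature(feature_external_ges)

# Opt back in to processing general external entities.
parser.setFeature(feature_external_ges, True)
enabled_state = parser.getFeature(feature_external_ges)

print("default:", default_state, "after setFeature:", enabled_state)
```

Because the default was changed for security reasons, it is probably wise to enable the feature only for input you trust.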

To get the same behavior as in earlier versions, add the following line:

from xml.sax.handler import feature_external_ges

and (in the main function)

parser.setFeature(feature_external_ges, True)
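
Putting the pieces together, here is a hedged, self-contained sketch of a resolver that the parser actually calls once the feature is enabled. It is not the code from the question: RecordingHandler, parse_with_resolver and DTD_BYTES are hypothetical names, and the DTD is served from memory through an InputSource instead of being read from test.dtd on disk, so the example runs without extra files.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical in-memory stand-in for test.dtd, so no file is needed.
DTD_BYTES = b'<!ENTITY num "FOO">\n'

XML = """<!DOCTYPE test SYSTEM "test.dtd">
<test summary='step: &num;'>text</test>
"""


class RecordingHandler(ContentHandler, EntityResolver):
    """Records resolver calls and the 'summary' attribute of each element."""

    def __init__(self):
        super().__init__()
        self.resolved = []
        self.summaries = []

    def resolveEntity(self, publicId, systemId):
        # Called for external entities only when feature_external_ges is on.
        self.resolved.append((publicId, systemId))
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(DTD_BYTES))
        return source

    def startElement(self, name, attrs):
        self.summaries.append(attrs.get("summary", ""))


def parse_with_resolver(xml_text):
    parser = xml.sax.make_parser()
    handler = RecordingHandler()
    parser.setContentHandler(handler)
    parser.setEntityResolver(handler)
    # Without this line the resolver is not consulted on Python 3.7.1+.
    parser.setFeature(feature_external_ges, True)
    parser.parse(io.StringIO(xml_text))
    return handler


handler = parse_with_resolver(XML)
print("resolved:", handler.resolved)
print("summaries:", handler.summaries)
```

Returning an InputSource with its own byte stream keeps the parser from opening the system ID itself, which is one way to control exactly what an external reference may load.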