Cannot validate XML against a schema, but it works after writing it to a file and reading it back


I am currently using lxml and want to validate XML content.

I built it entirely in Python, starting from

tei = etree.Element("TEI", nsmap={None: 'http://www.tei-c.org/ns/1.0'})

and adding many child elements.

Now I want to check that the structure is correct against a specific .xsd file, using the following code:

xmlschema_doc = etree.parse(xsd_file_path)
xmlschema = etree.XMLSchema(xmlschema_doc)
# run check
status = xmlschema.validate(xml_tree)

It returns False, and the error reported is

Element 'TEI': No matching global declaration available for the validation root.
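
For reference, a message like the one above can be read from the schema object's error_log after a failed validate() call. The following is only a minimal sketch: the one-element XSD is a hypothetical stand-in for the real .xsd file, and TEI_NS and XSD are names introduced here for illustration. It also prints the root tag, which is a quick way to see whether the element is actually in a namespace.

```python
import io
from lxml import etree

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical one-element schema standing in for the real .xsd file.
XSD = f"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="{TEI_NS}" elementFormDefault="qualified">
  <xs:element name="TEI" type="xs:string"/>
</xs:schema>"""
xmlschema = etree.XMLSchema(etree.parse(io.BytesIO(XSD.encode("utf-8"))))

# Same construction as in the question: plain tag name plus nsmap.
xml_tree = etree.Element("TEI", nsmap={None: TEI_NS})

status = xmlschema.validate(xml_tree)
messages = [entry.message for entry in xmlschema.error_log]

print(status)
for entry in xmlschema.error_log:
    # Each log entry carries the line, an error type name and the message.
    print(entry.line, entry.type_name, entry.message)

# The tag shows whether the element name is namespace-qualified.
print(repr(xml_tree.tag))
```

If the printed tag is a bare name rather than one in the form {namespace}name, the element is not in the namespace the schema declares.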

I noticed something very strange. If I write the XML to a file with

ET = etree.ElementTree(xmlData)
ET.write('test.xml', pretty_print=True, xml_declaration=True, encoding='utf-8')

and then reopen it with

b = etree.parse('test.xml')

I get no error, and xmlschema.validate(b) reports that the XML structure is valid.

Any idea what I need to add to the XML structure?

Edit: first item in the invalid XML

First item in the valid XML file

Edit:

<?xml version='1.0' encoding='UTF-8'?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <listBibl>
        <biblFull>
          <titleStmt>
            <title xml:lang="en">article</title>
            <title xml:lang="fr">article</title>
            <title type="sub" xml:lang="en">A subtitle</title>
            <author role="aut">
              <persName>
                <forename type="first">John</forename>
                <surname>Doe</surname>
              </persName>
              <email>email</email>
              <idno type="http://orcid.org/">orcid</idno>
              <affiliation ref="#localStruct-affiliation"/>
              <affiliation ref="#struct-affiliation"/>
            </author>
            <author role="aut">
              <persName>
                <forename type="first">Jane</forename>
                <forename type="middle">Middle</forename>
                <surname>Doe</surname>
              </persName>
              <email>email</email>
              <idno type="http://orcid.org/">orcid</idno>
              <affiliation ref="#localStruct-affiliationA"/>
              <affiliation ref="#localStruct-affiliationB"/>
            </author>
          </titleStmt>
          <editionStmt>
            <edition>
              <ref type="file" subtype="author" n="1" target="upload.pdf"/>
            </edition>
          </editionStmt>
          <publicationStmt>
            <availability>
              <licence target="https://creativecommons.org/licenses//cc-by/"/>
            </availability>
          </publicationStmt>
          <notesStmt>
            <note type="audience" n="2"/>
            <note type="invited" n="1"/>
            <note type="popular" n="0"/>
            <note type="peer" n="1"/>
            <note type="proceedings" n="0"/>
            <note type="commentary">small comment</note>
            <note type="description">small description</note>
          </notesStmt>
          <sourceDesc>
            <biblStruct>
              <analytic>
                <title xml:lang="en">article</title>
                <title xml:lang="fr">article</title>
                <title type="sub" xml:lang="en">A subtitle</title>
                <author role="aut">
                  <persName>
                    <forename type="first">John</forename>
                    <surname>Doe</surname>
                  </persName>
                  <email>email</email>
                  <idno type="http://orcid.org/">orcid</idno>
                  <affiliation ref="#localStruct-affiliation"/>
                  <affiliation ref="#struct-affiliation"/>
                </author>
                <author role="aut">
                  <persName>
                    <forename type="first">Jane</forename>
                    <forename type="middle">Middle</forename>
                    <surname>Doe</surname>
                  </persName>
                  <email>email</email>
                  <idno type="http://orcid.org/">orcid</idno>
                  <affiliation ref="#localStruct-affiliationA"/>
                  <affiliation ref="#localStruct-affiliationB"/>
                </author>
              </analytic>
              <monogr>
                <idno type="isbn">978-1725183483</idno>
                <idno type="halJournalId">117751</idno>
                <idno type="issn">xxx</idno>
                <imprint>
                  <publisher>springer</publisher>
                  <biblScope unit="serie">a special collection</biblScope>
                  <biblScope unit="volume">20</biblScope>
                  <biblScope unit="issue">1</biblScope>
                  <biblScope unit="pp">10-25</biblScope>
                  <date type="datePub">2024-01-01</date>
                </imprint>
              </monogr>
              <series/>
              <idno type="doi">reg</idno>
              <idno type="arxiv">ger</idno>
              <idno type="bibcode">erg</idno>
              <idno type="ird">greger</idno>
              <idno type="pubmed">greger</idno>
              <idno type="ads">gaergezg</idno>
              <idno type="pubmedcentral">gegzefdv</idno>
              <idno type="irstea">vvxc</idno>
              <idno type="sciencespo">gderg</idno>
              <idno type="oatao">gev</idno>
              <idno type="ensam">xcvcxv</idno>
              <idno type="prodinra">vxcv</idno>
              <ref type="publisher">https://publisher.com/ID</ref>
              <ref type="seeAlso">https://link1.com/ID</ref>
              <ref type="seeAlso">https://link2.com/ID</ref>
              <ref type="seeAlso">https://link3.com/ID</ref>
            </biblStruct>
          </sourceDesc>
          <profileDesc>
            <textClass>
              <keywords scheme="author">
                <term xml:lang="en">keyword1</term>
                <term xml:lang="en">keyword2</term>
                <term xml:lang="fr">mot-clé1</term>
                <term xml:lang="fr">mot-clé2</term>
              </keywords>
              <classCode scheme="halDomain" n="physics"/>
              <classCode scheme="halDomain" n="halDomain2"/>
              <classCode scheme="halTypology" n="ART"/>
            </textClass>
          </profileDesc>
        </biblFull>
      </listBibl>
    </body>
    <back>
      <listOrg type="structures">
        <org type="institution" xml:id="localStruct-affiliation">
          <orgName>laboratory for MC, university of Yeah</orgName>
          <orgName type="acronym">LMC</orgName>
          <desc>
            <address>
              <addrLine>Blue street 155, 552501 Olso, Norway</addrLine>
              <country key="LS">Lesotho</country>
            </address>
            <ref type="url" target="https://lmc.univ-yeah.com"/>
          </desc>
        </org>
        <org type="institution" xml:id="localStruct-affiliationB">
          <orgName>laboratory for MCL, university of Yeah</orgName>
          <orgName type="acronym">LMCL</orgName>
          <desc>
            <address>
              <addrLine>Blue street 155, 552501 Olso, Norway</addrLine>
              <country key="NO">Norway</country>
            </address>
            <ref type="url" target="https://lmcl.univ-yeah.com"/>
          </desc>
        </org>
      </listOrg>
    </back>
  </text>
</TEI>

python xml xsd lxml
1 answer

Have a look at https://lxml.de/tutorial.html#namespaces. You should basically use

from lxml import etree

TEI_NAMESPACE = "http://www.tei-c.org/ns/1.0"
TEI = "{%s}" % TEI_NAMESPACE

NSMAP = {None: TEI_NAMESPACE}  # the default namespace (no prefix)

# Clark notation "{namespace}tag" puts each element in the TEI namespace.
root = etree.Element(TEI + "TEI", nsmap=NSMAP)  # lxml only!
text = etree.SubElement(root, TEI + "text")

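And so on for every element, so that each one is created in the TEI namespace. The nsmap argument only declares the xmlns mapping used when serializing; it does not put an element created with the plain tag "TEI" into that namespace. That is why the in-memory tree fails validation, while the written file parses back with every element in the default namespace and validates.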
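
The difference can be reproduced in a few lines. This is a minimal sketch, not the real setup: the XSD below is a tiny hypothetical stand-in for the actual TEI schema, and the names TEI_NS, XSD, bad, good and reparsed are introduced here only for illustration. It builds the same two-element tree twice, once with plain tags and once with namespace-qualified tags, and also round-trips the plain one through serialization.

```python
import io
from lxml import etree

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical stand-in schema: a global TEI element in the TEI namespace
# holding a single <text> child. The real .xsd is much larger.
XSD = f"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="{TEI_NS}" elementFormDefault="qualified">
  <xs:element name="TEI">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="text" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""
schema = etree.XMLSchema(etree.parse(io.BytesIO(XSD.encode("utf-8"))))

# Plain tags: nsmap declares xmlns for output, but the element names
# themselves are in no namespace.
bad = etree.Element("TEI", nsmap={None: TEI_NS})
etree.SubElement(bad, "text").text = "hello"

# Qualified tags in Clark notation: {namespace}localname.
good = etree.Element(f"{{{TEI_NS}}}TEI", nsmap={None: TEI_NS})
etree.SubElement(good, f"{{{TEI_NS}}}text").text = "hello"

# Serializing and re-parsing the plain tree: the parser applies the
# default namespace declaration to every unprefixed element it reads.
reparsed = etree.fromstring(etree.tostring(bad))

bad_ok = schema.validate(bad)
good_ok = schema.validate(good)
reparsed_ok = schema.validate(reparsed)

print(repr(bad.tag), bad_ok)
print(repr(good.tag), good_ok)
print(repr(reparsed.tag), reparsed_ok)
```

Comparing the printed tags shows what the schema sees: only names in the form {http://www.tei-c.org/ns/1.0}TEI can match a global declaration in that target namespace. Qualifying the tags at creation time avoids the need for a write-and-reparse step.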
