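Comment out the <xref> tag and keep its text as plain text if xref/@href does not match any section/@id in the XML files of the same folder or of other folders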


If xref/@href does not match any section/@id in an XML file of the same folder or of another folder, we want to comment out the <xref> tag:

Every folder contains one or more XML files, and xref/@href and section/@id can occur in any of these XML files.

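  1. The tag <xref href="aag-dep1_1.01">some text here</xref> is in the "aat-fti" folder, but the section it points to is in the eat-rw folder. If xref/@href and section/@id match, keep the xref tag unchanged; if they do not match, comment out the xref tag and keep its content as plain text.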

Please suggest a solution. Thanks.

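Please refer to the screenshot below for the folder structure. Every folder contains XML files, and a matching xref/@href or section/@id may be in any of the XML documents: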

Input XML in the aa-fti folder:

<?xml version="1.0" encoding="UTF-8"?>
<book id="book_id">
    <title>Generally Accepted Accounting Principles</title>
    <chapter id="chapter_id" role="ls_level_2">
        <chapterinfo>
            <titleabbrev>Chapter_abb</titleabbrev>
            <title>Chapter_title</title>
        </chapterinfo>
        <section id="aag-dep1_01">
            <para>text here</para>
            <para>text here</para>
            <para>containing auditing <xref href="fir_56_10">some text here</xref> guidance related to generally accepted auditing standard</para>
            <para> the effective dates for FASB ASU No. 2018 <xref href="aag-dep1_1.01">some text here</xref></para>
            <section id="aag-dep1_1.01">
                <para>text here <xref href="fot_79_ut">some text here</xref></para>
                <para>text here</para>
                <para>text here<xref href="aag-dep1_01">some text here</xref></para>
            </section>
            <section id="aag-dep1_2.01">
                <para>text here</para>
                <para>text here</para>
                <para>text here <xref href="aag-dep1_02">some text here</xref></para>
            </section>
        </section>
        <section id="aag-dep1_02">
            <para>text here</para>
            <para>text here</para>
            <para>ces, including engagements for entities in specia</para>
            <para>example, a large calendar-year public insurance en</para>
            <section id="aag-dep1_1.02">
                <para>text here</para>
                <para>text here</para>
                <para>text here <xref href="tih52_23">some text here</xref></para>
            </section>
        </section>
        <section id="aag-dep1_regulation_and_oversight">
            <para>text <xref href="aag-dep1_1.02">some text here</xref> here</para>
            <para>text here</para>
            <para>early application may do so as of the beginning</para>
            <para>Other auditing publications have no authoritative status;</para>
            <section id="aag-dep1_08">
                <para>text here <xref href="aag-dep1_regulation_and_oversight">some text here</xref></para>
                <para>text <xref href="nov1_22">some text here</xref> here</para>
                <para>text here</para>
            </section>
        </section>
    </chapter>
</book>

Expected output XML file in the eat-rw folder:

<?xml version="1.0" encoding="UTF-8"?>
<book id="book_id">
    <title>Generally Accepted Accounting Principles</title>
    <chapter id="chapter_id" role="ls_level_2">
        <chapterinfo>
            <titleabbrev>Chapter_abb</titleabbrev>
            <title>Chapter_title</title>
        </chapterinfo>
        <section id="aag-dep1_01">
            <para>text here</para>
            <para>text here</para>
            <para>containing auditing <!--<xref href="fir_56_10">-->some text here<!--</xref>--> guidance related to generally accepted auditing standard</para>
            <para> the effective dates for FASB ASU No. 2018 <xref href="aag-dep1_1.01">some text here</xref></para>
            <section id="aag-dep1_1.01">
                <para>text here <!--<xref href="fot_79_ut">-->some text here<!--</xref>--></para>
                <para>text here</para>
                <para>text here<xref href="aag-dep1_01">some text here</xref></para>
            </section>
            <section id="aag-dep1_2.01">
                <para>text here</para>
                <para>text here</para>
                <para>text here <xref href="aag-dep1_02">some text here</xref></para>
            </section>
        </section>
        <section id="aag-dep1_02">
            <para>text here</para>
            <para>text here</para>
            <para>ces, including engagements for entities in specia</para>
            <para>example, a large calendar-year public insurance en</para>
            <section id="aag-dep1_1.02">
                <para>text here</para>
                <para>text here</para>
                <para>text here <!--<xref href="tih52_23">-->some text here<!--</xref>--></para>
            </section>
        </section>
        <section id="aag-dep1_regulation_and_oversight">
            <para>text <xref href="aag-dep1_1.02">some text here</xref> here</para>
            <para>text here</para>
            <para>early application may do so as of the beginning</para>
            <para>Other auditing publications have no authoritative status;</para>
            <section id="aag-dep1_08">
                <para>text here <xref href="aag-dep1_regulation_and_oversight">some text here</xref></para>
                <para>text <!--<xref href="nov1_22">-->some text here<!--</xref>--> here</para>
                <para>text here</para>
            </section>
        </section>
    </chapter>
</book>
xml xslt-1.0 saxon dtd xslt-3.0
1 answer

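Using XSLT 3 with Saxon 9.9: if you put the following XSLT into the parent folder of all the subfolders you have shown, it processes all *.xml files recursively and writes each transformed result to, for example, subfoldername-output: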

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">
  
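  <!-- Select every *.xml file, including those in subfolders, relative to the stylesheet's location -->
  <xsl:param name="collection-uri" select="'?select=*.xml;recurse=yes'"/>
  
  <xsl:param name="collection-docs" select="collection($collection-uri)"/>
  
  <!-- Index section elements by @id so that xref/@href values can be looked up in each document -->
  <xsl:key name="section" match="section" use="@id"/>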

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template name="xsl:initial-template">
    <xsl:apply-templates select="$collection-docs"/>
  </xsl:template>
  
  <xsl:template match="/">
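    <!-- Append '-output' to the parent folder name: .../folder/file.xml becomes .../folder-output/file.xml -->
    <xsl:variable name="result-uri"
      select="let $uri-tokens := tokenize(base-uri(), '/')
              return string-join(
                ($uri-tokens[position() lt last() - 1],
                 $uri-tokens[last() - 1] || '-output',
                 $uri-tokens[last()]),
                '/')"/>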
    <xsl:result-document href="{$result-uri}">
      <xsl:apply-templates/>
    </xsl:result-document>
  </xsl:template>

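  <!-- An xref whose @href matches no section/@id in any collected document: wrap its tags in comments, keep its content -->
  <xsl:template match="xref[not(some $doc in $collection-docs satisfies key('section', @href, $doc))]">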
    <xsl:comment>&lt;xref href="<xsl:value-of select="@href"/>"&gt;</xsl:comment><xsl:apply-templates/><xsl:comment>&lt;/xref&gt;</xsl:comment>
  </xsl:template>
  
</xsl:stylesheet>

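Start processing with the initial template, without an input file (on Saxon's command line, this is the -it option). I think the approach should work for a simple folder structure (for example, a parent folder holding the XSLT and one level of subfolders that contain the XML documents to process), but test it carefully on some sample data before you run it on a complete set of folders.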
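One way to follow the advice about testing first is a dry run that only reports which xref elements would be commented out, without writing any files. The stylesheet below is a minimal sketch, not part of the original answer: it reuses the same collection-uri and collection-docs parameters and the same section key, and the idea of printing one "document URI: href" line per unmatched xref is an assumption made for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Dry-run sketch (illustrative): lists every xref/@href that has no matching
     section/@id in any collected document. It writes no result documents. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  expand-text="yes">

  <!-- Same collection settings as the stylesheet in the answer -->
  <xsl:param name="collection-uri" select="'?select=*.xml;recurse=yes'"/>
  <xsl:param name="collection-docs" select="collection($collection-uri)"/>

  <!-- Same lookup key: section elements indexed by @id -->
  <xsl:key name="section" match="section" use="@id"/>

  <!-- Plain-text report instead of XML output -->
  <xsl:output method="text"/>

  <xsl:template name="xsl:initial-template">
    <!-- Same test as the match pattern in the answer, applied to all xref elements -->
    <xsl:for-each select="$collection-docs//xref[not(some $doc in $collection-docs satisfies key('section', @href, $doc))]">
      <!-- One line per unmatched xref: the URI of its document, then the href value -->
      <xsl:text>{base-uri(.)}: {@href}&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Like the main stylesheet, this sketch is meant to be started at its initial template with no source document. One thing worth checking when testing: the *-output folders are written next to the input folders, so on a second run the recursive collection would presumably pick up their *.xml files as well; removing them before rerunning is probably the safest option.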
