Unescape an XML file with xmlstarlet

Question (votes: 0, answers: 3)

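I have two XML files, A and B. I want to insert certain element nodes from A into B. The insertion itself works, but the imported elements end up escaped in B instead of being real markup: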

XML file A:

<?xml version="1.0"?>
<article>
  <publication>
     <authors>
      <author>
        <givenname locale="en_US">Admin</givenname>
        <givenname locale="nb_NO">Admin</givenname>
        <familyname locale="en_US">Septentrio</familyname>
        <email>[email protected]</email>
      </author>
      <author include_in_browse="true" user_group_ref="Forfatter" seq="0" id="13075">
        <givenname locale="nb_NO">Peter</givenname>
        <familyname locale="nb_NO">Nilsen</familyname>
        <affiliation locale="nb_NO">NTL University</affiliation>
        <country>NO</country>
        <email>[email protected]</email>
      </author>
    </authors>
  </publication>
</article>

XML file B:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.2 20190208//EN" "JATS-archivearticle1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.2" article-type="other">
  <front>
  </front>
</article>

When I run:

xmlstarlet ed -L -s "//article/front" -t elem -n "newchild"  -v "$(xmlstarlet  sel  --omit-decl  -t -c "/article/publication/authors/author" a.xml)" b.xml

it gives me the following in file B:

<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.2 20190208//EN" "JATS-archivearticle1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.2" article-type="other">
  <front>
  <newchild>&lt;author&gt;
        &lt;givenname locale="en_US"&gt;Admin&lt;/givenname&gt;
        &lt;givenname locale="nb_NO"&gt;Admin&lt;/givenname&gt;
        &lt;familyname locale="en_US"&gt;Septentrio&lt;/familyname&gt;
        &lt;email&gt;[email protected]&lt;/email&gt;
        &lt;/author&gt;&lt;author include_in_browse="true" user_group_ref="Forfatter" seq="0" id="13075"&gt;
        &lt;givenname locale="nb_NO"&gt;Peter&lt;/givenname&gt;
        &lt;familyname locale="nb_NO"&gt;Nilsen&lt;/familyname&gt;
        &lt;affiliation locale="nb_NO"&gt;NTL University&lt;/affiliation&gt;
        &lt;country&gt;NO&lt;/country&gt;
        &lt;email&gt;[email protected]&lt;/email&gt;
      &lt;/author&gt;</newchild></front>
</article>

How can I run unesc on the result?

Thanks.

大风藏

xml xpath escaping xmlstarlet
3 answers

0 votes

Using xmllint, create newchild in file B and set its content to the //author elements from file A:

(printf '%s\n' "cd //article/front" "set <newchild></newchild>" "cd newchild" "set $(xmllint --xpath "//author" a.xml | sed -ze 's/\n/\&#xA;/g')" "save" "bye") | xmllint --shell b.xml

Single-line output:

/ > cd //article/front
front > set <newchild></newchild>
front > cd newchild
newchild > set <author>&#xA;        <givenname locale="en_US">Admin</givenname>&#xA;        <givenname locale="nb_NO">Admin</givenname>&#xA;        <familyname locale="en_US">Septentrio</familyname>&#xA;        <email>[email protected]</email>&#xA;      </author>&#xA;<author include_in_browse="true" user_group_ref="Forfatter" seq="0" id="13075">&#xA;        <givenname locale="nb_NO">Peter</givenname>&#xA;        <familyname locale="nb_NO">Nilsen</familyname>&#xA;        <affiliation locale="nb_NO">NTL University</affiliation>&#xA;        <country>NO</country>&#xA;        <email>[email protected]</email>&#xA;      </author>&#xA;
newchild > save
newchild > bye

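sed -ze 's/\n/\&#xA;/g' (the -z option is a GNU sed extension) replaces literal newlines with the character reference &#xA;, so the multi-line fragment is not split across several lines when it is piped into xmllint --shell.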

0 votes

Like this:

$ xmlstarlet unesc < file.xml
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.2 20190208//EN" "JATS-archivearticle1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.2" article-type="other">
  <front>
  <newchild><author>
        <givenname locale="en_US">Admin</givenname>
        <givenname locale="nb_NO">Admin</givenname>
        <familyname locale="en_US">Septentrio</familyname>
        <email>[email protected]</email>
        </author><author include_in_browse="true" user_group_ref="Forfatter" seq="0" id="13075">
        <givenname locale="nb_NO">Peter</givenname>
        <familyname locale="nb_NO">Nilsen</familyname>
        <affiliation locale="nb_NO">NTL University</affiliation>
        <country>NO</country>
        <email>[email protected]</email>
      </author></newchild></front>
</article>

To edit in place:

$ xmlstarlet unesc < file.xml | sponge file.xml
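
One caveat, sketched below in Python (an assumption for illustration; the answer itself only uses xmlstarlet): unescaping a whole document converts every escaped character into markup, not only the inserted fragment. That works for the sample files, but a document whose text legitimately contains &amp; or &lt; may stop being well-formed. The sketch uses xml.sax.saxutils.unescape as a stand-in for the same idea and does not claim to match xmlstarlet unesc in every detail; is_well_formed is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import unescape


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Escaped fragment, as produced by inserting markup as a text value.
escaped = "<newchild>&lt;author&gt;&lt;givenname&gt;Admin&lt;/givenname&gt;&lt;/author&gt;</newchild>"
restored = unescape(escaped)

# Before: newchild holds only text. After: it holds a real author element.
children_before = len(ET.fromstring(escaped))
children_after = len(ET.fromstring(restored))

# Text that must stay escaped: a blanket unescape produces a bare ampersand.
risky = "<title>Research &amp; Development</title>"
risky_unescaped = unescape(risky)

print(children_before, children_after)
print(is_well_formed(risky), is_well_formed(risky_unescaped))
```

If the target file may contain such text, copying the nodes as nodes avoids the problem altogether.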

0 votes

Another approach is to use XSLT. For example, here is a simple XSLT stylesheet that uses the XPath expressions from your example. I call it insert.xsl:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:param name="author-doc"/>
    <xsl:template match="*">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="article/front">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
            <newchild>
                <xsl:copy-of select="document($author-doc)/article/publication/authors/author"/>
            </newchild>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Invoke it with:

xmlstarlet tr insert.xsl -s author-doc=a.xml b.xml
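
For comparison, the same node copy can be sketched with Python's standard xml.etree.ElementTree module. This is an assumed alternative and not part of the answer; the function name insert_authors and the abbreviated sample documents are hypothetical. ElementTree drops the DOCTYPE declaration when it parses and serializes, so for the JATS file it would have to be added back separately.

```python
import copy
import xml.etree.ElementTree as ET

# Abbreviated stand-ins for a.xml and b.xml (hypothetical sample data).
A_XML = """<article>
  <publication>
    <authors>
      <author><givenname locale="en_US">Admin</givenname></author>
      <author id="13075"><givenname locale="nb_NO">Peter</givenname></author>
    </authors>
  </publication>
</article>"""

B_XML = """<article dtd-version="1.2" article-type="other">
  <front>
  </front>
</article>"""


def insert_authors(a_xml: str, b_xml: str) -> str:
    """Copy every author element from document A into a new front/newchild in B."""
    a_root = ET.fromstring(a_xml)
    b_root = ET.fromstring(b_xml)
    front = b_root.find("front")
    newchild = ET.SubElement(front, "newchild")
    for author in a_root.findall("./publication/authors/author"):
        # Append element nodes, not a string, so nothing is escaped on output.
        newchild.append(copy.deepcopy(author))
    return ET.tostring(b_root, encoding="unicode")


result = insert_authors(A_XML, B_XML)
print(result)
```

Because the authors are appended as elements rather than set as a text value, the serialized result contains real author tags and no &lt; sequences.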