XML formatting is not preserved when saving

Question (votes: 0, answers: 3)

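I'm writing a PowerShell script that modifies an XML file. I haven't really worked with XML before, so I find this confusing. I've figured out how to load the file, search it, insert elements and attributes, and save the changes. The problem is that the formatting of the original XML file is not preserved when I save. The namespace lines in particular get badly mangled. For context, I'm working with the Apache Tomcat web.xml file located in the conf folder.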

Below is a snippet of the original XML file, with some lines omitted, to give you an idea of the original formatting:

<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
                      http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd"
  version="4.0">
  


  <!-- ======================== Introduction ============================== -->
  <!-- This document defines default values for *all* web applications      -->
  <!-- loaded into this instance of Tomcat.  As each application is         -->
  <!-- deployed, this file is processed, followed by the                    -->
  <!-- "/WEB-INF/web.xml" deployment descriptor from your own               -->
  <!-- applications.                                                        -->
  <!--                                                                      -->
  <!-- WARNING:  Do not configure application-specific resources here!      -->
  <!-- They should go in the "/WEB-INF/web.xml" file in your application.   -->


  <!-- ================== Built In Servlet Definitions ==================== -->


  <!-- The default servlet for all web applications, that serves static     -->
  <!-- resources.  It processes all requests that are not mapped to other   -->
  <!-- servlets with servlet mappings (defined either here or in your own   -->
  <!-- web.xml file).  This servlet supports the following initialization   -->
  <!-- parameters (default values are in square brackets):                  -->

    <servlet>
        <servlet-name>default</servlet-name>
        <servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
        <init-param>
            <param-name>debug</param-name>
            <param-value>0</param-value>
        </init-param>
        <init-param>
            <param-name>listings</param-name>
            <param-value>false</param-value>
        </init-param>
        <load-on-startup>1</load-on-startup>
    </servlet>

</web-app>

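I've tried a number of things with varying results, none of them satisfactory. I want to insert some elements beneath an existing element and save the changes while keeping the original formatting shown in the snippet above. The problem is unrelated to the edits I make, because it also occurs when I load the XML file and save it immediately. After saving, the formatting is broken in one way or another, depending on what I tried.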

I've been using the .NET System.Xml.XmlDocument class to load and save the XML file. I've also tried the XmlWriter and XmlWriterSettings classes.

Here is what I've tried, along with the results.

Code:

$webXml = New-Object System.Xml.XmlDocument
$xmlPath = "C:\path\to\web.xml"
$webXml.Load($xmlPath)
$webXml.Save($xmlPath)

Result:

<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee&#xD;&#xA;                      http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd" version="4.0">
  <!-- ======================== Introduction ============================== -->
  <!-- This document defines default values for *all* web applications      -->
  <!-- loaded into this instance of Tomcat.  As each application is         -->
  <!-- deployed, this file is processed, followed by the                    -->
  <!-- "/WEB-INF/web.xml" deployment descriptor from your own               -->
  <!-- applications.                                                        -->
  <!--                                                                      -->
  <!-- WARNING:  Do not configure application-specific resources here!      -->
  <!-- They should go in the "/WEB-INF/web.xml" file in your application.   -->
  <!-- ================== Built In Servlet Definitions ==================== -->
  <!-- The default servlet for all web applications, that serves static     -->
  <!-- resources.  It processes all requests that are not mapped to other   -->
  <!-- servlets with servlet mappings (defined either here or in your own   -->
  <!-- web.xml file).  This servlet supports the following initialization   -->
  <!-- parameters (default values are in square brackets):                  -->
  <servlet>
    <servlet-name>default</servlet-name>
    <servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
    <init-param>
      <param-name>debug</param-name>
      <param-value>0</param-value>
    </init-param>
    <init-param>
      <param-name>listings</param-name>
      <param-value>false</param-value>
    </init-param>
    <load-on-startup>1</load-on-startup>
  </servlet>
</web-app>

Code:

$webXml = New-Object System.Xml.XmlDocument
$webXml.PreserveWhitespace = $true
$xmlPath = "C:\path\to\web.xml"
$webXml.Load($xmlPath)
$webXml.Save($xmlPath)

Result:

<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee&#xD;&#xA;                      http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd" version="4.0">


  <!-- ======================== Introduction ============================== -->
  <!-- This document defines default values for *all* web applications      -->
  <!-- loaded into this instance of Tomcat.  As each application is         -->
  <!-- deployed, this file is processed, followed by the                    -->
  <!-- "/WEB-INF/web.xml" deployment descriptor from your own               -->
  <!-- applications.                                                        -->
  <!--                                                                      -->
  <!-- WARNING:  Do not configure application-specific resources here!      -->
  <!-- They should go in the "/WEB-INF/web.xml" file in your application.   -->


  <!-- ================== Built In Servlet Definitions ==================== -->


  <!-- The default servlet for all web applications, that serves static     -->
  <!-- resources.  It processes all requests that are not mapped to other   -->
  <!-- servlets with servlet mappings (defined either here or in your own   -->
  <!-- web.xml file).  This servlet supports the following initialization   -->
  <!-- parameters (default values are in square brackets):                  -->
    <servlet>
        <servlet-name>default</servlet-name>
        <servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
        <init-param>
            <param-name>debug</param-name>
            <param-value>0</param-value>
        </init-param>
        <init-param>
            <param-name>listings</param-name>
            <param-value>false</param-value>
        </init-param>
        <load-on-startup>1</load-on-startup>
    </servlet>

</web-app>

Code:

$xmlDoc = New-Object System.Xml.XmlDocument
$xmlPath = "C:\path\to\web.xml"
$xmlDoc.Load($xmlPath)

# Create a new instance of XmlWriterSettings and set the properties
$settings = New-Object System.Xml.XmlWriterSettings
$settings.Indent = $true
$settings.IndentChars = "`t"
$settings.NewLineChars = "`r`n"
$settings.NewLineHandling = [System.Xml.NewLineHandling]::Replace
$settings.Encoding = [System.Text.Encoding]::UTF8

# Create a new instance of XmlWriter and save the document
$writer = [System.Xml.XmlWriter]::Create($xmlPath, $settings)
$xmlDoc.Save($writer)
$writer.Flush()
$writer.Close()

Result:

<?xml version="1.0" encoding="utf-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee&#xD;&#xA;                      http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd" version="4.0">
    <!-- ======================== Introduction ============================== -->
    <!-- This document defines default values for *all* web applications      -->
    <!-- loaded into this instance of Tomcat.  As each application is         -->
    <!-- deployed, this file is processed, followed by the                    -->
    <!-- "/WEB-INF/web.xml" deployment descriptor from your own               -->
    <!-- applications.                                                        -->
    <!--                                                                      -->
    <!-- WARNING:  Do not configure application-specific resources here!      -->
    <!-- They should go in the "/WEB-INF/web.xml" file in your application.   -->
    <!-- ================== Built In Servlet Definitions ==================== -->
    <!-- The default servlet for all web applications, that serves static     -->
    <!-- resources.  It processes all requests that are not mapped to other   -->
    <!-- servlets with servlet mappings (defined either here or in your own   -->
    <!-- web.xml file).  This servlet supports the following initialization   -->
    <!-- parameters (default values are in square brackets):                  -->
    <servlet>
        <servlet-name>default</servlet-name>
        <servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
        <init-param>
            <param-name>debug</param-name>
            <param-value>0</param-value>
        </init-param>
        <init-param>
            <param-name>listings</param-name>
            <param-value>false</param-value>
        </init-param>
        <load-on-startup>1</load-on-startup>
    </servlet>
</web-app>

Code:

# Load the XML document
$xmlDoc = New-Object System.Xml.XmlDocument
$xmlPath = "C:\path\to\web.xml"
$xmlDoc.Load($xmlPath)

# Create an XmlWriterSettings object with specified settings
$settings = New-Object System.Xml.XmlWriterSettings
$settings.Indent = $true
$settings.IndentChars = " "
$settings.NewLineChars = [Environment]::NewLine
$settings.NewLineHandling = [System.Xml.NewLineHandling]::Replace
$settings.OmitXmlDeclaration = $true
$settings.Encoding = New-Object System.Text.UTF8Encoding($false)

# Save the XML document with the specified settings
$writer = [System.Xml.XmlWriter]::Create($xmlPath, $settings)
$xmlDoc.Save($writer)
$writer.Close()

Result:

<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee&#xD;&#xA;                      http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd" version="4.0">
 <!-- ======================== Introduction ============================== -->
 <!-- This document defines default values for *all* web applications      -->
 <!-- loaded into this instance of Tomcat.  As each application is         -->
 <!-- deployed, this file is processed, followed by the                    -->
 <!-- "/WEB-INF/web.xml" deployment descriptor from your own               -->
 <!-- applications.                                                        -->
 <!--                                                                      -->
 <!-- WARNING:  Do not configure application-specific resources here!      -->
 <!-- They should go in the "/WEB-INF/web.xml" file in your application.   -->
 <!-- ================== Built In Servlet Definitions ==================== -->
 <!-- The default servlet for all web applications, that serves static     -->
 <!-- resources.  It processes all requests that are not mapped to other   -->
 <!-- servlets with servlet mappings (defined either here or in your own   -->
 <!-- web.xml file).  This servlet supports the following initialization   -->
 <!-- parameters (default values are in square brackets):                  -->
 <servlet>
  <servlet-name>default</servlet-name>
  <servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
  <init-param>
   <param-name>debug</param-name>
   <param-value>0</param-value>
  </init-param>
  <init-param>
   <param-name>listings</param-name>
   <param-value>false</param-value>
  </init-param>
  <load-on-startup>1</load-on-startup>
 </servlet>
</web-app>

I'm stumped. Any help would be greatly appreciated!

.net xml powershell formatting
3 Answers

2 votes

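The opt-in feature for preserving insignificant whitespace in System.Xml.XmlDocument ([xml] in PowerShell) and System.Xml.Linq.XDocument [1]:

  • applies to whitespace between elements.
  • does not apply to whitespace inside an element's opening tag, i.e. not to whitespace between attributes.

Therefore, a multi-line opening tag such as: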
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
                      http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd"
  version="4.0">

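invariably turns into a single-line opening tag, in which:

  • attributes are separated by a single space each
  • newlines in literal attribute values are converted as follows:
    • with [xml]: to their escaped form (character references), namely &#xD;&#xA; if the input file has Windows-format CRLF newlines, or &#xA; if it has Unix-format LF newlines
    • with [System.Xml.Linq.XDocument]: to a single space each, in line with the W3C XML Recommendation

With [xml], the tag above becomes:
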
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee&#xD;&#xA;                      http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd" version="4.0">
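
The difference between the two kinds of whitespace can be checked in isolation. The following is a minimal sketch on a made-up four-line document (the `root` and `child` names are placeholders, not taken from web.xml); it only inspects the in-memory serialization and writes no file.

```powershell
# Hypothetical sample: a multi-line opening tag plus indented child content.
$src = @(
  '<root a="1"'
  '      b="2">'
  '    <child>x</child>'
  '</root>'
) -join "`n"

$doc = [System.Xml.XmlDocument]::new()
$doc.PreserveWhitespace = $true   # must be set before loading
$doc.LoadXml($src)

# Whitespace between elements survives; whitespace between attributes does not.
$out = $doc.OuterXml
$out
```

The opening tag should come back on one line, while the indentation in front of `<child>` is kept.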

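In a pinch, you can perform your own plain-text post-processing. Needless to say, this is brittle, possibly document-specific, and only an attempt to recreate the original whitespace, which assumes that its format is known.

That said, for a specific document format such as yours, it may work (using [xml] == System.Xml.XmlDocument):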
# Note: Be sure to use a *full path*, because .NET's working dir.
#       usually differs from PowerShell's.
$xmlPath = "C:\path\to\web.xml"

# Load the document, with insignificant whitespace preserved.
($webXml = [xml]::new()).PreserveWhitespace = $true
$webXml.Load($xmlPath)

# ... modify it

# ... and save it.
$webXml.Save($xmlPath)

# Post-processing:
# "Re-pretty" the <web-app> element.
# Note: Be sure to match the actual encoding of the file.
$nl = [Environment]::NewLine
(Get-Content -Encoding utf8 $xmlPath) |
  ForEach-Object {
    if ($_ -match '^<web-app ') {
      $_ -replace '(?<=" )', "$nl  " -replace '(&#xD;)?&#xA;', $nl
    } else {
      $_
    }
  } |
  Set-Content -Encoding utf8 $xmlPath

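Notes:
  • The post-processing assumes that the input file uses the platform-native newline format, and the resulting file will use that format.
    • Even if this assumption doesn't hold, it should usually not be a problem; preserving the input file's original newline format is possible, but requires more work.
  • In Windows PowerShell (unlike in PowerShell (Core) 7+), Set-Content -Encoding utf8 invariably creates a UTF-8 file with a BOM.
    • This should not be a problem for standards-conforming XML processors, but if it is, see this answer for how to create BOM-less UTF-8 in Windows PowerShell.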
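
One commonly used way to get BOM-less UTF-8 output in Windows PowerShell is to bypass Set-Content and call .NET directly. The sketch below is an assumption about what such a workaround might look like, not necessarily what the answer referred to above proposes; the temp-file name and the two sample lines are placeholders.

```powershell
# Hypothetical target file and content, for illustration only.
$path  = Join-Path ([System.IO.Path]::GetTempPath()) 'bomless-sketch.xml'
$lines = '<?xml version="1.0" encoding="UTF-8"?>', '<root/>'

# UTF8Encoding($false) means: do not emit a byte-order mark.
$utf8NoBom = [System.Text.UTF8Encoding]::new($false)
[System.IO.File]::WriteAllLines($path, [string[]]$lines, $utf8NoBom)

# The file should start with '<' (0x3C) rather than the BOM bytes EF BB BF.
$firstBytes = [System.IO.File]::ReadAllBytes($path)[0..2]
```

In the post-processing pipeline above, this would mean collecting the processed lines into a variable and passing them to WriteAllLines instead of piping them to Set-Content.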

[1] [xml] example:
$xml = [xml]::new(); $xml.PreserveWhitespace = $true; $xml.LoadXml('<foo>    <bar/>  </foo>')

[XDocument] example:
$xml = [System.Xml.Linq.XDocument]::Parse('<foo>    <bar/>  </foo>', 'PreserveWhitespace')


0 votes

The setting you want is:

$settings.NewLineOnAttributes = $true

See this dotnetfiddle, which does the same thing in C#.
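
For reference, here is a minimal PowerShell sketch of that setting, writing to a string instead of a file. The tiny `root` document is made up for illustration. Note that NewLineOnAttributes only takes effect when Indent is also true, and that it puts every attribute, including the first, on its own line, so it likely won't reproduce the original web.xml layout exactly.

```powershell
# Hypothetical sample document.
$doc = [System.Xml.XmlDocument]::new()
$doc.LoadXml('<root a="1" b="2"><child>x</child></root>')

$settings = [System.Xml.XmlWriterSettings]::new()
$settings.Indent = $true                # required for NewLineOnAttributes to apply
$settings.NewLineOnAttributes = $true   # one attribute per line
$settings.OmitXmlDeclaration = $true    # keep the sample output short

# Write to an in-memory string so no file is touched.
$sw = [System.IO.StringWriter]::new()
$writer = [System.Xml.XmlWriter]::Create($sw, $settings)
$doc.Save($writer)
$writer.Close()

$text = $sw.ToString()
$text
```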


0 votes

After more experimentation, I went with the solution mklement0 provided and made a few adjustments:

$xmlPath = "C:\path\to\web.xml"
$newLine = [Environment]::NewLine
$xmlnsPattern = '\s+xmlns\s*=\s*""\s*'

(Get-Content -Path $xmlPath -Encoding utf8NoBOM) | ForEach-Object {
    if ($_ -match '^<web-app ') {
        $_ -replace '\s(?<=" )', "$newLine  " `
           -replace '(&#xD;)?&#xA;', $newLine `
           -replace 'version="4.0">', "version=`"4.0`">$newLine"
    } elseif ($_ -match $xmlnsPattern) {
        $_ -replace $xmlnsPattern, ''
    } else {
        $_
    }
} | Set-Content -Path $xmlPath -Encoding utf8NoBOM -Force

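Apart from some minor cosmetic changes, my adjustments remove the single whitespace character left behind by the post-processing. A newline is inserted after the opening <web-app> tag to keep the file readable once new element nodes have been inserted. The last change removes the annoying xmlns="" that shows up in the opening tag of new elements. This happens because I use the ImportNode method to add new element nodes from an XML fragment. AFAIK, there is no way to include the document namespace in an imported fragment. (If there is, I'd love to know!) I could include the namespace information in the fragment, but that's ugly and takes more lines of code.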
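
One possible alternative, sketched here as an assumption rather than as the approach used in this answer: when elements are created with CreateElement and given the document element's namespace URI, they belong to the default namespace and no xmlns="" should be emitted. The `filter` and `filter-name` element names and the `example` value are purely illustrative.

```powershell
# Hypothetical, stripped-down stand-in for web.xml.
$doc = [System.Xml.XmlDocument]::new()
$doc.LoadXml('<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"></web-app>')

# Reuse the namespace of the root element for every new element.
$ns = $doc.DocumentElement.NamespaceURI

$filter = $doc.CreateElement('filter', $ns)
$filterName = $doc.CreateElement('filter-name', $ns)
$filterName.InnerText = 'example'

[void]$filter.AppendChild($filterName)
[void]$doc.DocumentElement.AppendChild($filter)

$result = $doc.OuterXml
$result
```

The trade-off is the one already mentioned: building nodes one by one takes more code than importing a ready-made fragment.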

Quick note: the utf8NoBOM value for the -Encoding parameter of Get-Content is not available in PS 5.1.
    
