C# XmlSerializer - output multiple xml fragments while controlling newlines


I want to be able to write xml fragments without namespaces, without the XML preamble, and so on.

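XmlSerializer.Serialize() produces indented output when serializing to a generic output stream, but it uses "\r\n" for line endings and I cannot find how to configure that.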

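You can serialize to an XmlWriter, which can be configured in detail, but that configuration only seems to take effect when you output a complete, "valid" xml document.

I created an XmlWriter like this: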

XmlWriter xw = XmlWriter.Create(System.Console.Out, new XmlWriterSettings()
{
    ConformanceLevel = ConformanceLevel.Fragment,
    NamespaceHandling = NamespaceHandling.Default,
    NewLineChars = "\n",
    Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false), // suppress BOM
    Indent = true,
    NewLineHandling = NewLineHandling.Replace,
    OmitXmlDeclaration = true,
    WriteEndDocumentOnClose = false,
    CloseOutput = false,
    CheckCharacters = false,
    NewLineOnAttributes = false,
});

MVE: https://dotnetfiddle.net/qGLIlL

It does allow me to serialize multiple objects without creating a new XmlWriter for each one, but there is no indentation:

<flyingMonkey name="Koko"><limbs><limb name="leg" /><limb name="arm" /><limb name="tail" /><limb name="wing" /></limbs></flyingMonkey>

<flyingMonkey name="New Name"><limbs><limb name="leg" /><limb name="arm" /><limb name="tail" /><limb name="wing" /></limbs></flyingMonkey>

using System.Text;
using System.Xml;
using System.Xml.Serialization;

public class Program
{
    public static void Main()
    {
        XmlWriter xw = XmlWriter.Create(System.Console.Out, new XmlWriterSettings()
        {
            ConformanceLevel = ConformanceLevel.Fragment,
            NamespaceHandling = NamespaceHandling.Default,
            NewLineChars = "\n",
            Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false), // suppress BOM
            Indent = true,
            NewLineHandling = NewLineHandling.Replace,
            OmitXmlDeclaration = true,
            WriteEndDocumentOnClose = false,
            CloseOutput = false,
            CheckCharacters = false,
            NewLineOnAttributes = false,
        });
        var noNamespace = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });

        // without this line I get an error:
        //   "WriteStartDocument cannot be called on writers created with ConformanceLevel.Fragment."
        xw.WriteWhitespace("");

        FlyingMonkey monkey = FlyingMonkey.Create();
        XmlSerializer ser = new XmlSerializer(typeof(FlyingMonkey), defaultNamespace: null);
        ser.Serialize(xw, monkey, noNamespace);
        xw.WriteWhitespace("\n\n");
        monkey.name = "New Name";
        ser.Serialize(xw, monkey, noNamespace);
    }
}

[System.Xml.Serialization.XmlTypeAttribute(TypeName = "flyingMonkey", Namespace=null)]
public class FlyingMonkey
{
    [System.Xml.Serialization.XmlAttributeAttribute()]
    public string name;

    public Limb[] limbs;

    public static FlyingMonkey Create() =>
        new FlyingMonkey()
        {
            name = "Koko",
            limbs = new Limb[]
            {
                new Limb() { name = "leg" }, new Limb() { name = "arm" },
                new Limb() { name = "tail" }, new Limb() { name = "wing" },
            }
        };
}

[System.Xml.Serialization.XmlTypeAttribute(TypeName = "limb", Namespace=null)]
public class Limb
{
    [System.Xml.Serialization.XmlAttributeAttribute()]
    public string name;
}

What does work:

XmlWriter xw = XmlWriter.Create(System.Console.Out, new XmlWriterSettings()
{
    ConformanceLevel = ConformanceLevel.Auto,
    NamespaceHandling = NamespaceHandling.Default,
    NewLineChars = "\n",
    Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false), // suppress BOM
    Indent = true,
    NewLineHandling = NewLineHandling.Replace,
    OmitXmlDeclaration = true,
    WriteEndDocumentOnClose = false,
    CloseOutput = false,
    CheckCharacters = false,
    NewLineOnAttributes = false,
});
var noNamespace = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });

// This is not needed anymore. If I invoke that, it will kill indentation for some reason.
// xw.WriteWhitespace("");

FlyingMonkey monkey = FlyingMonkey.Create();
XmlSerializer ser = new XmlSerializer(typeof(FlyingMonkey), defaultNamespace: null);
ser.Serialize(xw, monkey, noNamespace);
// xw.WriteWhitespace("\n\n");
// monkey.name = "New Name";
// ser.Serialize(xw, monkey, noNamespace); // this second serialization throws InvalidOperationException

This does print with the correct line endings, but it does not allow you to write another object to the same XmlWriter instance.

<flyingMonkey name="Koko">
  <limbs>
    <limb name="leg" />
    <limb name="arm" />
    <limb name="tail" />
    <limb name="wing" />
  </limbs>
</flyingMonkey>

c# xml serialization xml-serialization xmlserializer

1 Answer

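Your basic problem is that XmlSerializer is designed to serialize to a single well-formed XML document, not to a sequence of fragments. If you try to serialize to an XmlWriter created with ConformanceLevel.Fragment, you will get the exception

WriteStartDocument cannot be called on writers created with ConformanceLevel.Fragment.

In this answer to .net XmlSerialize throws "WriteStartDocument cannot be called on writers created with ConformanceLevel.Fragment", Wim Reymen identifies two workarounds: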

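  • Call XmlWriter.WriteWhitespace("") before the first call to Serialize().

    Unfortunately, as you noticed, this disables indentation. This happens because XmlWriter disables indentation for mixed content, and the call to write whitespace triggers the mixed-content detection of XmlEncodedRawTextWriterIndent (demo here).

  • Call XmlWriter.WriteComment("") before the first serialization.

    While this does not disable indentation, it does of course write out a comment that you do not want.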
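
To see the difference between the two workarounds side by side, here is a minimal sketch. The `Item` class and the `PrimingDemo.Render` helper are hypothetical names made up for this illustration, and the sketch writes to a StringWriter rather than to the console; it only assumes the behavior described in the two bullets above.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical type used only for this illustration.
public class Item
{
    [XmlAttribute] public string name = "";
    public string[] parts = Array.Empty<string>();
}

public static class PrimingDemo
{
    // Creates a fragment-level writer, lets the caller "prime" it so that
    // XmlSerializer does not call WriteStartDocument(), then serializes one Item.
    public static string Render(Action<XmlWriter> prime)
    {
        var settings = new XmlWriterSettings
        {
            ConformanceLevel = ConformanceLevel.Fragment,
            Indent = true,
            NewLineChars = "\n",
            OmitXmlDeclaration = true,
        };
        var ns = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
        var serializer = new XmlSerializer(typeof(Item));
        var item = new Item { name = "Koko", parts = new[] { "leg", "arm" } };

        var text = new StringWriter();
        using (var writer = XmlWriter.Create(text, settings))
        {
            prime(writer);
            serializer.Serialize(writer, item, ns);
        }
        return text.ToString();
    }

    public static void Main()
    {
        // Workaround 1: no exception, but the indentation is lost.
        Console.WriteLine(Render(w => w.WriteWhitespace("")));
        Console.WriteLine();
        // Workaround 2: indentation is kept, but an empty comment is emitted first.
        Console.WriteLine(Render(w => w.WriteComment("")));
    }
}
```

Passing the priming call in as a delegate keeps everything else identical, so any difference in the two printed results comes from the priming call alone.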

So what workarounds are available to you?

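Firstly, as you noticed, you could create a separate XmlWriter for each item, using CloseOutput = false. In comments you wrote, "Doing that adds a lot of overhead, I need to write up to 100k elements, and therefore would like to reuse the writer instance", but I recommend you profile to make sure, because this workaround is very, very simple compared to the alternative.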
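
One rough way to check the overhead before ruling this approach out is to time it. The following is only a sketch: `Sample` and `WriterPerItemTiming` are hypothetical names, the item is far smaller than a real payload, and a single Stopwatch run is not a substitute for a proper benchmark.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical type used only for this illustration.
public class Sample
{
    [XmlAttribute] public string name = "";
}

public static class WriterPerItemTiming
{
    // Serializes the same item `count` times, creating a new XmlWriter for
    // every item, and returns the elapsed time in milliseconds.
    public static long WriteMany(int count, Stream stream)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            NewLineChars = "\n",
            OmitXmlDeclaration = true,
            Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false),
            CloseOutput = false, // keep the stream open between items
        };
        var serializer = new XmlSerializer(typeof(Sample));
        var ns = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
        var item = new Sample { name = "Koko" };

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            // The writer is disposed (and flushed) at the end of each iteration.
            using var writer = XmlWriter.Create(stream, settings);
            serializer.Serialize(writer, item, ns);
        }
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Stream.Null discards the output so that only serialization cost is measured.
        long elapsed = WriteMany(100_000, Stream.Null);
        Console.WriteLine($"100,000 items, one XmlWriter each: {elapsed} ms");
    }
}
```

The XmlSerializer is constructed once and reused; only the XmlWriter is recreated per item, which is the cost in question.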

Assuming you are writing to a Stream, you could create an extension method like this:

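using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public static partial class XmlExtensions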
{
    static Encoding Utf8EncodingNoBom { get; } = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
    
    public static void SerializeFragmentsToXml<T>(this IEnumerable<T> enumerable, Stream stream, XmlSerializer? serializer = null, XmlSerializerNamespaces? ns = null)
    {
        var newLine = "\n";
        var newLineBytes = Utf8EncodingNoBom.GetBytes(newLine+newLine);
        
        var settings = new XmlWriterSettings()
        {
            NamespaceHandling = NamespaceHandling.Default,
            NewLineChars = newLine,
            Encoding = Utf8EncodingNoBom, // suppress BOM
            Indent = true,
            NewLineHandling = NewLineHandling.Replace,
            OmitXmlDeclaration = true,
            WriteEndDocumentOnClose = false,
            CloseOutput = false, // Required to prevent the stream from being closed between items
            CheckCharacters = false,
            NewLineOnAttributes = false,
        };
        
        serializer ??= new XmlSerializer(typeof(T));
        ns ??= new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });

        bool first = true;
        foreach (var item in enumerable)
        {
            if (!first)
                stream.Write(newLineBytes);
            using (var xmlWriter = XmlWriter.Create(stream, settings))
                serializer.Serialize(xmlWriter, item, ns);
            first = false;
        }
    }
}

And use it, for example, as follows:

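// Requires using System.Linq. Assumes FlyingMonkey.Create() has been given an overload
// that takes the name; the version in the question takes no arguments.
var items = new [] { "Koko", "POCO", "Loco" }.Select(n => FlyingMonkey.Create(n));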

using var stream = new MemoryStream(); // Replace with some FileStream when serializing to disk

var serializer = new XmlSerializer(typeof(FlyingMonkey), defaultNamespace: null);
items.SerializeFragmentsToXml(stream, serializer : serializer);

Demo fiddle #1 here.

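Alternatively, if you really do need to reuse the XmlWriter for performance reasons, you will need to call XmlWriter.WriteComment() to prevent the exception from XmlSerializer, and then edit out the unwanted comments afterwards, e.g. via some TextWriter decorator that removes them on the fly as they are written.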

The following extension methods seem to do the job:

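using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis; // [AllowNull]
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using System.Xml;
using System.Xml.Serialization;

public static partial class XmlExtensions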
{
    const string FirstCommentText = "first";
    const string FirstComment = $"<!--{FirstCommentText}-->";
    const string SubsequentCommentText = "subsequent";
    const string SubsequentComment = $"<!--{SubsequentCommentText}-->";
    
    static Encoding Utf8EncodingNoBom { get; } = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
    
    public static void SerializeFragmentsToXml<T>(this IEnumerable<T> enumerable, Stream stream, XmlSerializer? serializer = null, XmlSerializerNamespaces? ns = null)
    {
        string newLine = "\n";
        
        var settings = new XmlWriterSettings()
        {
            ConformanceLevel = ConformanceLevel.Fragment,
            NamespaceHandling = NamespaceHandling.Default,
            NewLineChars = newLine,
            Encoding = Utf8EncodingNoBom, // suppress BOM
            Indent = true,
            NewLineHandling = NewLineHandling.Replace,
            OmitXmlDeclaration = true,
            WriteEndDocumentOnClose = false,
            CloseOutput = false, 
            CheckCharacters = false,
            NewLineOnAttributes = false,
        };
        
        serializer ??= new XmlSerializer(typeof(T));
        ns ??= new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });

        using var innerTextWriter = new StreamWriter(stream, encoding : Utf8EncodingNoBom, leaveOpen  : true) { NewLine = newLine };
        using var textWriter = new FakeCommentRemovingTextWriter(innerTextWriter, new(FirstComment, ""), new(SubsequentComment, newLine)) { NewLine = newLine };
        using var xmlWriter = XmlWriter.Create(textWriter, settings);

        bool first = true;
        foreach (var item in enumerable)
        {
            xmlWriter.WriteComment(first ? FirstCommentText : SubsequentCommentText);
            serializer.Serialize(xmlWriter, item, ns);
            // XmlWriter buffers its output, so Flush() is required to ensure that the fake comments are not split across calls to Write().
            xmlWriter.Flush(); 
            first = false;
        }
    }
    
    private class FakeCommentRemovingTextWriter : TextWriterDecorator
    {
        readonly KeyValuePair<string, string> [] replacements;
        
        public FakeCommentRemovingTextWriter(TextWriter baseWriter, params KeyValuePair<string, string> [] replacements) : base(baseWriter, true) => this.replacements = replacements;
        
        public override void Write(ReadOnlySpan<char> buffer)
        {
            foreach (var replacement in replacements)
            {
                int index;
                if ((index = buffer.IndexOf(replacement.Key)) >= 0 && buffer.Slice(0, index).IsWhiteSpace())
                {
                    if (index > 0)
                        base.Write(buffer.Slice(0, index));
                    buffer = buffer.Slice(index).Slice(replacement.Key.Length);
                    if (buffer.StartsWith(NewLine))
                        buffer = buffer.Slice(NewLine.Length);
                    if (!string.IsNullOrEmpty(replacement.Value))
                        base.Write(replacement.Value);
                }
            }
            base.Write(buffer);
        }
    }
}

public class TextWriterDecorator : TextWriter
{
    // Override the same methods that are overridden in https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/IO/StringWriter.cs.
    TextWriter? baseWriter; // null when disposed
    readonly bool disposeBase;
    readonly Encoding baseEncoding;

    public TextWriterDecorator(TextWriter baseWriter, bool disposeBase = true) => 
        (this.baseWriter, this.disposeBase, this.baseEncoding) = (baseWriter ?? throw new ArgumentNullException(nameof(baseWriter)), disposeBase, baseWriter.Encoding);

    protected TextWriter BaseWriter => baseWriter == null ? throw new ObjectDisposedException(GetType().Name) : baseWriter;
    public override Encoding Encoding => baseEncoding;
    public override IFormatProvider FormatProvider => baseWriter?.FormatProvider ?? base.FormatProvider;
    [AllowNull] public override string NewLine 
    { 
        get => baseWriter?.NewLine ?? base.NewLine; 
        set
        {   
            if (baseWriter != null)
                baseWriter.NewLine = value;
            base.NewLine = value;
        }
    }

    public override void Flush() => BaseWriter.Flush();
    public sealed override void Close() => Dispose(true);
    public override void Write(char value) => BaseWriter.Write(value);
    public sealed override void Write(char[] buffer, int index, int count) => this.Write(buffer.AsSpan(index, count));
    public override void Write(ReadOnlySpan<char> buffer) => BaseWriter.Write(buffer);
    public sealed override void Write(string? value) => Write((value ?? string.Empty).AsSpan());

    public override Task WriteAsync(char value) => BaseWriter.WriteAsync(value);
    public sealed override Task WriteAsync(string? value) => WriteAsync(value.AsMemory());
    public sealed override Task WriteAsync(char[] buffer, int index, int count) => WriteAsync(buffer.AsMemory(index, count));
    public override Task WriteAsync(ReadOnlyMemory<char> buffer, CancellationToken cancellationToken = default) => BaseWriter.WriteAsync(buffer, cancellationToken);
    //public virtual Task WriteAsync(StringBuilder? value, CancellationToken cancellationToken = default) - no need to override

    public override Task WriteLineAsync(char value) => BaseWriter.WriteLineAsync(value);
    public sealed override Task WriteLineAsync(string? value) => WriteLineAsync(value.AsMemory());
    public override Task WriteLineAsync(StringBuilder? value, CancellationToken cancellationToken = default) => BaseWriter.WriteLineAsync(value, cancellationToken);
    public sealed override Task WriteLineAsync(char[] buffer, int index, int count) => WriteLineAsync(buffer.AsMemory(index, count));
    public override Task WriteLineAsync(ReadOnlyMemory<char> buffer, CancellationToken cancellationToken = default) => BaseWriter.WriteLineAsync(buffer, cancellationToken);
    
    public override Task FlushAsync() => BaseWriter.FlushAsync();
    public override Task FlushAsync(CancellationToken cancellationToken) => BaseWriter.FlushAsync(cancellationToken);

    protected override void Dispose(bool disposing)
    {
        try
        {
            if (disposing)
            {
                if (Interlocked.Exchange(ref this.baseWriter, null) is {} writer)
                    if (disposeBase)
                        writer.Dispose();
                    else
                        writer.Flush();
            }
        }
        finally
        {
            base.Dispose(disposing);
        }
    }

    public override async ValueTask DisposeAsync()
    {
        try
        {
            if (Interlocked.Exchange(ref this.baseWriter, null) is {} writer)
                if (disposeBase)
                    await writer.DisposeAsync().ConfigureAwait(false);
                else
                    await writer.FlushAsync().ConfigureAwait(false);
        }
        finally
        {
            await base.DisposeAsync().ConfigureAwait(false);
        }
    }
    
    public override string ToString() => string.Format("{0}: {1}", GetType().Name, baseWriter?.ToString() ?? "disposed");
}

But honestly, I doubt it is worth the trouble. Demo fiddle #2 here.

With either approach, the output looks like this:

<flyingMonkey name="Koko">
  <limbs>
    <limb name="leg" />
    <limb name="arm" />
    <limb name="tail" />
    <limb name="wing" />
  </limbs>
</flyingMonkey>

<flyingMonkey name="POCO">
  <limbs>
    <limb name="leg" />
    <limb name="arm" />
    <limb name="tail" />
    <limb name="wing" />
  </limbs>
</flyingMonkey>

<flyingMonkey name="Loco">
  <limbs>
    <limb name="leg" />
    <limb name="arm" />
    <limb name="tail" />
    <limb name="wing" />
  </limbs>
</flyingMonkey>