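I want to be able to write XML fragments with no namespaces, no XML preamble, and so on.
XmlSerializer.Serialize() produces indented output when serializing to a generic output stream, but it uses "\r\n" for line endings, and I can't find a way to configure that.
You can serialize to an XmlWriter, which can be configured in great detail, but that configuration only seems to take effect when you output a complete, "valid" XML document.
I created an XmlWriter like this: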
XmlWriter xw = XmlWriter.Create(System.Console.Out, new XmlWriterSettings()
{
    ConformanceLevel = ConformanceLevel.Fragment,
    NamespaceHandling = NamespaceHandling.Default,
    NewLineChars = "\n",
    Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false), // suppress BOM
    Indent = true,
    NewLineHandling = NewLineHandling.Replace,
    OmitXmlDeclaration = true,
    WriteEndDocumentOnClose = false,
    CloseOutput = false,
    CheckCharacters = false,
    NewLineOnAttributes = false,
});
Fiddle: https://dotnetfiddle.net/qGLIlL. This does let me serialize multiple objects without creating a new XmlWriter for each one, but the output is not indented:
<flyingMonkey name="Koko"><limbs><limb name="leg" /><limb name="arm" /><limb name="tail" /><limb name="wing" /></limbs></flyingMonkey>
<flyingMonkey name="New Name"><limbs><limb name="leg" /><limb name="arm" /><limb name="tail" /><limb name="wing" /></limbs></flyingMonkey>
The full program:
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public class Program
{
    public static void Main()
    {
        XmlWriter xw = XmlWriter.Create(System.Console.Out, new XmlWriterSettings()
        {
            ConformanceLevel = ConformanceLevel.Fragment,
            NamespaceHandling = NamespaceHandling.Default,
            NewLineChars = "\n",
            Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false), // suppress BOM
            Indent = true,
            NewLineHandling = NewLineHandling.Replace,
            OmitXmlDeclaration = true,
            WriteEndDocumentOnClose = false,
            CloseOutput = false,
            CheckCharacters = false,
            NewLineOnAttributes = false,
        });
        var noNamespace = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
        // without this line I get an error:
        //   "WriteStartDocument cannot be called on writers created with ConformanceLevel.Fragment."
        xw.WriteWhitespace("");
        FlyingMonkey monkey = FlyingMonkey.Create();
        XmlSerializer ser = new XmlSerializer(typeof(FlyingMonkey), defaultNamespace: null);
        ser.Serialize(xw, monkey, noNamespace);
        xw.WriteWhitespace("\n\n");
        monkey.name = "New Name";
        ser.Serialize(xw, monkey, noNamespace);
    }
}

[System.Xml.Serialization.XmlTypeAttribute(TypeName = "flyingMonkey", Namespace = null)]
public class FlyingMonkey
{
    [System.Xml.Serialization.XmlAttributeAttribute()]
    public string name;
    public Limb[] limbs;

    public static FlyingMonkey Create() =>
        new FlyingMonkey()
        {
            name = "Koko",
            limbs = new Limb[]
            {
                new Limb() { name = "leg" }, new Limb() { name = "arm" },
                new Limb() { name = "tail" }, new Limb() { name = "wing" },
            }
        };
}

[System.Xml.Serialization.XmlTypeAttribute(TypeName = "limb", Namespace = null)]
public class Limb
{
    [System.Xml.Serialization.XmlAttributeAttribute()]
    public string name;
}
If I use ConformanceLevel.Auto instead:
XmlWriter xw = XmlWriter.Create(System.Console.Out, new XmlWriterSettings()
{
    ConformanceLevel = ConformanceLevel.Auto,
    NamespaceHandling = NamespaceHandling.Default,
    NewLineChars = "\n",
    Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false), // suppress BOM
    Indent = true,
    NewLineHandling = NewLineHandling.Replace,
    OmitXmlDeclaration = true,
    WriteEndDocumentOnClose = false,
    CloseOutput = false,
    CheckCharacters = false,
    NewLineOnAttributes = false,
});
var noNamespace = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
// This is not needed anymore. If I invoke that, it will kill indentation for some reason.
// xw.WriteWhitespace("");
FlyingMonkey monkey = FlyingMonkey.Create();
XmlSerializer ser = new XmlSerializer(typeof(FlyingMonkey), defaultNamespace: null);
ser.Serialize(xw, monkey, noNamespace);
// xw.WriteWhitespace("\n\n");
// monkey.name = "New Name";
// ser.Serialize(xw, monkey, noNamespace); // this second serialization throws InvalidOperationException
It does print with the correct line endings, but it does not let you write another object to the same XmlWriter instance.
<flyingMonkey name="Koko">
  <limbs>
    <limb name="leg" />
    <limb name="arm" />
    <limb name="tail" />
    <limb name="wing" />
  </limbs>
</flyingMonkey>
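Your basic problem is that XmlSerializer is designed to serialize to a single well-formed XML document, not to a sequence of fragments. If you try to serialize to an XmlWriter created with ConformanceLevel.Fragment, you get the exception
WriteStartDocument cannot be called on writers created with ConformanceLevel.Fragment.
In this answer to .net XmlSerialize throws "WriteStartDocument cannot be called on writers created with ConformanceLevel.Fragment", Wim Reymen identifies two workarounds:
1. Call XmlWriter.WriteWhitespace("") before the first call to Serialize().
   Unfortunately, as you noticed, this disables indentation. This happens because XmlWriter disables indentation for mixed content, and writing whitespace triggers the mixed-content detection in XmlEncodedRawTextWriterIndent (demo here).
2. Call XmlWriter.WriteComment("") before the first call to Serialize().
   This does not disable indentation, but it does of course write a comment that you do not want.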
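The difference between the two workarounds can be reproduced without XmlSerializer at all. The following is a minimal sketch, not taken from the linked answer: the names MixedContentDemo and Render are made up for illustration, and it writes a tiny hand-built element tree to a fragment-level, indenting XmlWriter after first emitting either an empty whitespace node or an empty comment.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class MixedContentDemo
{
    // Writes <a><b /></a> to an indenting XmlWriter created with
    // ConformanceLevel.Fragment, preceded by either WriteWhitespace("")
    // or WriteComment(""), and returns the resulting text.
    public static string Render(bool useWhitespace)
    {
        var settings = new XmlWriterSettings
        {
            ConformanceLevel = ConformanceLevel.Fragment,
            Indent = true,
            NewLineChars = "\n",
            OmitXmlDeclaration = true,
        };
        using var text = new StringWriter();
        using (var writer = XmlWriter.Create(text, settings))
        {
            if (useWhitespace)
                writer.WriteWhitespace(""); // whitespace node: treated as mixed content
            else
                writer.WriteComment("");    // comment node: indentation stays enabled
            writer.WriteStartElement("a");
            writer.WriteStartElement("b");
            writer.WriteEndElement();
            writer.WriteEndElement();
        } // disposing the XmlWriter flushes it into the StringWriter
        return text.ToString();
    }
}
```

Calling Render(useWhitespace: true) should give the element tree on a single line, while Render(useWhitespace: false) should give indented elements preceded by the empty comment, which is the trade-off described in the two items above.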
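So what workarounds do you have?
First, as you noticed, you can create a separate XmlWriter for each item, with CloseOutput = false. In comments you wrote that doing so adds a lot of overhead, and that you need to write up to 100k elements and would therefore like to reuse the writer instance. I recommend that you profile to make sure, because this workaround is very, very simple compared to the alternative.
Assuming you are writing to a Stream, you could create an extension method like this: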
// Requires: using System.Collections.Generic; using System.IO; using System.Text; using System.Xml; using System.Xml.Serialization;
public static partial class XmlExtensions
{
    static Encoding Utf8EncodingNoBom { get; } = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);

    public static void SerializeFragmentsToXml<T>(this IEnumerable<T> enumerable, Stream stream, XmlSerializer? serializer = null, XmlSerializerNamespaces? ns = null)
    {
        var newLine = "\n";
        // Separator written directly to the stream between items (one blank line).
        var newLineBytes = Utf8EncodingNoBom.GetBytes(newLine + newLine);
        var settings = new XmlWriterSettings()
        {
            NamespaceHandling = NamespaceHandling.Default,
            NewLineChars = newLine,
            Encoding = Utf8EncodingNoBom, // suppress BOM
            Indent = true,
            NewLineHandling = NewLineHandling.Replace,
            OmitXmlDeclaration = true,
            WriteEndDocumentOnClose = false,
            CloseOutput = false, // Required to prevent the stream from being closed between items
            CheckCharacters = false,
            NewLineOnAttributes = false,
        };
        serializer ??= new XmlSerializer(typeof(T));
        ns ??= new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
        bool first = true;
        foreach (var item in enumerable)
        {
            if (!first)
                stream.Write(newLineBytes);
            // One short-lived XmlWriter per item, so each item is a complete document as far as XmlSerializer is concerned.
            using (var xmlWriter = XmlWriter.Create(stream, settings))
                serializer.Serialize(xmlWriter, item, ns);
            first = false;
        }
    }
}
And use it, for example, as follows:
// Assumes a FlyingMonkey.Create(string name) overload that sets the name; the Create() shown in the question takes no arguments.
var items = new[] { "Koko", "POCO", "Loco" }.Select(n => FlyingMonkey.Create(n));
using var stream = new MemoryStream(); // Replace with some FileStream when serializing to disk
var serializer = new XmlSerializer(typeof(FlyingMonkey), defaultNamespace: null);
items.SerializeFragmentsToXml(stream, serializer: serializer);
Demo fiddle #1 here.
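To check whether creating one XmlWriter per item is really too expensive, a rough timing loop may be enough before reaching for a proper benchmarking tool. The sketch below is an assumption-laden illustration rather than a measurement: ProbeItem, WriterOverheadProbe and Measure are hypothetical names, the item type is a stand-in for your real element type, and the loop writes to a MemoryStream so that disk speed does not distort the result.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the real element type.
public class ProbeItem
{
    [XmlAttribute] public string name = "";
}

// Hypothetical helper, for illustration only.
public static class WriterOverheadProbe
{
    // Serializes `count` items, creating and disposing one XmlWriter per item,
    // and returns the elapsed time together with the number of bytes written.
    public static (TimeSpan Elapsed, long Bytes) Measure(int count)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            NewLineChars = "\n",
            OmitXmlDeclaration = true,
            CloseOutput = false, // keep the stream open between items
            Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false),
        };
        // The serializer is constructed outside the timed region on purpose.
        var serializer = new XmlSerializer(typeof(ProbeItem));
        var ns = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
        var item = new ProbeItem { name = "Koko" };

        using var stream = new MemoryStream();
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            using (var writer = XmlWriter.Create(stream, settings))
                serializer.Serialize(writer, item, ns);
        }
        stopwatch.Stop();
        return (stopwatch.Elapsed, stream.Length);
    }
}
```

A call such as WriterOverheadProbe.Measure(100_000) gives a first impression of the cost at the scale mentioned in the comments. Running it once with a small count beforehand keeps one-time warm-up costs out of the figure.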
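Alternatively, if you really do need to reuse the XmlWriter for performance reasons, you will need to call XmlWriter.WriteComment() to prevent the exception from XmlSerializer, and then edit out the unwanted comments afterwards, for example with some TextWriter decorator that removes them on the fly as they are written.
The following extension method seems to do the trick: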
// Requires: using System; using System.Collections.Generic; using System.IO; using System.Text; using System.Xml; using System.Xml.Serialization;
public static partial class XmlExtensions
{
    const string FirstCommentText = "first";
    const string FirstComment = $"<!--{FirstCommentText}-->";
    const string SubsequentCommentText = "subsequent";
    const string SubsequentComment = $"<!--{SubsequentCommentText}-->";

    static Encoding Utf8EncodingNoBom { get; } = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);

    public static void SerializeFragmentsToXml<T>(this IEnumerable<T> enumerable, Stream stream, XmlSerializer? serializer = null, XmlSerializerNamespaces? ns = null)
    {
        string newLine = "\n";
        var settings = new XmlWriterSettings()
        {
            ConformanceLevel = ConformanceLevel.Fragment,
            NamespaceHandling = NamespaceHandling.Default,
            NewLineChars = newLine,
            Encoding = Utf8EncodingNoBom, // suppress BOM
            Indent = true,
            NewLineHandling = NewLineHandling.Replace,
            OmitXmlDeclaration = true,
            WriteEndDocumentOnClose = false,
            CloseOutput = false,
            CheckCharacters = false,
            NewLineOnAttributes = false,
        };
        serializer ??= new XmlSerializer(typeof(T));
        ns ??= new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
        using var innerTextWriter = new StreamWriter(stream, encoding: Utf8EncodingNoBom, leaveOpen: true) { NewLine = newLine };
        // The first fake comment is removed entirely; each subsequent one is replaced by a newline separator.
        using var textWriter = new FakeCommentRemovingTextWriter(innerTextWriter, new(FirstComment, ""), new(SubsequentComment, newLine)) { NewLine = newLine };
        using var xmlWriter = XmlWriter.Create(textWriter, settings);
        bool first = true;
        foreach (var item in enumerable)
        {
            xmlWriter.WriteComment(first ? FirstCommentText : SubsequentCommentText);
            serializer.Serialize(xmlWriter, item, ns);
            // XmlWriter buffers its output, so Flush() is required to ensure that the fake comments are not split across calls to Write().
            xmlWriter.Flush();
            first = false;
        }
    }

    private class FakeCommentRemovingTextWriter : TextWriterDecorator
    {
        readonly KeyValuePair<string, string>[] replacements;

        public FakeCommentRemovingTextWriter(TextWriter baseWriter, params KeyValuePair<string, string>[] replacements) : base(baseWriter, true) => this.replacements = replacements;

        public override void Write(ReadOnlySpan<char> buffer)
        {
            foreach (var replacement in replacements)
            {
                int index;
                // Only strip a fake comment that is preceded by nothing but whitespace in this buffer.
                if ((index = buffer.IndexOf(replacement.Key)) >= 0 && buffer.Slice(0, index).IsWhiteSpace())
                {
                    if (index > 0)
                        base.Write(buffer.Slice(0, index));
                    buffer = buffer.Slice(index).Slice(replacement.Key.Length);
                    if (buffer.StartsWith(NewLine))
                        buffer = buffer.Slice(NewLine.Length);
                    if (!string.IsNullOrEmpty(replacement.Value))
                        base.Write(replacement.Value);
                }
            }
            base.Write(buffer);
        }
    }
}
// Requires: using System; using System.Diagnostics.CodeAnalysis; using System.IO; using System.Text; using System.Threading; using System.Threading.Tasks;
public class TextWriterDecorator : TextWriter
{
    // Override the same methods that are overridden in https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/IO/StringWriter.cs.
    TextWriter? baseWriter; // null when disposed
    readonly bool disposeBase;
    readonly Encoding baseEncoding;

    public TextWriterDecorator(TextWriter baseWriter, bool disposeBase = true) =>
        (this.baseWriter, this.disposeBase, this.baseEncoding) = (baseWriter ?? throw new ArgumentNullException(nameof(baseWriter)), disposeBase, baseWriter.Encoding);

    protected TextWriter BaseWriter => baseWriter == null ? throw new ObjectDisposedException(GetType().Name) : baseWriter;

    public override Encoding Encoding => baseEncoding;
    public override IFormatProvider FormatProvider => baseWriter?.FormatProvider ?? base.FormatProvider;

    [AllowNull] public override string NewLine
    {
        get => baseWriter?.NewLine ?? base.NewLine;
        set
        {
            if (baseWriter != null)
                baseWriter.NewLine = value;
            base.NewLine = value;
        }
    }

    public override void Flush() => BaseWriter.Flush();
    public sealed override void Close() => Dispose(true);

    public override void Write(char value) => BaseWriter.Write(value);
    public sealed override void Write(char[] buffer, int index, int count) => this.Write(buffer.AsSpan(index, count));
    public override void Write(ReadOnlySpan<char> buffer) => BaseWriter.Write(buffer);
    public sealed override void Write(string? value) => Write((value ?? string.Empty).AsSpan());

    public override Task WriteAsync(char value) => BaseWriter.WriteAsync(value);
    public sealed override Task WriteAsync(string? value) => WriteAsync(value.AsMemory());
    public sealed override Task WriteAsync(char[] buffer, int index, int count) => WriteAsync(buffer.AsMemory(index, count));
    public override Task WriteAsync(ReadOnlyMemory<char> buffer, CancellationToken cancellationToken = default) => BaseWriter.WriteAsync(buffer, cancellationToken);
    //public virtual Task WriteAsync(StringBuilder? value, CancellationToken cancellationToken = default) - no need to override

    public override Task WriteLineAsync(char value) => BaseWriter.WriteLineAsync(value);
    public sealed override Task WriteLineAsync(string? value) => WriteLineAsync(value.AsMemory());
    public override Task WriteLineAsync(StringBuilder? value, CancellationToken cancellationToken = default) => BaseWriter.WriteLineAsync(value, cancellationToken);
    public sealed override Task WriteLineAsync(char[] buffer, int index, int count) => WriteLineAsync(buffer.AsMemory(index, count));
    public override Task WriteLineAsync(ReadOnlyMemory<char> buffer, CancellationToken cancellationToken = default) => BaseWriter.WriteLineAsync(buffer, cancellationToken);

    public override Task FlushAsync() => BaseWriter.FlushAsync();
    public override Task FlushAsync(CancellationToken cancellationToken) => BaseWriter.FlushAsync(cancellationToken);

    protected override void Dispose(bool disposing)
    {
        try
        {
            if (disposing)
            {
                if (Interlocked.Exchange(ref this.baseWriter, null) is {} writer)
                    if (disposeBase)
                        writer.Dispose();
                    else
                        writer.Flush();
            }
        }
        finally
        {
            base.Dispose(disposing);
        }
    }

    public override async ValueTask DisposeAsync()
    {
        try
        {
            if (Interlocked.Exchange(ref this.baseWriter, null) is {} writer)
                if (disposeBase)
                    await writer.DisposeAsync().ConfigureAwait(false);
                else
                    await writer.FlushAsync().ConfigureAwait(false);
        }
        finally
        {
            await base.DisposeAsync().ConfigureAwait(false);
        }
    }

    public override string ToString() => string.Format("{0}: {1}", GetType().Name, baseWriter?.ToString() ?? "disposed");
}
But honestly, I doubt it is worth the trouble. Demo fiddle #2 here.
With either approach, the output looks like this:
<flyingMonkey name="Koko">
  <limbs>
    <limb name="leg" />
    <limb name="arm" />
    <limb name="tail" />
    <limb name="wing" />
  </limbs>
</flyingMonkey>

<flyingMonkey name="POCO">
  <limbs>
    <limb name="leg" />
    <limb name="arm" />
    <limb name="tail" />
    <limb name="wing" />
  </limbs>
</flyingMonkey>

<flyingMonkey name="Loco">
  <limbs>
    <limb name="leg" />
    <limb name="arm" />
    <limb name="tail" />
    <limb name="wing" />
  </limbs>
</flyingMonkey>