How to deserialize XML with different element names?


I want to deserialize many XML files into C# objects, but the same element has a different name in each file.

    private FileAsXML GetDataObject(string xml)
    {
        FileAsXML data = new FileAsXML();
        XmlSerializer serializer = new XmlSerializer(typeof(FileAsXML));

        serializer.UnknownElement += (sender, e) =>
        {
            Console.WriteLine($"Unknown element found: {e.Element.Name}");
        };

        using (TextReader reader = new StringReader(xml))
        {
            data = (FileAsXML)serializer.Deserialize(reader);
        }
        return data;
    }

For example, this is the first XML:

<MainFile>
 <11Test>
  <1Name>Anthony</1Name>
  <HelloFriend11>...</HelloFriend11>
 </11Test>
</MainFile>

And the second XML:

<MainFile>
 <11Test11>
  <Name1>Anthony</Name1>
  <1HelloFriend>...</1HelloFriend>
 </11Test11>
</MainFile>

My FileAsXML object:

[XmlRoot(ElementName = "MainFile", IsNullable = true)]
public class FileAsXML
{
    [XmlElement(ElementName = "Test", IsNullable = true)]
    public Test Test{ get; set; }
}

[XmlRoot(ElementName = "Test", IsNullable = true)]
public class Test
{
    [XmlElement(ElementName = "Name", IsNullable = true)]
    public string Name{ get; set; }

    [XmlElement(ElementName = "HelloFriend", IsNullable = true)]
    public HelloFriend HelloFriend{ get; set; }
    
}

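All the XML files have the same structure but different element names. Each element name always contains a constant part.

For example, they all have "Test" in the element name. How can I get XmlSerializer to capture such elements?

In my object I have tried different solutions, for example: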

[XmlElement(ElementName = "test", IsNullable = true)]
public string test { get; set; }

[XmlAnyElement(Name = "Test")]
public string test { get; set; }

[XmlElement(ElementName = "*Test*", IsNullable = true)]
public string test { get; set; }

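I have many elements, and an element can also be an object that needs to be serialized.

Maybe someone knows a solution?

Solution: this approach seems to be wrong. For my case I should use LINQ to XML in C#. Thanks @dbc, @user246821 and @Alexander Petrov.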

c# xml xmlserializer
1 Answer

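You cannot make XmlSerializer map XML elements whose names match some regular expression to a fixed element, because this is not implemented. XmlSerializer is designed to serialize and deserialize C# objects to and from XML using some fixed XSD schema, and XSD 1.0 schemas do not support regular expressions in element names [1].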
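To illustrate the fixed-name behavior, here is a minimal sketch. The types DemoFile and DemoTest and the helper FixedNameDemo are hypothetical names introduced only for this example; they mirror the shape of the model in the question. When the element is named exactly "Test" it is bound, and when it is named "Test11" it is reported through the UnknownElement event and the property is left null.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical demo types that mirror the shape of the question's model.
[XmlRoot("MainFile")]
public class DemoFile
{
    [XmlElement("Test")]
    public DemoTest Test { get; set; }
}

public class DemoTest
{
    [XmlElement("Name")]
    public string Name { get; set; }
}

public static class FixedNameDemo
{
    // Deserializes the XML and records the name of every element
    // that the serializer could not map to a member.
    public static DemoFile Deserialize(string xml, List<string> unknownElements)
    {
        var serializer = new XmlSerializer(typeof(DemoFile));
        serializer.UnknownElement += (sender, e) => unknownElements.Add(e.Element.Name);
        using (var reader = new StringReader(xml))
            return (DemoFile)serializer.Deserialize(reader);
    }
}
```

Calling `FixedNameDemo.Deserialize` with `<MainFile><Test><Name>Anthony</Name></Test></MainFile>` populates `Test.Name`, while the same document with `<Test11>` in place of `<Test>` yields a null `Test` and adds "Test11" to the list of unknown elements.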

So, what workarounds are available?

First, you can load the XML into a LINQ to XML XDocument and map it to your MainFile model by hand. For convenience, first introduce the following extension methods:

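using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static partial class XNodeExtensions
{
    // Child elements in no namespace whose local name matches the regex.
    public static IEnumerable<XElement> Elements(this XContainer container, Regex localNameRegex) => 
        container.Elements(localNameRegex, XNamespace.None);

    // Child elements in the given namespace whose local name matches the regex.
    public static IEnumerable<XElement> Elements(this XContainer container, Regex localNameRegex, XNamespace @namespace)
        => container.Elements().Where(e => @namespace == e.Name.Namespace && localNameRegex.IsMatch(e.Name.LocalName));
}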

Now you can do something like this:

[XmlRoot("MainFile")]
public class FileAsXML
{
    public Test Test { get; set; }
}

public class Test 
{
    public string Name { get; set; }
    public string HelloFriend { get; set; }
}

public static class FileAsXMLFactory
{
    const RegexOptions Options = RegexOptions.CultureInvariant | RegexOptions.Singleline;
    
    static XNamespace MainFileNamespace { get; } = "";
    
    public static FileAsXML Load(string fileName) => XDocument.Load(fileName).FromXDocument();

    public static FileAsXML FromXDocument(this XDocument doc) =>
        doc?.Root?.Name == MainFileNamespace + "MainFile"
        ? new FileAsXML
        {
            Test = doc.Root.Elements(new Regex("^.*Test[0-9]*$", Options), MainFileNamespace)
                .Select(test =>
                        new Test
                        {
                            HelloFriend = test.Elements(new Regex("^.*HelloFriend[0-9]*$", Options), MainFileNamespace).Select(h => h.Value).SingleOrDefault(),
                            Name = test.Elements(new Regex("^.*Name[0-9]*$", Options), MainFileNamespace).Select(n => n.Value).SingleOrDefault(),
                        }).SingleOrDefault(),
        }
        : throw new ArgumentException($"Unexpected root element \"{doc?.Root?.Name}\"");
}
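As a smaller, self-contained illustration of the same lookup idea, the sketch below reads the name from two differently named but well-formed variants of the document. RegexLookupSketch and ReadName are hypothetical names, the Regex objects are held in static fields so they are built once, and the sample element names are assumptions chosen to be valid XML names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class RegexLookupSketch
{
    // Built once and reused instead of being constructed on every lookup.
    static readonly Regex TestRegex = new Regex("^.*Test[0-9]*$", RegexOptions.CultureInvariant);
    static readonly Regex NameRegex = new Regex("^.*Name[0-9]*$", RegexOptions.CultureInvariant);

    // Returns the text of the first "...Name..." element found under a
    // "...Test..." child of the root, or null when there is no such element.
    public static string ReadName(string xml) =>
        XDocument.Parse(xml).Root
            .Elements().Where(e => TestRegex.IsMatch(e.Name.LocalName))
            .SelectMany(t => t.Elements().Where(e => NameRegex.IsMatch(e.Name.LocalName)))
            .Select(n => n.Value)
            .FirstOrDefault();
}
```

Both `ReadName("<MainFile><Test11><Name1>Anthony</Name1></Test11></MainFile>")` and `ReadName("<MainFile><ATest><AName>Anthony</AName></ATest></MainFile>")` return "Anthony".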

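Notes:

  • If your data model doesn't have dozens of properties, this looks like the simplest approach.

  • You may want to cache and reuse your Regex objects for better performance.

  • The XML standard does not allow an element name to start with a digit:

    [Definition: A Name is an Nmtoken with a restricted set of initial characters.] Disallowed initial characters for Names include digits, diacritics, the full stop and the hyphen.

    Names such as <11Test> in the question are therefore not well-formed, and both XDocument and XmlReader will reject them.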
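The rule about leading digits can be checked directly. The following is a minimal sketch, with NameCheckSketch and its two methods being hypothetical helper names; it relies on XmlConvert.VerifyNCName and XDocument.Parse throwing XmlException for an invalid name.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class NameCheckSketch
{
    // True when the string is a valid non-colonized XML name.
    public static bool IsValidElementName(string name)
    {
        try
        {
            XmlConvert.VerifyNCName(name);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    // True when the whole document is well-formed enough to be parsed.
    public static bool CanParse(string xml)
    {
        try
        {
            XDocument.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

Here `IsValidElementName("Test11")` is true and `IsValidElementName("11Test")` is false, and a document containing `<11Test/>` fails to parse.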

Demo fiddle #1 here.

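Second, if your XML files are large and you don't want to load the entire file into an XDocument, you can create a custom XmlReader decorator that renames elements on the fly as they are read.

To do this, define the following XmlReader subclasses: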

public class ElementRenamingXmlReaderDecorator : XmlReaderDecorator
{
    readonly (Regex regex, string replacement) [] maps;
    string? cachedElementLocalName = null;

    public ElementRenamingXmlReaderDecorator(XmlReader baseReader, (Regex regex, string replacement) [] maps) : base(baseReader) => 
        this.maps = maps ?? throw new ArgumentNullException(nameof(maps));

    string GetRenamedLocalName(string name)
    {
        foreach (var map in maps)
            if (map.regex.IsMatch(name))
                return map.replacement;
        return name;
    }

    public override string LocalName => NodeType == XmlNodeType.Element 
        ? cachedElementLocalName ?? (cachedElementLocalName = NameTable.Add(GetRenamedLocalName(base.LocalName)))
        : base.LocalName;
    
    public override string Name =>
        NodeType switch
        {
            XmlNodeType.Element when Prefix.Length == 0 => LocalName,
            XmlNodeType.Element => NameTable.Add(string.Concat(Prefix, ":", LocalName)),
            _ => base.Name,
        };
        
    public override bool Read() { cachedElementLocalName = null; return base.Read(); }
    public override void Skip() { cachedElementLocalName = null; base.Skip(); }
    protected override void Dispose(bool disposing) { cachedElementLocalName = null; base.Dispose(disposing); }
}

public class XmlReaderDecorator : XmlReader, IXmlLineInfo, IXmlNamespaceResolver
{
    private XmlReader? baseReader;

    public XmlReaderDecorator(XmlReader baseReader) => this.baseReader = baseReader ?? throw new ArgumentNullException(nameof(baseReader));

    protected XmlReader BaseReader => baseReader ?? throw new ObjectDisposedException(this.GetType().Name);

    public override XmlNodeType NodeType => BaseReader.NodeType;
    public override int Depth => BaseReader.Depth; 
    public override bool EOF => BaseReader.EOF;
    public override ReadState ReadState => BaseReader.ReadState;
    public override string Name => BaseReader.Name;
    public override string LocalName => BaseReader.LocalName;
    public override string NamespaceURI => BaseReader.NamespaceURI;
    public override string BaseURI => BaseReader.BaseURI; 
    public override string Prefix => BaseReader.Prefix; 
    public override bool HasValue => BaseReader.HasValue;
    public override string Value => BaseReader.Value; 
    public override Type ValueType => BaseReader.ValueType;
    public override bool IsEmptyElement => BaseReader.IsEmptyElement;
    public override bool IsDefault => BaseReader.IsDefault;
    public override char QuoteChar => BaseReader.QuoteChar;
    public override XmlSpace XmlSpace => BaseReader.XmlSpace;
    public override string XmlLang => BaseReader.XmlLang;
    public override bool HasAttributes => BaseReader.HasAttributes; 
    public override int AttributeCount => BaseReader.AttributeCount;
    public override bool CanResolveEntity => BaseReader.CanResolveEntity;
    public override XmlNameTable NameTable => BaseReader.NameTable;
    public override XmlReaderSettings? Settings => BaseReader.Settings;
    public override IXmlSchemaInfo? SchemaInfo => BaseReader.SchemaInfo;
    public override string this[int i] => BaseReader[i];
    public override string? this[string name] => BaseReader[name];
    public override string? this[string name, string? namespaceURI] => BaseReader[name, namespaceURI];
    public override string? GetAttribute(string name) => BaseReader.GetAttribute(name);
    public override string? GetAttribute(string name, string? namespaceURI) => BaseReader.GetAttribute(name, namespaceURI);
    public override string GetAttribute(int i) => BaseReader.GetAttribute(i);
    public override bool MoveToAttribute(string name) => BaseReader.MoveToAttribute(name);
    public override bool MoveToAttribute(string name, string? ns) => BaseReader.MoveToAttribute(name, ns);
    public override void MoveToAttribute(int i) => BaseReader.MoveToAttribute(i);
    public override bool MoveToFirstAttribute() => BaseReader.MoveToFirstAttribute();
    public override bool MoveToNextAttribute() => BaseReader.MoveToNextAttribute();
    public override bool MoveToElement() => BaseReader.MoveToElement();
    public override bool Read() => BaseReader.Read();
    public override void Skip() => BaseReader.Skip();
    public override void Close() => BaseReader.Close();
    public override string? LookupNamespace(string prefix) => BaseReader.LookupNamespace(prefix);
    public override void ResolveEntity() => BaseReader.ResolveEntity();
    public override bool ReadAttributeValue() => BaseReader.ReadAttributeValue();
    public virtual bool HasLineInfo() => BaseReader is IXmlLineInfo info ? info.HasLineInfo() : false;
    public virtual int LineNumber => BaseReader is IXmlLineInfo info ? info.LineNumber : 0;
    public virtual int LinePosition => BaseReader is IXmlLineInfo info ? info.LinePosition : 0;

    string? IXmlNamespaceResolver.LookupPrefix(string namespaceName) => BaseReader is IXmlNamespaceResolver resolver ? resolver.LookupPrefix(namespaceName) : null;
    IDictionary<string, string> IXmlNamespaceResolver.GetNamespacesInScope(XmlNamespaceScope scope) => BaseReader is IXmlNamespaceResolver resolver ? resolver.GetNamespacesInScope(scope) : throw new NotImplementedException();

    protected override void Dispose(bool disposing)
    {
        // Do not throw an exception on multiple calls to dispose
        if (baseReader != null)
            base.Dispose(disposing);
        Interlocked.Exchange(ref this.baseReader, null)?.Dispose();
    }
    
    // TODO : ASYNC
}

Now you can do:

public static class FileAsXMLFactory
{
    const RegexOptions Options = RegexOptions.CultureInvariant | RegexOptions.Singleline;
    
    static (Regex regex, string replacement) [] ElementNameMaps { get; } =
        [
            (new("^.*Test[0-9]*$", Options), "Test"),
            (new("^.*HelloFriend[0-9]*$", Options), "HelloFriend"),
            (new("^.*Name[0-9]*$", Options), "Name"),
        ];

    public static FileAsXML? Load(string fileName)
    {
        using var reader = XmlReader.Create(fileName);
        return Load(reader);
    }

    public static FileAsXML? Load(XmlReader reader)
    {
        using var renamingReader = new ElementRenamingXmlReaderDecorator(reader, ElementNameMaps);
        var serializer = new XmlSerializer(typeof(FileAsXML));
        return (FileAsXML?)serializer.Deserialize(renamingReader);
    }
}

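Notes:

  • This approach will not work if a property name in one part of your data model is a name that needs to be replaced elsewhere. For example, if you have one type with: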

    public string Name { get; set; }
    

    and another type with:

    public string Name1 { get; set; }
    public string Name2 { get; set; }
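To see why such a model collides, the renaming rule can be tried in isolation. This is a sketch only: RenameRuleSketch is a hypothetical name, and its Rename method copies the first-match-wins loop of GetRenamedLocalName together with the three patterns used above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RenameRuleSketch
{
    const RegexOptions Options = RegexOptions.CultureInvariant | RegexOptions.Singleline;

    // Same patterns and replacements as the element name maps above.
    static readonly (Regex regex, string replacement)[] Maps =
    {
        (new Regex("^.*Test[0-9]*$", Options), "Test"),
        (new Regex("^.*HelloFriend[0-9]*$", Options), "HelloFriend"),
        (new Regex("^.*Name[0-9]*$", Options), "Name"),
    };

    // The first matching pattern wins; unmatched names pass through unchanged.
    public static string Rename(string name)
    {
        foreach (var (regex, replacement) in Maps)
            if (regex.IsMatch(name))
                return replacement;
        return name;
    }
}
```

Both `Rename("Name1")` and `Rename("Name2")` return "Name", so a type that declares Name1 and Name2 as separate properties would never see either element under its own name. The reader applies the maps to every element regardless of which type is being deserialized.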
    

Demo fiddle #2 here.

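Other alternatives include:

  • Fixing the XML with an XSLT transform before deserializing.

  • Implementing IXmlSerializable on every class in your data model and populating everything by hand from the incoming XmlReader.

    Honestly, I don't recommend this, because the interface is very tricky to implement without introducing bugs.

  • Asking the XML provider to modify their code so that it generates XML with fixed element names that can be validated against an XSD 1.0 schema.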
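The XSLT option could look roughly like the sketch below, which uses XslCompiledTransform with an XSLT 1.0 stylesheet. XsltRenameSketch and Normalize are hypothetical names. Note one assumption that differs from the regular expressions above: XSLT 1.0 has no regex support, so the stylesheet matches with contains(), which is looser than the anchored patterns.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XsltRenameSketch
{
    // Identity transform plus one renaming template per known element.
    const string Stylesheet = @"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes'/>
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
  <xsl:template match=""*[contains(local-name(), 'Test')]"">
    <Test><xsl:apply-templates select='@*|node()'/></Test>
  </xsl:template>
  <xsl:template match=""*[contains(local-name(), 'HelloFriend')]"">
    <HelloFriend><xsl:apply-templates select='@*|node()'/></HelloFriend>
  </xsl:template>
  <xsl:template match=""*[contains(local-name(), 'Name')]"">
    <Name><xsl:apply-templates select='@*|node()'/></Name>
  </xsl:template>
</xsl:stylesheet>";

    // Returns a copy of the XML in which the variable element names
    // have been replaced by the fixed names the data model expects.
    public static string Normalize(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(Stylesheet)))
            xslt.Load(styleReader);

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var writer = XmlWriter.Create(output, xslt.OutputSettings))
            xslt.Transform(input, writer);
        return output.ToString();
    }
}
```

The string returned by `Normalize` can then be passed to an ordinary XmlSerializer for FileAsXML. Like the LINQ to XML approach, this loads the whole document into memory.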


[1] To confirm that this is not possible in XSD 1.0, see "Can I use regular expressions in XML Schema element names?". XSD 1.0 is the version supported by .NET.
