Word found unreadable content in xxx.docx after splitting a docx with Open XML


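I have a full.docx that contains two math problems. The docx embeds some pictures and MathType equations (OLE objects). I split the document according to this and got two files (first.docx and second.docx). first.docx works fine, but second.docx pops up a warning dialog when I try to open it: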

"Word found unreadable content in second.docx. Do you want to recover the contents of this document? If you trust the source of this document, click Yes."

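After clicking "Yes", the document opens and its content is correct. What is wrong with second.docx? I have checked it with the "Open XML SDK 2.5 Productivity Tool" but could not find the cause. Any help is much appreciated. Thanks.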

The three files have been uploaded here.

Here is some of the code:

        byte[] templateBytes = System.IO.File.ReadAllBytes(TEMPLATE_YANG_FILE);
        using (MemoryStream templateStream = new MemoryStream())
        {
            templateStream.Write(templateBytes, 0, (int)templateBytes.Length);

            string guidStr = Guid.NewGuid().ToString();

            using (WordprocessingDocument document = WordprocessingDocument.Open(templateStream, true))
            {
                document.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);

                MainDocumentPart mainPart = document.MainDocumentPart;

                mainPart.Document = new Document();
                Body bd = new Body();

                foreach (DocumentFormat.OpenXml.Wordprocessing.Paragraph clonedParagrph in lst)
                {
                    bd.AppendChild<DocumentFormat.OpenXml.Wordprocessing.Paragraph>(clonedParagrph);

                    clonedParagrph.Descendants<Blip>().ToList().ForEach(blip =>
                    {
                        var newRelation = document.CopyImage(blip.Embed, this.wordDocument);
                        blip.Embed = newRelation;
                    });

                    clonedParagrph.Descendants<DocumentFormat.OpenXml.Vml.ImageData>().ToList().ForEach(imageData =>
                    {
                        var newRelation = document.CopyImage(imageData.RelationshipId, this.wordDocument);
                        imageData.RelationshipId = newRelation;
                    });
                }

                mainPart.Document.Body = bd;
                mainPart.Document.Save();
            }

            string subDocFile = System.IO.Path.Combine(this.outDir, guidStr + ".docx");
            this.subWordFileLst.Add(subDocFile);

            File.WriteAllBytes(subDocFile, templateStream.ToArray());
        }

The list `lst` contains paragraphs cloned from the original docx using:

(DocumentFormat.OpenXml.Wordprocessing.Paragraph)p.Clone();

Tags: ms-word openxml openxml-sdk

1 Answer

Using the Productivity Tool, I found that oleobjectx.bin had not been copied, so I added the following code after copying the Blip and ImageData:

clonedParagrph.Descendants<OleObject>().ToList().ForEach(ole =>
{
    var newRelation = document.CopyOleObject(ole.Id, this.wordDocument);
    ole.Id = newRelation;
});
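
As far as I know, `CopyOleObject` is not an Open XML SDK method. Like `CopyImage` in the question, it appears to be a custom extension method whose body is not shown. A minimal sketch of what such a helper might look like follows, assuming the OLE binary is stored as an `EmbeddedObjectPart` under the main document part (other embedded content, such as an `EmbeddedPackagePart`, would need a different `Add...` call). The class name `OleCopyExtensions` and the parameter names are hypothetical.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class OleCopyExtensions
{
    // Hypothetical helper (not part of the SDK): copies the embedded OLE
    // binary identified by sourceRelId from the source document into the
    // target document and returns the new relationship id in the target.
    public static string CopyOleObject(
        this WordprocessingDocument target,
        string sourceRelId,
        WordprocessingDocument source)
    {
        // Resolve the part (for example /word/embeddings/oleObject1.bin)
        // that the relationship id points to in the source document.
        OpenXmlPart sourcePart = source.MainDocumentPart.GetPartById(sourceRelId);

        // Assumption: the part is an embedded OLE object, so an
        // EmbeddedObjectPart with the same content type is created.
        EmbeddedObjectPart newPart =
            target.MainDocumentPart.AddEmbeddedObjectPart(sourcePart.ContentType);

        // Copy the binary payload into the new part.
        using (Stream data = sourcePart.GetStream(FileMode.Open, FileAccess.Read))
        {
            newPart.FeedData(data);
        }

        // The caller writes this id back into the cloned OleObject element.
        return target.MainDocumentPart.GetIdOfPart(newPart);
    }
}
```

Returning the new relationship id matters because the cloned paragraph still carries the id from the original document, which means nothing (or something else) inside the new package.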

That solved the problem.
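
For anyone hitting a similar warning, one plausible cause is a relationship id in the document body that the package does not declare. The sketch below checks for that using only the .NET base class library (no Open XML SDK). It assumes the standard part names `word/document.xml` and `word/_rels/document.xml.rels`; the class and method names are hypothetical. It will not catch an id that resolves to the wrong kind of part, so an empty result does not prove the file is valid.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class DocxRelationshipCheck
{
    // Namespace of r:id, r:embed, r:link, ... attributes in the body.
    static readonly XNamespace R =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    // Namespace of the <Relationship> elements in the .rels part.
    static readonly XNamespace Pkg =
        "http://schemas.openxmlformats.org/package/2006/relationships";

    // Returns relationship ids that word/document.xml references but that
    // word/_rels/document.xml.rels does not declare.
    public static List<string> FindDanglingIds(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            XDocument body = Load(zip, "word/document.xml")
                ?? throw new FileNotFoundException("word/document.xml");
            // A missing .rels part simply means no ids are declared.
            XDocument rels = Load(zip, "word/_rels/document.xml.rels");

            var declared = new HashSet<string>();
            if (rels != null)
            {
                foreach (XElement rel in rels.Descendants(Pkg + "Relationship"))
                {
                    string id = (string)rel.Attribute("Id");
                    if (id != null) declared.Add(id);
                }
            }

            // Collect every attribute in the relationships namespace and
            // keep the values that have no matching declaration.
            return body.Descendants()
                .SelectMany(e => e.Attributes())
                .Where(a => a.Name.Namespace == R)
                .Select(a => a.Value)
                .Distinct()
                .Where(id => !declared.Contains(id))
                .ToList();
        }
    }

    // Loads a zip entry as XML, or returns null when the entry is absent.
    static XDocument Load(ZipArchive zip, string entryName)
    {
        ZipArchiveEntry entry = zip.GetEntry(entryName);
        if (entry == null) return null;
        using (Stream s = entry.Open())
        {
            return XDocument.Load(s);
        }
    }
}
```

To try it, open second.docx with `File.OpenRead` and pass the stream to `DocxRelationshipCheck.FindDanglingIds`; any id it returns is referenced by a cloned element whose target was never copied into the new package.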
