Exporting HTML to Docx in JavaScript - getting a corrupted Word file

Question

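I need to update some JavaScript that exports HTML to Word so that the file is a .docx instead of a .doc. I cannot use any API or library for this; I am using plain JavaScript and jQuery. The current code is below. The source I got this snippet from states: "By default, the word file will be saved as a .doc file. If you want to export the word file as a .docx file, specify the extension in the file name." (https://www.codexworld.com/export-html-to-word-doc-docx-using-javascript/) When I change the file name to ".docx", the downloaded Word document is corrupted. I appreciate any help you can provide.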

function Export2Word(element, filename = ''){
    var preHtml = "<html xmlns:o='urn:schemas-microsoft-com:office:office' xmlns:w='urn:schemas-microsoft-com:office:word' xmlns='http://www.w3.org/TR/REC-html40'><head><meta charset='utf-8'><title>Export HTML To Doc</title></head><body>";
    var postHtml = "</body></html>";
    var html = preHtml+document.getElementById(element).innerHTML+postHtml;

    var blob = new Blob(['\ufeff', html], {
        type: 'application/msword'
    });
    
    // Specify link url
    var url = 'data:application/vnd.ms-word;charset=utf-8,' + encodeURIComponent(html);
    
    // Specify file name
    filename = filename?filename+'.doc':'document.doc';
    
    // Create download link element
    var downloadLink = document.createElement("a");

    document.body.appendChild(downloadLink);
    
    if(navigator.msSaveOrOpenBlob ){
        navigator.msSaveOrOpenBlob(blob, filename);
    }else{
        // Create a link to the file
        downloadLink.href = url;
        
        // Setting the file name
        downloadLink.download = filename;
        
        //triggering the function
        downloadLink.click();
    }
    
    document.body.removeChild(downloadLink);
}

I tried this function with the ".docx" file extension, but the Word file is still corrupted:

<script>
    function exportHTML(){
       var header = "<html xmlns:o='urn:schemas-microsoft-com:office:office' "+
            "xmlns:w='urn:schemas-microsoft-com:office:word' "+
            "xmlns='http://www.w3.org/TR/REC-html40'>"+
            "<head><meta charset='utf-8'><title>Export HTML to Word Document with JavaScript</title></head><body>";
       var footer = "</body></html>";
       var sourceHTML = header+document.getElementById("source-html").innerHTML+footer;
       
       var source = 'data:application/vnd.ms-word;charset=utf-8,' + encodeURIComponent(sourceHTML);
       var fileDownload = document.createElement("a");
       document.body.appendChild(fileDownload);
       fileDownload.href = source;
       fileDownload.download = 'document.docx';
       fileDownload.click();
       document.body.removeChild(fileDownload);
    }
</script>

I tried changing the header to:

var header = "<html xmlns:o='urn:schemas-microsoft-com:office:office' "+
            "xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' "+
            "xmlns='http://schemas.openxmlformats.org/package/2006/content-types' "+
            "<head><meta charset='utf-8'><title>Export HTML to Word Document with JavaScript</title></head><body>";

I tried changing the data URI to:

   var source = 'data:application/vnd.openxmlformats-officedocument.wordprocessingml.document,' + encodeURIComponent(sourceHTML);


None of the other Stack Overflow solutions have worked. Thanks again for any help you can provide.

Answer 1

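Try redocx.

When I ran the example on my local machine, I got this warning:

"Word found unreadable content in HelloWorld.docx. Do you want to recover the contents of this document? If you trust the source of this document, click Yes."

Otherwise it works fine.

.docx and .doc are very different formats: this post explains the differences in detail. A .docx file is a ZIP package of XML parts, whereas the snippets in the question only produce HTML, which Word tolerates under a .doc name. That is why the changes you made did not work.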
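
To make the format difference concrete, here is a minimal sketch in plain JavaScript. It assumes that a .docx package, being a ZIP archive, normally begins with the ZIP local-file-header signature `PK\x03\x04`, and checks whether a byte payload starts that way. The helper name `startsWithZipSignature` and the sample payloads are hypothetical and not part of the question's code.

```javascript
// Hypothetical helper: returns true when the bytes begin with the ZIP
// local-file-header signature "PK\x03\x04" that a .docx package normally starts with.
function startsWithZipSignature(bytes) {
  const signature = [0x50, 0x4b, 0x03, 0x04];
  if (bytes.length < signature.length) return false;
  return signature.every((value, index) => bytes[index] === value);
}

// The same kind of payload the question builds: a BOM followed by HTML markup.
const htmlPayload = new TextEncoder().encode(
  "\ufeff<html><body><p>Hello</p></body></html>"
);

// The leading bytes of a ZIP archive, for comparison (illustrative only).
const zipPrefix = new Uint8Array([0x50, 0x4b, 0x03, 0x04, 0x14, 0x00]);

console.log(startsWithZipSignature(htmlPayload)); // false
console.log(startsWithZipSignature(zipPrefix));   // true
```

The HTML payload fails the check no matter which file extension or MIME type is attached to the download, which is consistent with Word reporting the renamed file as corrupted.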
