How do I empty all headers and footers in an XWPFDocument (DOCX) using Apache POI?


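The Java code below successfully removes the content of all headers and footers in a particular DOCX file, except for one footer (the first page's footer). Inspecting the DOCX shows that this stubborn footer contains the XML shown after the code. How would you remove its content?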

document = new XWPFDocument(new FileInputStream(filePath));

List<XWPFHeader> headers = document.getHeaderList();

for (XWPFHeader h : headers) {

    ArrayList<XWPFParagraph> hParaArray = new ArrayList<XWPFParagraph>();
    for (XWPFParagraph hPara : h.getParagraphs())
        hParaArray.add(hPara);
    hParaArray.forEach(hPara -> {
        h.removeParagraph(hPara);
    });
    ArrayList<XWPFTable> hTblArray = new ArrayList<XWPFTable>();
    for (XWPFTable hTbl : h.getTables())
        hTblArray.add(hTbl);
    hTblArray.forEach(hTbl -> {
        h.removeTable(hTbl);
    });

}

List<XWPFFooter> footers = document.getFooterList();

for (XWPFFooter f : footers) {

    ArrayList<XWPFParagraph> fParaArray = new ArrayList<XWPFParagraph>();

    for (XWPFParagraph fPara : f.getParagraphs())
        fParaArray.add(fPara);
    fParaArray.forEach(fPara -> {
        f.removeParagraph(fPara);
    });
    ArrayList<XWPFTable> fTblArray = new ArrayList<XWPFTable>();
    for (XWPFTable fTbl : f.getTables())
        fTblArray.add(fTbl);
    fTblArray.forEach(fTbl -> {
        f.removeTable(fTbl);
    });
}

footer3.xml:

<?xml version="1.0" encoding="UTF-8"?>
<w:ftr mc:Ignorable="w14 w15 w16se wp14" xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape">
    <w:sdt>
        <w:sdtPr>
            <w:rPr>
                <w:rFonts w:cs="Arial" />
                <w:color w:val="0000FF" />
                <w:sz w:val="16" />
                <w:szCs w:val="16" />
                <w:lang w:val="en_US" />
            </w:rPr>
            <w:id w:val="6695195" />
            <w:placeholder>
                <w:docPart w:val="68B9E76BF9434A3FAABE5342BB8B54F7" />
            </w:placeholder>
        </w:sdtPr>
        <w:sdtEndPr />
        <w:sdtContent>
            <w:p w:rsidR="00A47874" w:rsidRPr="004D34A5" w:rsidRDefault="00945F6E" w:rsidP="00A47874">
                <w:pPr>
                    <w:pBdr>
                        <w:top w:val="single" w:sz="4" w:space="1" w:color="auto" />
                    </w:pBdr>
                    <w:rPr>
                        <w:rFonts w:cs="Arial" />
                        <w:color w:val="0000FF" />
                        <w:sz w:val="16" />
                        <w:szCs w:val="16" />
                        <w:lang w:val="en_US" />
                    </w:rPr>
                </w:pPr>
                <w:r>
                    <w:rPr>
                        <w:rFonts w:cs="Arial" />
                        <w:color w:val="0000FF" />
                        <w:sz w:val="16" />
                        <w:szCs w:val="16" />
                        <w:lang w:val="en_US" />
                    </w:rPr>
                    <w:t>Some text that couldn't be removed</w:t>
                </w:r>
            </w:p>
        </w:sdtContent>
    </w:sdt>
</w:ftr>
java apache-poi footer docx xwpf
1 Answer

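The w:sdt in your footer is a StructuredDocumentTag, also known as a content control. Apache POI only has the experimental class XWPFSDT for this. While it provides removeParagraph and removeTable, it has so far lacked a removeSDT in XWPFHeaderFooter as well as in XWPFDocument. So with your approach there is no way to remove the StructuredDocumentTag's content.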
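To make the structure visible, here is a minimal sketch that uses only the JDK's DOM parser (not Apache POI) to list the direct child elements of a footer root. The class name FooterStructure and the trimmed-down SAMPLE string are made up for illustration; SAMPLE only mimics the nesting of footer3.xml. It shows that the paragraph is nested inside w:sdtContent rather than being a direct child of w:ftr, which is consistent with paragraph-level removal not reaching it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class FooterStructure {

    // Hypothetical, trimmed-down stand-in for footer3.xml: one w:sdt wrapping one w:p.
    static final String SAMPLE =
        "<w:ftr xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">"
      + "<w:sdt><w:sdtContent><w:p><w:r><w:t>Some text</w:t></w:r></w:p></w:sdtContent></w:sdt>"
      + "</w:ftr>";

    // Returns the local names of the root element's direct child elements.
    static List<String> topLevelChildren(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Element root = factory.newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
            .getDocumentElement();
        List<String> names = new ArrayList<>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getLocalName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // The only direct child is the content control, not a paragraph.
        System.out.println(topLevelChildren(SAMPLE)); // prints [sdt]
    }
}
```

A footer whose text sits in an ordinary body-level paragraph would report p here instead of sdt.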

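But if the goal is to completely empty all existing headers and footers, you can simply overwrite the content of every header and footer with a new, empty one using XWPFHeaderFooter.setHeaderFooter.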

Example:

import java.io.FileInputStream;
import java.io.FileOutputStream;

import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTHdrFtr;

public class WordDoEmptyingHeaderFooter {

 public static void main(String[] args) throws Exception {

  String inFilePath = "./WordDocument.docx";
  String outFilePath = "./WordDocumentNew.docx";

  // try-with-resources closes the input stream and the document even if an exception is thrown
  try (FileInputStream in = new FileInputStream(inFilePath);
       XWPFDocument document = new XWPFDocument(in)) {

   // Replace each header's underlying XML with a new, empty CTHdrFtr
   for (XWPFHeader header : document.getHeaderList()) {
    header.setHeaderFooter(CTHdrFtr.Factory.newInstance());
   }
   // Same for each footer; this also discards content controls (w:sdt)
   for (XWPFFooter footer : document.getFooterList()) {
    footer.setHeaderFooter(CTHdrFtr.Factory.newInstance());
   }

   // Write the result to a new file
   try (FileOutputStream out = new FileOutputStream(outFilePath)) {
    document.write(out);
   }
  }
 }

}
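
To check the result without opening Word, one option is to scan the written DOCX (a ZIP archive) for header and footer parts that still contain text runs. The sketch below uses only the JDK and needs Java 9 or later for readAllBytes; the class name HeaderFooterTextCheck and the helper sampleZip are hypothetical, and the string match assumes the usual w: namespace prefix, so treat it as a rough check rather than a proper XML analysis.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class HeaderFooterTextCheck {

    // Lists header/footer parts in a DOCX stream that still contain a w:t element.
    // Assumption: the parts are named word/headerN.xml or word/footerN.xml and use the w: prefix.
    static List<String> partsWithText(InputStream docx) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (name.matches("word/(header|footer)\\d*\\.xml")) {
                    // readAllBytes stops at the end of the current ZIP entry
                    String xml = new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                    if (xml.contains("<w:t>") || xml.contains("<w:t ")) {
                        hits.add(name);
                    }
                }
            }
        }
        return hits;
    }

    // Builds a tiny in-memory archive with a single footer part, for demonstration only.
    static byte[] sampleZip(String footerXml) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("word/footer3.xml"));
            zip.write(footerXml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // With a path argument, check that file; otherwise run on the in-memory sample.
        try (InputStream in = args.length > 0
                ? Files.newInputStream(Paths.get(args[0]))
                : new ByteArrayInputStream(
                      sampleZip("<w:ftr><w:p><w:r><w:t>x</w:t></w:r></w:p></w:ftr>"))) {
            System.out.println(partsWithText(in));
        }
    }
}
```

Run with ./WordDocumentNew.docx as the argument, an empty list would suggest that no header or footer part still carries text.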