Apache POI: Adding comments to an existing Word document


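I am working on a legacy product in which some features need to be implemented. I am trying to use apache-poi 5.2.2 to add comments to an existing Word document based on search criteria. Basically, if a word in the docx document matches the original text defined in an action, a comment needs to be added.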

I am already able to add a comment to the document.

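However, I am not able to add the start and the end of the comment range (around the text that needs to be commented). I assume it also needs some form of annotation. For example, when I take a document with a pre-existing comment, I notice that the text at that location looks like this: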

<w:commentRangeStart w:id="0"/><w:r><w:rPr><w:b/><w:sz w:val="27"/></w:rPr><w:t>Júlio</w:t></w:r><w:r><w:rPr><w:b/><w:spacing w:val="-3"/><w:sz w:val="27"/></w:rPr><w:t xml:space="preserve"> </w:t></w:r><w:r><w:rPr><w:b/><w:sz w:val="27"/></w:rPr><w:t>César</w:t></w:r><w:commentRangeEnd w:id="0"/><w:r w:rsidR="00B43339"><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="0"/></w:r><w:r>

As far as I can tell, the XML of the comment itself lives in a separate CommentsDocument, like this:

        //<xml-fragment w:id="0" w:author="<NAME OF COMMENT CREATOR>" w:date="2024-03-13T10:11:00Z" w:initials="<HERE COME THE INITIALS>" xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:cx2="http://schemas.microsoft.com/office/drawing/2015/10/21/chartex" xmlns:cx3="http://schemas.microsoft.com/office/drawing/2016/5/9/chartex" xmlns:cx4="http://schemas.microsoft.com/office/drawing/2016/5/10/chartex" xmlns:cx5="http://schemas.microsoft.com/office/drawing/2016/5/11/chartex" xmlns:cx6="http://schemas.microsoft.com/office/drawing/2016/5/12/chartex" xmlns:cx7="http://schemas.microsoft.com/office/drawing/2016/5/13/chartex" xmlns:cx8="http://schemas.microsoft.com/office/drawing/2016/5/14/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:aink="http://schemas.microsoft.com/office/drawing/2016/ink" xmlns:am3d="http://schemas.microsoft.com/office/drawing/2017/model3d" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:oel="http://schemas.microsoft.com/office/2019/extlst" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16cex="http://schemas.microsoft.com/office/word/2018/wordml/cex" xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid" 
xmlns:w16="http://schemas.microsoft.com/office/word/2018/wordml" xmlns:w16sdtdh="http://schemas.microsoft.com/office/word/2020/wordml/sdtdatahash" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape">
        //  <w:p w14:paraId="4A10B938" w14:textId="77777777" w:rsidR="00B43339" w:rsidRDefault="00B43339" w:rsidP="00B43339">
        //    <w:r>
        //      <w:rPr>
        //        <w:rStyle w:val="CommentReference"/>
        //      </w:rPr>
        //      <w:annotationRef/>
        //    </w:r>
        //    <w:r>
        //      <w:rPr>
        //        <w:color w:val="000000"/>
        //        <w:sz w:val="20"/>
        //        <w:szCs w:val="20"/>
        //      </w:rPr>
        //      <w:t>This is a pre-annotation existing comment</w:t>
        //    </w:r>
        //  </w:p>
        //</xml-fragment>

Based on a few posts ("Adding comment to specific word or run in docx document using Apache POI") I tried a few approaches:

  • Adding the comment at run level only, putting the comment range start and comment range end there;
  • Adding the comment per paragraph.

I guess a combination of the two is needed. For now I want to focus on adding a comment to a specific word.

if(paragraph.getText().contains(action.getOriginalText())){
                //SINCE NOT THE ENTIRE PARAGRAPH NEEDS TO BE ANNOTATED, WE NEED TO LOOK AT THE RUNS INSIDE THE PARAGRAPH
                for(int runIndex = internalParagraphRunIndex; runIndex < paragraph.getRuns().size(); runIndex++) {
                    XWPFRun run = paragraph.getRuns().get(runIndex);

                    if (run.text().equals(action.getOriginalText())) {
                        

                        //THE ENTIRE RUN NEEDS TO BE ANNOTATED
                        throw new RuntimeException("Not yet implemented");
                    } else if (run.text().contains(action.getOriginalText())) {
                        System.out.println("Part of the run needs to be annotated");
                        //THE TEXT THAT NEEDS TO BE ANNOTATED IS PART OF THE RUN
                        //Getting the comments from the document
                        XWPFComments comments = wordDocument.getDocComments();

                        CTComments existingCtComments = comments.getCtComments();

                        //Creating CTComment
                        CTComment newCTComment = existingCtComments.addNewComment();
                        newCTComment.setId(getCommentId(existingCtComments));

                        String[] splittedText = splitRunTextIntoParts(run, action.getOriginalText());

                        int indexOfTextThatNeedsToBeAnnotatedInSplittedText = findLocationOfTextInSplittedText(splittedText, action.getOriginalText());

                        for (int z = 0; z < splittedText.length; z++) {
                            if (indexOfTextThatNeedsToBeAnnotatedInSplittedText == -1) {
                                throw new RuntimeException("The text that needs to be annotated is not found in the splitted run.");
                            } else {
                                if (z == indexOfTextThatNeedsToBeAnnotatedInSplittedText) {
                                    //the exact word that needs to be annotated
                                    XWPFRun runToInsert = paragraph.insertNewRun(runIndex + z);

                                    //insert part of the text of the run
                                    runToInsert.setText(splittedText[z]);

                                    paragraph.getCTP().addNewCommentRangeEnd().setId(newCTComment.getId());

                                    //add the comment reference AFTER the text
                                    runToInsert.getCTR().addNewCommentReference().setId(newCTComment.getId());

                                    //TODO: remove styling
                                    runToInsert.setBold(true);

                                } else if (z == indexOfTextThatNeedsToBeAnnotatedInSplittedText - 1) {

                                    XWPFRun runToInsert = paragraph.insertNewRun(runIndex + z);

                                    //insert part of the text of the run
                                    runToInsert.setText(splittedText[z]);

                                    //add the range start after the text of the run
                                    CTMarkupRange rangeStartMarkupRange = paragraph.getCTP().addNewCommentRangeStart();
                                    rangeStartMarkupRange.setId(newCTComment.getId());
                                    newCTComment.setCommentRangeStartArray(new CTMarkupRange[]{rangeStartMarkupRange});

                                    //TODO: remove styling
                                    runToInsert.setItalic(true);


                                } else if (z == indexOfTextThatNeedsToBeAnnotatedInSplittedText + 1) {


                                    //add the range end before the text of the run
                                    CTMarkupRange rangeEndeMarkupRange = paragraph.getCTP().addNewCommentRangeEnd();
                                    rangeEndeMarkupRange.setId(newCTComment.getId());

                                    newCTComment.setCommentRangeEndArray(new CTMarkupRange[]{rangeEndeMarkupRange});
                                    //newCTComment.setCommentRangeEndArray(new CTMarkupRange[]{rangeEndeMarkupRange});

                                    //insert new run
                                    XWPFRun runToInsert = paragraph.insertNewRun(runIndex + z);
                                    //insert part of the text of the run
                                    runToInsert.setText(splittedText[z]);

                                    //TODO: remove styling
                                    runToInsert.setUnderline(UnderlinePatterns.SINGLE);
                                }
                            }
                        }

                        //remove original run
                        paragraph.removeRun(runIndex + splittedText.length); //the  new runs are put in front of the old run

                        //Creating the new XWPFComment based on the CTComment
                        XWPFComment newComment = new XWPFComment(newCTComment, comments);
                        newComment.setAuthor("Teradactor");
                        newComment.setDate(new GregorianCalendar());
                        newComment.setInitials("TD");
                        newComment.createParagraph().createRun().setText(action.getAnnotationText());


                        comments.createComment(BigInteger.valueOf(Long.parseLong(newComment.getId())));


                        setParagraphIndex(paragraphToLookAt);
                        setInternalParagraphRunIndex(runIndex + 3); // the next time we want to start from the runs after th, since this one is already annotated
                        setActionFound(true);
                        //because we are splitting the run into several runs, we need to decrease the index of the tnsmap text

                        break;

                    }

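This approach successfully adds the comment reference after the word that needs to be annotated. I can also see that the text before and after the word is styled (italic or underlined) and that the word itself is bold. However, the word itself does not get a proper annotation reference. If one is needed, it would be good to know that, and how to set it.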

java ms-word apache-poi docx doc
1 answer

So your main question is:

How do I comment a single text part in an existing Word document (whether it is already a text run of its own or sits inside a longer text run)?

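That is very broad, too broad to answer here. To get a single text part as its own text run, see 'How to highlight replaced word using Apache POI'. Please also read 'Values "name" and "surname" not reading apache poi' and 'Apache POI: ${my_placeholder} is treated as three different runs', because they provide a bug fix for XWPFParagraph.searchText.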
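Before a run can be split into separate runs, its text has to be cut into the part before the match, the match itself, and the part after it. The following is a minimal JDK-only sketch of that step, not the approach from the linked answers; the class name RunTextSplitter and the method splitAroundFirstMatch are hypothetical, and it assumes the search text lies entirely inside one run.

```java
import java.util.Optional;

final class RunTextSplitter {

    /**
     * Splits runText around the first occurrence of search.
     * Returns [before, match, after]; "before" and "after" may be empty strings.
     * Returns an empty Optional if search is empty or does not occur in runText.
     */
    static Optional<String[]> splitAroundFirstMatch(String runText, String search) {
        if (search.isEmpty()) {
            return Optional.empty();
        }
        int start = runText.indexOf(search);
        if (start < 0) {
            return Optional.empty();
        }
        int end = start + search.length();
        return Optional.of(new String[] {
            runText.substring(0, start),   // text that stays uncommented, before the match
            runText.substring(start, end), // text that becomes its own run and gets the comment
            runText.substring(end)         // text that stays uncommented, after the match
        });
    }

    public static void main(String[] args) {
        String[] parts = splitAroundFirstMatch("Dear Julio, welcome", "Julio").get();
        System.out.println(String.join("|", parts)); // prints "Dear |Julio|, welcome"
    }
}
```

Each non-empty part would then become a run of its own (copying the formatting of the original run), and only the middle run needs to be commented.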

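Apart from the question of how to get a single text part as its own text run, the question of how to comment a single text run can be answered as follows: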

Each commented text run looks like this in word/document.xml:

...
<w:commentRangeStart w:id="1"/>
<w:r>
 <w:rPr>
 ...
 </w:rPr>
 <w:t>run text</w:t>
</w:r>
<w:commentRangeEnd w:id="1"/>
<w:r>
 <w:commentReference w:id="1"/>
</w:r>
...

There is a commentRangeStart before the text run and a commentRangeEnd after the text run, immediately followed by a text run that contains only the commentReference.
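The required element order can be tried out without POI. The following sketch uses only the JDK's DOM API on a hand-written paragraph fragment to show where the three pieces go relative to the commented run; it is an illustration of the markup, not how POI does it, and the class CommentMarkupSketch with its methods is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class CommentMarkupSketch {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Surrounds the given w:r element with comment markers carrying the given id. */
    static void wrapRun(Element run, String id) {
        Document doc = run.getOwnerDocument();
        Node parent = run.getParentNode();
        Node afterRun = run.getNextSibling(); // null if the run is the last child

        Element start = doc.createElementNS(W_NS, "w:commentRangeStart");
        start.setAttributeNS(W_NS, "w:id", id);
        parent.insertBefore(start, run);

        // insertBefore with a null reference node appends at the end
        Element end = doc.createElementNS(W_NS, "w:commentRangeEnd");
        end.setAttributeNS(W_NS, "w:id", id);
        parent.insertBefore(end, afterRun);

        // a run of its own that holds nothing but the comment reference
        Element refRun = doc.createElementNS(W_NS, "w:r");
        Element ref = doc.createElementNS(W_NS, "w:commentReference");
        ref.setAttributeNS(W_NS, "w:id", id);
        refRun.appendChild(ref);
        parent.insertBefore(refRun, afterRun);
    }

    /** Returns the local names of the element children of parent, in document order. */
    static String childOrder(Element parent) {
        StringBuilder sb = new StringBuilder();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                if (sb.length() > 0) sb.append(',');
                sb.append(n.getLocalName());
            }
        }
        return sb.toString();
    }

    /** Parses an XML string and returns its root element. */
    static Element parseParagraph(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse paragraph XML", e);
        }
    }

    public static void main(String[] args) {
        Element p = parseParagraph("<w:p xmlns:w=\"" + W_NS + "\">"
                + "<w:r><w:t>before </w:t></w:r>"
                + "<w:r><w:t>word</w:t></w:r>"
                + "<w:r><w:t> after</w:t></w:r></w:p>");
        Element target = (Element) p.getElementsByTagNameNS(W_NS, "r").item(1);
        wrapRun(target, "1");
        System.out.println(childOrder(p)); // prints "r,commentRangeStart,r,commentRangeEnd,r,r"
    }
}
```

The point to take away is that the range markers are siblings of the run inside the paragraph, not children of the run, and that they have to sit directly around the run they mark.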

To create this, we need to use the org.openxmlformats.schemas.wordprocessingml.x2006.main.* classes and native XML methods (such as org.apache.xmlbeans.XmlCursor), because Apache POI does not provide methods for doing this in XWPF.

A complete example, which simply comments every single text run to show that it works:

import java.io.*;

import org.apache.poi.xwpf.usermodel.*;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;
import org.apache.xmlbeans.XmlCursor;

import java.math.BigInteger;
import java.util.GregorianCalendar;
import java.util.Locale;

public class WordCommentTextRuns {

 //method to get or create the CommentsDocument /word/comments.xml in the *.docx ZIP archive  
 private static XWPFComments createCommentsDocument(XWPFDocument document) throws Exception {
  XWPFComments commentsDocument = null;
  //trying to get the CommentsDocument
  commentsDocument = document.getDocComments();
  //create a new CommentsDocument if there is not one already
  if (commentsDocument == null) {
   commentsDocument = document.createComments();  
   System.out.println("comments document created");
  }
  return commentsDocument;
 }

 //method to get the next comment Id from CTComments
 private static BigInteger getCommentId(CTComments comments) {
  BigInteger cId = BigInteger.ZERO;
  for (CTComment ctComment : comments.getCommentList()) {
   if (ctComment.getId().compareTo(cId) > 0) {
    cId = ctComment.getId();
   }
  }
  cId = cId.add(BigInteger.ONE);
  return cId;
 }

 //method to set CommentRangeStart before text run
 private static CTMarkupRange insertCommentRangeStartBefore(XWPFRun run) {
  String uri = CTMarkupRange.type.getName().getNamespaceURI();
  String localPart = "commentRangeStart";
  XmlCursor cursor = run.getCTR().newCursor();
  cursor.beginElement(localPart, uri);
  cursor.toParent();
  CTMarkupRange commentRangeStart = (CTMarkupRange)cursor.getObject();
  return commentRangeStart;  
 }
 
 //method to set CommentRangeEnd after text run
 private static CTMarkupRange insertCommentRangeEndAfter(XWPFRun run) {
  String uri = CTMarkupRange.type.getName().getNamespaceURI();
  String localPart = "commentRangeEnd";
  XmlCursor cursor = run.getCTR().newCursor();
  cursor.toEndToken();
  cursor.toNextToken();
  cursor.beginElement(localPart, uri);
  cursor.toParent();
  CTMarkupRange commentRangeEnd = (CTMarkupRange)cursor.getObject();
  return commentRangeEnd;  
 }

 //method to set CommentReference after CommentRangeEnd 
 private static void insertCommentReferenceAfter(CTMarkupRange commentRangeEnd, BigInteger cId) {
  String uri = CTR.type.getName().getNamespaceURI();
  String localPart = "r";
  XmlCursor cursor = commentRangeEnd.newCursor();
  cursor.toEndToken();
  cursor.toNextToken();
  cursor.beginElement(localPart, uri);
  cursor.toParent();
  CTR ctr = (CTR)cursor.getObject();
  ctr.addNewCommentReference().setId(cId);
 }

 //method to comment single text runs
 private static void commentTextRun(XWPFRun run, CTComments comments, String commentText) {
  CTComment ctComment;
  
  //comment for the run
  BigInteger cId = getCommentId(comments);
  ctComment = comments.addNewComment();
  ctComment.setAuthor("Axel Ríchter");
  ctComment.setInitials("AR");
  ctComment.setDate(new GregorianCalendar(Locale.US));
  ctComment.addNewP().addNewR().addNewT().setStringValue(commentText);
  ctComment.setId(cId);

  //set CommentRangeStart
  CTMarkupRange commentRangeStart = insertCommentRangeStartBefore(run);
  commentRangeStart.setId(cId);

  //set CommentRangeEnd and CommentReference
  CTMarkupRange commentRangeEnd = insertCommentRangeEndAfter(run);
  commentRangeEnd.setId(cId);
  insertCommentReferenceAfter(commentRangeEnd, cId); 
 }

 public static void main(String[] args) throws Exception {

  // try-with-resources closes the input stream and the document even if an exception occurs
  try (FileInputStream in = new FileInputStream("./WordDocument.docx");
       XWPFDocument document = new XWPFDocument(in)) {

   XWPFComments commentsDocument = createCommentsDocument(document);
   CTComments comments = commentsDocument.getCtComments();

   for (XWPFParagraph paragraph : document.getParagraphs()) {
    for (XWPFRun run : paragraph.getRuns()) {
     // simply comment each single text run to show that it works
     commentTextRun(run, comments, "Comment text");
    }
   }

   try (FileOutputStream out = new FileOutputStream("./WordDocumentWithComments.docx")) {
    document.write(out);
   }
  }

 }
}
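The ID allocation in getCommentId (highest existing ID plus one) can be checked in isolation. The following JDK-only sketch mirrors that logic on a plain list of IDs; the class CommentIdSketch and the method nextId are hypothetical stand-ins and do not touch CTComments.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CommentIdSketch {

    /** Returns the highest ID in existingIds plus one, or 1 if there are no IDs yet. */
    static BigInteger nextId(List<BigInteger> existingIds) {
        BigInteger max = BigInteger.ZERO;
        for (BigInteger id : existingIds) {
            // compareTo is only guaranteed to return a positive value, so test "> 0"
            if (id.compareTo(max) > 0) {
                max = id;
            }
        }
        return max.add(BigInteger.ONE);
    }

    public static void main(String[] args) {
        List<BigInteger> ids = Arrays.asList(
                BigInteger.ZERO, BigInteger.valueOf(5), BigInteger.valueOf(2));
        System.out.println(nextId(ids));                                   // prints "6"
        System.out.println(nextId(Collections.<BigInteger>emptyList()));   // prints "1"
    }
}
```

Taking the maximum rather than the number of comments keeps the new ID unique even when existing IDs are not contiguous, for example after comments were deleted in Word.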