I am trying to add some Hebrew text to a Word document. That works, but as soon as I add punctuation the text gets scrambled.
This is the code I run:
public static void main(String[] args) throws Exception {
    XWPFDocument document = new XWPFDocument();
    XWPFParagraph paragraph = document.createParagraph();
    paragraph.setAlignment(ParagraphAlignment.LEFT);
    // make RTL direction
    CTP ctp = paragraph.getCTP();
    CTPPr ctppr;
    if ((ctppr = ctp.getPPr()) == null) {
        ctppr = ctp.addNewPPr();
    }
    ctppr.addNewBidi().setVal(STOnOff.ON);
    XWPFRun run = paragraph.createRun();
    run.setText("שלום עולם !");
    // create the document in the specific path by giving it a name
    File newFile = new File("helloWorld.docx");
    // insert document to newFile
    try {
        FileOutputStream output = new FileOutputStream(newFile);
        document.write(output);
        output.close();
        document.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
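This is the "helloWorld.docx" that I get:
And this is how it should look:
In addition, I would like the whole document to be RTL (even if it contains bidirectional text), not just a specific paragraph.
Thanks for your help!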
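This is a well-known problem with bidirectional text. The exclamation mark and the space are not right-to-left characters in themselves, so where we need them to behave as right-to-left we have to mark them. The RIGHT-TO-LEFT MARK (RLM) is U+200F. See https://en.wikipedia.org/wiki/Bidirectional_text#Table_of_possible_BiDi_character_types.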
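To make the point about character types concrete, here is a minimal sketch that asks the JDK for the bidirectional class of a few characters. It only uses `Character.getDirectionality`; the class name `BidiClassDemo` and the helper `isStrongRtl` are made up for this illustration and have nothing to do with Apache POI.

```java
final class BidiClassDemo {

    /** Returns true if the character has a strong right-to-left class (R or AL). */
    static boolean isStrongRtl(char c) {
        byte d = Character.getDirectionality(c);
        return d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
            || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC;
    }

    public static void main(String[] args) {
        // Hebrew letter shin, space, exclamation mark, RIGHT-TO-LEFT MARK
        char[] samples = {'\u05E9', ' ', '!', '\u200F'};
        for (char c : samples) {
            System.out.printf("U+%04X strong RTL: %b%n", (int) c, isStrongRtl(c));
        }
    }
}
```

The Hebrew letter and the RLM are reported as strong right-to-left characters, while the space and the exclamation mark are not. That is why the neutral characters need an RLM next to them if they are to be treated as part of the right-to-left text.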
The following code works for me:
import java.io.FileOutputStream;

import org.apache.poi.xwpf.usermodel.*;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;

public class CreateWordRTLParagraph {

    public static void main(String[] args) throws Exception {
        XWPFDocument doc = new XWPFDocument();
        XWPFParagraph paragraph = doc.createParagraph();

        // Mark the paragraph as bidirectional (right-to-left).
        CTP ctp = paragraph.getCTP();
        CTPPr ctppr = ctp.getPPr();
        if (ctppr == null) ctppr = ctp.addNewPPr();
        ctppr.addNewBidi().setVal(STOnOff.ON);

        XWPFRun run = paragraph.createRun();
        // U+200F (RLM) follows the space and follows the exclamation mark.
        run.setText("שלום עולם \u200F!\u200F");

        FileOutputStream out = new FileOutputStream("WordDocument.docx");
        doc.write(out);
        out.close();
        doc.close();
    }
}
Note the \u200F marks after the space and after the exclamation mark.
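If you want to see what the marks change, the following sketch runs the Unicode bidirectional algorithm from the JDK (`java.text.Bidi`) over both strings and reports the resolved level of the exclamation mark. This is only an illustration: it is not what Word does internally, the left-to-right base direction is my assumption, and the class name `RlmLevelDemo` is made up.

```java
import java.text.Bidi;

final class RlmLevelDemo {

    /**
     * Resolved embedding level of the last '!' in the text,
     * assuming a left-to-right base direction (an assumption of this sketch).
     * Even levels are laid out left-to-right, odd levels right-to-left.
     */
    static int levelOfExclamation(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_LEFT_TO_RIGHT);
        return bidi.getLevelAt(text.lastIndexOf('!'));
    }

    public static void main(String[] args) {
        // "shalom olam" written with escapes so the source file encoding does not matter
        String hebrew = "\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD";
        String plain = hebrew + " !";
        String marked = hebrew + " \u200F!\u200F";

        System.out.println("without RLM: level " + levelOfExclamation(plain));
        System.out.println("with RLM:    level " + levelOfExclamation(marked));
    }
}
```

Without the marks the exclamation mark has no strong right-to-left neighbour on its right, so it falls back to the base direction. With an RLM on both sides it sits between two strong right-to-left characters and is resolved as right-to-left too.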
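If the lines of text come from a file, marking individual characters is not good practice. Instead, the whole line should be marked as right-to-left text. To do that, we can embed the line by putting U+202B RIGHT-TO-LEFT EMBEDDING (RLE) before it and U+202C POP DIRECTIONAL FORMATTING (PDF) after it.
Example: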
import java.io.File;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;

import org.apache.poi.xwpf.usermodel.*;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;

public class CreateWordRTLParagraphsFromFile {

    public static void main(String[] args) throws Exception {
        List<String> lines = Files.readAllLines(new File("HebrewTextFile.txt").toPath(), StandardCharsets.UTF_8);

        XWPFDocument doc = new XWPFDocument();
        for (String line : lines) {
            // One right-to-left paragraph per line of the input file.
            XWPFParagraph paragraph = doc.createParagraph();
            CTP ctp = paragraph.getCTP();
            CTPPr ctppr = ctp.getPPr();
            if (ctppr == null) ctppr = ctp.addNewPPr();
            ctppr.addNewBidi().setVal(STOnOff.ON);

            XWPFRun run = paragraph.createRun();
            // Embed the whole line: U+202B (RLE) before it, U+202C (PDF) after it.
            // Not U+202E (RLO), which would force every character, digits included, to run right-to-left.
            run.setText("\u202B" + line + "\u202C");
        }

        FileOutputStream out = new FileOutputStream("WordDocument.docx");
        doc.write(out);
        out.close();
        doc.close();
    }
}
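The embedding step can also be pulled out into a small helper so that it is not repeated wherever text is set. This is just a sketch in plain Java; the class name `RtlEmbedding` and the method `embedRtl` are hypothetical names of mine, not part of Apache POI.

```java
final class RtlEmbedding {

    /** U+202B RIGHT-TO-LEFT EMBEDDING */
    static final char RLE = '\u202B';
    /** U+202C POP DIRECTIONAL FORMATTING */
    static final char PDF = '\u202C';

    /** Wraps the line so that it is treated as embedded right-to-left text. */
    static String embedRtl(String line) {
        return RLE + line + PDF;
    }

    public static void main(String[] args) {
        String embedded = embedRtl("\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD !");
        // Print the code points of the first and last character to show the wrapping.
        System.out.printf("first: U+%04X, last: U+%04X%n",
                (int) embedded.charAt(0), (int) embedded.charAt(embedded.length() - 1));
    }
}
```

In the loop above this would be used as `run.setText(RtlEmbedding.embedRtl(line));`.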