Finding the start and end of sentences in a paragraph with StanfordCoreNLP

Question

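I would like to know how to find the start and end positions of sentences within a paragraph using StanfordCoreNLP. Right now I am using DocumentPreprocessor to split the paragraph into sentences. Is it possible to get the start and end index of each sentence's actual position in the original text?

I am using code from another question asked here.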

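import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.Sentence; // newer CoreNLP releases name this helper class SentenceUtils
import edu.stanford.nlp.process.DocumentPreprocessor;

String paragraph = "My 1st sentence. “Does it work for questions?” My third sentence.";
Reader reader = new StringReader(paragraph);
DocumentPreprocessor dp = new DocumentPreprocessor(reader);
List<String> sentenceList = new ArrayList<String>();

// Each element yielded by the preprocessor is one sentence, as a list of tokens.
for (List<HasWord> sentence : dp) {
   // Join the tokens back into a single space-separated string.
   String sentenceString = Sentence.listToString(sentence);
   sentenceList.add(sentenceString);
}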

for (String sentence : sentenceList) {
   System.out.println(sentence);
}

Taken from "How can I split a text into sentences using the Stanford parser?"

Thank you.

Answer 1

The quick and dirty way is:

import edu.stanford.nlp.simple.*;

Document doc = new Document("My 1st sentence. “Does it work for questions?” My third sentence.");
for (Sentence sentence : doc.sentences()) {
  System.out.println(sentence.characterOffsetBegin(0) + " -- " + sentence.characterOffsetEnd(sentence.length() - 1));
}

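Otherwise, you can extract CharacterOffsetBeginAnnotation and CharacterOffsetEndAnnotation from each CoreLabel to get the token's offsets, and use those offsets to locate the token in the original text. A sentence then starts at the begin offset of its first token and ends at the end offset of its last token.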
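To illustrate the token-offset route, here is a minimal sketch in plain Java. It is not CoreNLP code: the `Token` class is a hypothetical stand-in for a CoreLabel, and the offsets are typed in by hand for a shortened example paragraph, assuming end offsets are exclusive. With real CoreLabels you would read the two offsets from the annotations named above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SentenceSpans {

    /** Hypothetical stand-in for a CoreLabel: a token plus its character offsets. */
    static final class Token {
        final String word;
        // With CoreNLP: token.get(CoreAnnotations.CharacterOffsetBeginAnnotation.class)
        final int begin;
        // With CoreNLP: token.get(CoreAnnotations.CharacterOffsetEndAnnotation.class)
        // Assumed to be exclusive (one past the token's last character).
        final int end;

        Token(String word, int begin, int end) {
            this.word = word;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Returns {begin, end}: the first token's begin and the last token's end. */
    static int[] span(List<Token> sentence) {
        if (sentence.isEmpty()) {
            throw new IllegalArgumentException("empty sentence");
        }
        Token first = sentence.get(0);
        Token last = sentence.get(sentence.size() - 1);
        return new int[] { first.begin, last.end };
    }

    public static void main(String[] args) {
        String paragraph = "My 1st sentence. My third sentence.";

        // Hand-written token offsets for the paragraph above (illustrative data only).
        List<List<Token>> sentences = new ArrayList<>();
        sentences.add(Arrays.asList(
                new Token("My", 0, 2), new Token("1st", 3, 6),
                new Token("sentence", 7, 15), new Token(".", 15, 16)));
        sentences.add(Arrays.asList(
                new Token("My", 17, 19), new Token("third", 20, 25),
                new Token("sentence", 26, 34), new Token(".", 34, 35)));

        for (List<Token> sentence : sentences) {
            int[] s = span(sentence);
            // substring(begin, end) recovers the sentence exactly as it appears in the original text.
            System.out.println(s[0] + " -- " + s[1] + ": " + paragraph.substring(s[0], s[1]));
        }
    }
}
```

Because the offsets index into the original string, `paragraph.substring(begin, end)` gives back the sentence with its original spacing and punctuation, which the space-joined token list does not.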

Answer 2

See https://www.programcreek.com/java-api-examples/?api=edu.stanford.nlp.ling.CoreLabel for examples that use CharacterOffsetEndAnnotation.
