How can I convert a Chinese Penn Treebank (s-expression) to CoNLL format with Stanford CoreNLP?

Question

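I have a Chinese constituency treebank in Penn Treebank (s-expression) format, and I want to convert it to CoNLL format. I know that English data can be converted with Stanford CoreNLP using this command: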

java -mx1g edu.stanford.nlp.trees.ud.UniversalDependenciesConverter -treeFile treebank > treebank.conllu

I also know that Stanford CoreNLP can be pointed at its Chinese models with this command:

java -mx3g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -props StanfordCoreNLP-chinese.properties -file chinese.txt -outputFormat text

But when I run

java -mx3g -cp "*" edu.stanford.nlp.trees.ud.UniversalDependenciesConverter -props StanfordCoreNLP-chinese.properties -treeFile chtb_0001.nw > chtb_0001_nw.conllu

nothing changes: the tool still uses the English model instead of the Chinese one. I could not find more details on the Stanford CoreNLP home page, so I am asking for help on Stack Overflow.

Answer 1

I think this will work:

java -Xmx1g edu.stanford.nlp.trees.international.pennchinese.UniversalChineseGrammaticalStructure -treeFile ctb_example.txt -checkConnected -basic -keepPunct -conllx
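If the treebank is split across many files, the same command can be wrapped in a loop. The following is only a rough sketch, not something from the CoreNLP documentation: the function name convert_ctb_dir, the .txt and .conllx extensions, and the CORENLP_HOME and JAVA variables are all assumptions made for illustration. It also assumes the CoreNLP jars sit in one directory and passes them with -cp, which the command above leaves to the CLASSPATH.

```shell
#!/bin/sh
# Sketch: run the converter from the answer on every .txt tree file in a directory.
# Assumptions (not from the original post): the function name, the directory
# layout, the .txt/.conllx extensions, and the CORENLP_HOME and JAVA variables.
convert_ctb_dir() {
  in_dir=$1
  out_dir=$2
  java_bin=${JAVA:-java}             # override to pick a specific JVM
  corenlp_home=${CORENLP_HOME:-.}    # directory assumed to hold the CoreNLP jars

  mkdir -p "$out_dir" || return 1
  for tree_file in "$in_dir"/*.txt; do
    [ -e "$tree_file" ] || continue  # skip the literal pattern when nothing matches
    base=$(basename "$tree_file" .txt)
    # Same class and flags as the command above; only the paths vary.
    "$java_bin" -Xmx1g -cp "$corenlp_home/*" \
      edu.stanford.nlp.trees.international.pennchinese.UniversalChineseGrammaticalStructure \
      -treeFile "$tree_file" -checkConnected -basic -keepPunct -conllx \
      > "$out_dir/$base.conllx" || return 1
  done
}
```

With CORENLP_HOME set to the unpacked CoreNLP directory, a call such as convert_ctb_dir ctb_trees ctb_conllx would write one .conllx file per input file. The function stops at the first file the converter rejects, so a malformed tree does not pass unnoticed.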
