TreeTagger installs successfully but cannot open the .par files

Question (votes: 5, answers: 3)

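Does anyone know how to resolve this file-reading error in TreeTagger, a commonly used natural language processing tool for POS tagging, lemmatization, and chunking?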

alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english 
        reading parameters ...

ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.

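I did not run into any of the possible installation problems listed in http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/installation-hints.txt. I followed the instructions on the web page (http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/#Linux) and the installation appeared to complete correctly: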

alvas@ikoma:~$ mkdir treetagger
alvas@ikoma:~$ cd treetagger
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tree-tagger-linux-3.2.tar.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tagger-scripts.tar.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/install-tagger.sh
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/dutch-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/german-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/italian-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/spanish-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/french-par-linux-3.2-utf8.bin.gz

alvas@ikoma:~/treetagger$ sh install-tagger.sh 

Linux version of TreeTagger installed.
Tagging scripts installed.
German parameter file (Linux, UTF8) installed.
German chunker parameter file (Linux) installed.
French parameter file (Linux, UTF8) installed.
French chunker parameter file (Linux, UTF8) installed.
Italian parameter file (Linux, UTF8) installed.
Spanish parameter file (Linux, UTF8) installed.
Dutch parameter file (Linux, UTF8) installed.
Path variables modified in tagging scripts.

You might want to add /home/alvas/treetagger/cmd and /home/alvas/treetagger/bin to the PATH variable so that you do not need to specify the full path to run the tagging scripts.

But when I try to test the software, I get these errors:

alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english 
    reading parameters ...

ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.
alvas@ikoma:~/treetagger$ echo 'Das ist ein Test.' | cmd/tagger-chunker-german

ERROR: Can't open for reading: /home/alvas/treetagger/lib/german-chunker.par
aborted.

ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par
aborted.
    reading parameters ...

ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par
aborted.
installation nlp stemming pos-tagger lemmatization
3 Answers
5
votes

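I think there are two problems. First, the script name should include "-utf8", for example cmd/tagger-chunker-german-utf8, because you downloaded the UTF-8 data. Second, tagging and chunking each need their own data file. See the home page, which has the sections "Parameter files for PC" and "Chunker parameter files for PC": download the files from both sections, then run install-tagger.sh again.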
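
To see which parameter files are actually missing, it can help to list what exists under lib/ before re-running the installer. The following is only a sketch: the function name check_par_files is made up for illustration, and the file names in the example call are taken from the error messages in the question, not from the TreeTagger documentation.

```shell
# Hypothetical helper (not part of TreeTagger): report which parameter
# files are readable in a given lib directory.
# Usage: check_par_files LIB_DIR FILE...
check_par_files() {
    lib_dir=$1
    shift
    missing=0
    for name in "$@"; do
        if [ -r "$lib_dir/$name" ]; then
            echo "found:   $lib_dir/$name"
        else
            echo "missing: $lib_dir/$name"
            missing=1
        fi
    done
    # Exit status 0 if every file was found, 1 otherwise.
    return "$missing"
}

# Example call (file names copied from the error messages above):
# check_par_files "$HOME/treetagger/lib" english.par german.par german-chunker.par
```

Any file reported as missing either was never downloaded or was installed under a different name (such as a "-utf8" variant), which matches the two cases described above.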


0
votes

You wrote cmd/tree-tagger-english, but I think the correct path (where the parameter files are) is:

lib/tree-tagger-english


0
votes

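I had the same problem. I realized that the .par files I had downloaded for the languages I needed had not been extracted yet (they were still inside the .gz archives).

Make sure to extract them into the directory first, then try again.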
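
As an illustration, leftover archives can be found and unpacked with a short loop. This is a sketch under assumptions: the function name unpack_gz_here is hypothetical, and it only decompresses each .gz file next to its archive. It does not rename the result to the .par name that the tagging scripts look for, which install-tagger.sh appears to handle, judging by its output in the question.

```shell
# Hypothetical helper (not part of TreeTagger): decompress every .gz
# archive in a directory, leaving the original archives in place.
# Usage: unpack_gz_here DIR
unpack_gz_here() {
    dir=$1
    for archive in "$dir"/*.gz; do
        # If no .gz file exists, the glob stays literal; skip it.
        [ -e "$archive" ] || continue
        target=${archive%.gz}
        if [ -e "$target" ]; then
            echo "skip:     $target already exists"
        else
            gzip -dc "$archive" > "$target" && echo "unpacked: $target"
        fi
    done
}

# Example call (directory taken from the question):
# unpack_gz_here "$HOME/treetagger"
```

Writing through gzip -dc keeps the downloaded archive untouched, so install-tagger.sh can still be run again afterwards.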
