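I followed the guidelines on how to train a neural coreference model with NeuralCoref. I now have a model, but I can't figure out how to use the coref model in SpaCy.
The following, shown in the manual, does not cover how to load a custom model: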
# Load your usual SpaCy model (one of SpaCy English models)
import spacy
nlp = spacy.load('custom-danish-spacy-model')
# Add neural coref to SpaCy's pipe
import neuralcoref
neuralcoref.add_to_pipe(nlp)
# You're done. You can now use NeuralCoref as you usually manipulate a SpaCy document annotations.
doc = nlp(u'A sentence in Danish. Another sentence in the same language.')
Edit: I tried placing the trained model (generated by running python -m neuralcoref.train.learn --train ./data/train/ --eval ./data/dev/) in the NeuralCoref cache folder and then running the code above. It gave the following error:
  return f(*args, **kwds)
/home/johan/Code/spacy-neuralcoref/venv/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: spacy.vocab.Vocab size changed, may indicate binary incompatibility. Expected 96 from C header, got 104 from PyObject
  return f(*args, **kwds)
Traceback (most recent call last):
  File "custom_model_test.py", line 5, in <module>
    neuralcoref.add_to_pipe(nlp)
  File "/home/johan/Code/spacy-neuralcoref/neuralcoref/neuralcoref/__init__.py", line 42, in add_to_pipe
    coref = NeuralCoref(nlp.vocab, **kwargs)
  File "neuralcoref.pyx", line 554, in neuralcoref.neuralcoref.NeuralCoref.__init__
  File "neuralcoref.pyx", line 947, in neuralcoref.neuralcoref.NeuralCoref.from_disk
  File "/home/johan/Code/spacy-neuralcoref/venv/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 355, in from_bytes
    data = srsly.msgpack_loads(bytes_data)
  File "/home/johan/Code/spacy-neuralcoref/venv/lib/python3.6/site-packages/srsly/_msgpack_api.py", line 29, in msgpack_loads
    msg = msgpack.loads(data, raw=False, use_list=use_list)
  File "/home/johan/Code/spacy-neuralcoref/venv/lib/python3.6/site-packages/srsly/msgpack/__init__.py", line 60, in unpackb
    return _unpackb(packed, **kwargs)
  File "_unpacker.pyx", line 199, in srsly.msgpack._unpacker.unpackb
srsly.msgpack.exceptions.ExtraData: unpack(b) received extra data.
Your problem has more to do with msgpack than with SpaCy. I suggest looking at "Unpacking msgpack from respond in python" to get a clearer idea of the root of the problem.
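To make the "received extra data" message more concrete, here is a rough analogy using only the standard library's json module rather than msgpack itself (so this is not the code path in the traceback, just the same kind of failure). A single-document decoder reads one complete object and then complains if bytes are left over, which is typically what happens when the loader expects a different layout than the one that was written. The helper name decode_single is made up for this illustration.

```python
import json


def decode_single(payload):
    """Decode exactly one JSON document.

    Returns (value, None) on success, or (None, message) when the decoder
    rejects the input, e.g. because data follows the first document.
    """
    try:
        return json.loads(payload), None
    except json.JSONDecodeError as err:
        return None, err.msg


# One complete document: decodes cleanly.
ok, ok_err = decode_single('{"a": 1}')

# Two documents back to back: the decoder finishes the first one and then
# finds leftover input, which it reports as an error.
bad, bad_err = decode_single('{"a": 1}{"b": 2}')

print(ok, ok_err)
print(bad, bad_err)
```

In the traceback above, the equivalent check is the one inside srsly's msgpack unpackb call, which suggests that the bytes handed to from_bytes are not a single msgpack object in the shape the loader expects.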
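In your case, the most likely cause is a format incompatibility between the code that stored the model and the code that loads it, but without the exact versions of the SpaCy and NeuralCoref code used to train and store the model, it's hard to say precisely.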
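Since the diagnosis depends on exact versions, a small script like the following could be run in both the training environment and the loading environment to compare them. It is only a sketch: it assumes Python 3.8 or newer for importlib.metadata (the traceback shows Python 3.6, where pip freeze would give the same information), the helper name installed_versions is made up here, and a package installed from a source checkout without metadata may show up as not installed.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # No installed distribution metadata found under this name.
            versions[name] = None
    return versions


# The packages that appear in the traceback above.
report = installed_versions(["spacy", "neuralcoref", "thinc", "srsly"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

If the two environments report different versions, that would be consistent with the RuntimeWarning about spacy.vocab.Vocab in the output above, which itself says it may indicate binary incompatibility.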