I am following the transformers pretrained-model example for xlm-roberta-large-xnli:

from transformers import pipeline
classifier = pipeline("zero-shot-classification",
                      model="joeddav/xlm-roberta-large-xnli")

I get the following error:
ValueError: Couldn't instantiate the backend tokenizer from one of: (1) a `tokenizers` library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
I am using transformers version '4.1.1'.
As of the Transformers v4.0.0 release, sentencepiece was removed as a required dependency. This means that "the standard transformers install can't work with tokenizers that depend on the SentencePiece library", which includes XLMRobertaTokenizer. However, sentencepiece can be installed as an extra dependency with

pip install transformers[sentencepiece]

or, if you already have transformers installed,

pip install sentencepiece
If you are in Google Colab, the following worked for me in a notebook cell:

!pip install transformers[sentencepiece]
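After installing, you can confirm that the optional dependency is actually importable before building the pipeline. This is a minimal sketch; sentencepiece_available is just an illustrative helper, not part of the transformers API:

```python
import importlib.util

# Illustrative helper (not a transformers function): report whether
# the optional sentencepiece package can be imported here.
def sentencepiece_available():
    return importlib.util.find_spec("sentencepiece") is not None

if not sentencepiece_available():
    print("sentencepiece missing -- run: pip install transformers[sentencepiece]")
```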
The standard transformers install can't work with tokenizers that depend on the SentencePiece library. You should install sentencepiece together with transformers:

pip install transformers[sentencepiece]

It is required for the slow versions of:

XLNetTokenizer, AlbertTokenizer, CamembertTokenizer, MBartTokenizer, PegasusTokenizer, T5Tokenizer, ReformerTokenizer, XLMRobertaTokenizer
Alternatively, you can set the use_fast=False parameter in AutoTokenizer.from_pretrained().
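A sketch of that alternative, assuming transformers is installed and the model can be fetched (the broad except is only so the snippet degrades gracefully offline or with missing dependencies):

```python
# use_fast=False asks AutoTokenizer for the slow (pure-Python)
# tokenizer, so no fast-tokenizer conversion -- and hence no
# conversion ValueError -- is attempted. Note the slow
# XLMRobertaTokenizer still imports sentencepiece itself.
try:
    from transformers import AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(
        "joeddav/xlm-roberta-large-xnli", use_fast=False)
    print(type(tokenizer).__name__)
except Exception:
    # transformers/sentencepiece not installed, or no network access
    tokenizer = None
```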