How to use SciBERT with PyTorch; error when loading the model

Question (0 votes, 1 answer)

I am trying to use the pre-trained SciBERT model, namely scibert-scivocab-uncased, in the following way:

    !pip install pytorch-pretrained-bert

    import torch
    from pytorch_pretrained_bert import BertTokenizer, BertModel, BertForMaskedLM
    import logging
    import matplotlib.pyplot as plt

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    # Placeholder input: the original post did not show how tokenized_text was built.
    tokenized_text = tokenizer.tokenize("placeholder sentence")
    indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
    segments_ids = [1] * len(tokenized_text)
    tokens_tensor = torch.tensor([indexed_tokens])
    segments_tensors = torch.tensor([segments_ids])
    # This is the call that fails with the EOFError reported in this question.
    model = BertModel.from_pretrained('/Users/.../Downloads/scibert_scivocab_uncased-3.tar.gz')

and I get the following error:

    EOFError: Compressed file ended before the end-of-stream marker was reached

  1. I downloaded the file from the website (https://github.com/allenai/scibert).

  2. I converted it from "tar" to gzip.

Neither step made any difference.

Any hints on how to deal with this?

Thanks!

error-handling neural-network nlp tar word-embedding
1 Answer

In the newer version of pytorch-pretrained-BERT, i.e. transformers, you can load the pre-trained model as follows once you have unpacked the archive:

    from transformers import AutoModelForTokenClassification, AutoTokenizer

    model = AutoModelForTokenClassification.from_pretrained('/your/local/path/to/scibert_scivocab_uncased')
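
The `EOFError` in the question is the message Python's `gzip` module raises when a compressed stream stops before its end marker, which typically points to an incomplete or corrupted download rather than to a problem in the model code. Below is a minimal sketch, using only the standard library, of how one might check the archive and unpack it before pointing `from_pretrained` at the resulting directory. The helper names `gzip_is_complete` and `extract_archive` are hypothetical and introduced here only for illustration; the archive path and target directory are whatever applies on your machine.

```python
import gzip
import tarfile
import zlib
from pathlib import Path


def gzip_is_complete(archive_path, chunk_size=1 << 20):
    """Return True if the whole gzip stream decompresses without error.

    Hypothetical helper: it reads the file to the end, so a truncated
    download triggers the same EOFError seen in the question.
    """
    try:
        with gzip.open(archive_path, "rb") as fh:
            while fh.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        # EOFError: stream ended early; OSError/zlib.error: not gzip or corrupt.
        return False
    return True


def extract_archive(archive_path, target_dir):
    """Unpack a .tar.gz archive into target_dir and return that directory.

    Hypothetical helper: refuses to extract when the integrity check fails.
    """
    if not gzip_is_complete(archive_path):
        raise ValueError(
            f"{archive_path} looks truncated or corrupted; try downloading it again"
        )
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target)
    return target
```

If `gzip_is_complete` returns `False`, downloading the file again is probably the first thing to try. Once extraction succeeds, the directory that contains the unpacked model files is the path to pass to `from_pretrained`, as in the answer above.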
