Unable to load a model with gensim FastText

Question (votes: 0, answers: 1)

I'm having trouble loading a model with gensim.models.FastText.load().

Here is the code and the error I get:

from gensim.models import FastText

class FastTextModel:
    def __init__(self, model_path, dim=300):
        self.dim = dim
        self.model = FastText.load(model_path).wv

...

class GeneralModel:
    def __init__(self, config):
        if config["type"] == "fasttext":
            # path - path to model
            # dim -  dimension, here 300
            self.model = FastTextModel(config["path"], config["dim"])

  File "/project/preprocessing/pipeline.py", line 15, in __init__
    self.model_ru = GeneralModel(config["models"]["ru"])
  File "/project/models/nlp_models.py", line 101, in __init__
    self.model = FastTextModel(config["path"], config["dim"])
  File "/project/models/nlp_models.py", line 16, in __init__
    self.model = FastText.load(model_path).wv
  File "/usr/local/lib64/python3.6/site-packages/gensim/models/fasttext.py", line 936, in load
    model = super(FastText, cls).load(*args, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/gensim/models/base_any2vec.py", line 1244, in load
    model = super(BaseWordEmbeddingsModel, cls).load(*args, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/gensim/models/base_any2vec.py", line 603, in load
    return super(BaseAny2VecModel, cls).load(fname_or_handle, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/gensim/utils.py", line 423, in load
    obj._load_specials(fname, mmap, compress, subname)
  File "/usr/local/lib64/python3.6/site-packages/gensim/utils.py", line 453, in _load_specials
    getattr(self, attrib)._load_specials(cfname, mmap, compress, subname)
  File "/usr/local/lib64/python3.6/site-packages/gensim/utils.py", line 464, in _load_specials
    val = np.load(subname(fname, attrib), mmap_mode=mmap)
  File "/usr/local/lib64/python3.6/site-packages/numpy/lib/npyio.py", line 447, in load
    pickle_kwargs=pickle_kwargs)
  File "/usr/local/lib64/python3.6/site-packages/numpy/lib/format.py", line 738, in read_array
    array.shape = shape
ValueError: cannot reshape array of size 67239904 into shape (445446,300)

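I downloaded the model from a Google Drive folder. I thought the .npy files might have been corrupted in transit (they are large), so I downloaded each file separately (the model consists of 7 files), but that didn't help.

I also read that this error can sometimes be caused by incorrect decompression inside the 'load' method, but I am passing it files that are already uncompressed, so that doesn't apply to me either.

Any help would be greatly appreciated!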

python numpy gensim
1 answer
0 votes

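Where did the model originate? The gensim FastText.load() method only works with FastText models that were created and saved by gensim (via its .save() method). Such models are stored as a combination of a Python pickle and sibling .npy raw-array files (which hold the large arrays), and all of these files must be kept together.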
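As a quick sanity check on the sibling-array side, here is a small sketch that uses only the numbers quoted in the ValueError from the question. The helper name check_flat_size is made up for illustration and is not part of gensim or numpy.

```python
def check_flat_size(actual_size, rows, dim):
    """Compare a flat element count with the (rows, dim) shape the model expects."""
    expected = rows * dim
    return {
        "expected": expected,                     # values a (rows, dim) array needs
        "missing": expected - actual_size,        # values absent from the file
        "whole_rows": actual_size % dim == 0,     # does the file hold complete rows?
    }


# Numbers copied from the error message:
# "cannot reshape array of size 67239904 into shape (445446,300)"
report = check_flat_size(67239904, 445446, 300)
print(report)
```

The file holds roughly half of the 133633800 values that the shape requires, and not even a whole number of 300-dimensional rows. That would be consistent with an incomplete download, or with a .npy file that belongs to a different save than the main model file, although the sizes alone do not prove either.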

Models saved by Facebook's original FastText implementation are in a different format, which you can load with the load_facebook_model() utility function:

https://radimrehurek.com/gensim/models/fasttext.html#gensim.models.fasttext.load_facebook_model
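If it is unclear which of the two formats a file is in, the first few bytes usually give it away. The following is only a heuristic sketch: guess_model_format is a hypothetical helper, the magic number is the one the Facebook fastText source writes at the start of its .bin files (older .bin files may lack it), and the pickle check assumes an uncompressed pickle written with protocol 2 or later. The two demo files are fabricated stand-ins, not real models.

```python
import os
import pickle
import struct
import tempfile

# Magic number at the start of Facebook fastText .bin files
# (FASTTEXT_FILEFORMAT_MAGIC_INT32 in the fastText source).
FASTTEXT_MAGIC = 793712314


def guess_model_format(path):
    """Guess the on-disk format from the first four bytes (heuristic only)."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Assumption: the file was written on a little-endian machine.
    if len(head) == 4 and struct.unpack("<i", head)[0] == FASTTEXT_MAGIC:
        return "facebook-bin"
    if head[:1] == b"\x80":
        # Binary pickle protocols 2 and later start with the PROTO opcode 0x80.
        return "pickle"
    return "unknown"


# Demo on two tiny stand-in files created just for this example.
with tempfile.TemporaryDirectory() as tmp:
    fb_path = os.path.join(tmp, "fake_facebook.bin")
    with open(fb_path, "wb") as f:
        f.write(struct.pack("<i", FASTTEXT_MAGIC) + b"\x00" * 8)

    pk_path = os.path.join(tmp, "fake_gensim.model")
    with open(pk_path, "wb") as f:
        pickle.dump({"placeholder": True}, f, protocol=2)

    fb_guess = guess_model_format(fb_path)
    pk_guess = guess_model_format(pk_path)

print(fb_guess, pk_guess)
```

A "pickle" result points towards FastText.load(), a "facebook-bin" result towards load_facebook_model() or load_facebook_vectors().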

If you only need the vectors, which seems to be the case since you immediately take just the .wv property, you can also use the load_facebook_vectors() function:

https://radimrehurek.com/gensim/models/fasttext.html#gensim.models.fasttext.load_facebook_vectors

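(Also, I'm not sure why you wrap the loaded model in your own FastTextModel class that lets the caller specify the dimension. You can't change the dimension of an already-loaded model, so it would make more sense to read the existing vector_size from the model than to specify it externally.)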
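A minimal sketch of such a wrapper, assuming a gensim version that provides load_facebook_vectors(). The class name FastTextVectors and its methods are hypothetical and not taken from the question's project. The gensim imports sit inside the classmethods so that the demo at the bottom, which uses a stand-in object instead of a real model, runs without gensim installed.

```python
from types import SimpleNamespace


class FastTextVectors:
    """Thin wrapper that takes its dimensionality from the loaded vectors."""

    def __init__(self, keyed_vectors):
        self.kv = keyed_vectors
        # Read the size from the model instead of accepting it from the caller.
        self.dim = keyed_vectors.vector_size

    @classmethod
    def from_gensim(cls, model_path):
        # For models written by gensim's own .save().
        from gensim.models import FastText
        return cls(FastText.load(model_path).wv)

    @classmethod
    def from_facebook(cls, model_path):
        # For .bin files written by Facebook's fastText tool.
        from gensim.models.fasttext import load_facebook_vectors
        return cls(load_facebook_vectors(model_path))

    def check_dim(self, configured_dim):
        """Fail early if a configured dimension disagrees with the model."""
        if configured_dim != self.dim:
            raise ValueError(
                f"config says dim={configured_dim}, model has dim={self.dim}"
            )


# Demo with a stand-in for the loaded vectors (not a real gensim object).
wrapper = FastTextVectors(SimpleNamespace(vector_size=300))
wrapper.check_dim(300)  # passes silently
print(wrapper.dim)
```

If the config file still carries a "dim" entry, check_dim() turns a mismatch into an explicit error at load time instead of a silent inconsistency later.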
