How do I load a pre-trained Word2vec model file and reuse it?

Question

I want to use a pre-trained word2vec model, but I don't know how to load it in Python.

The file is a model file (703 MB). It can be downloaded here:
http://devmount.github.io/GermanWordEmbeddings/

Answer 1

To just load it:

import gensim

# Load pre-trained Word2Vec model.
model = gensim.models.Word2Vec.load("modelName.model")

Now you can train the model as usual. If you also want to be able to save it and retrain it several times, do the following:

# Continue training. `sentences` stands for your own list of tokenized sentences.
model.train(sentences, total_examples=len(sentences), epochs=model.epochs)

# If you don't plan to train the model any further, calling
# init_sims(replace=True) makes the model much more memory-efficient:
# it discards the original vectors and keeps only the normalized ones,
# which saves a lot of memory. Only do this once training is finished.
# Note: init_sims() is deprecated in gensim 4.x.
model.init_sims(replace=True)

# Save the model for later use.
# To load it again, call Word2Vec.load().
model.save("modelName.model")

Answer 2

Use KeyedVectors to load the pre-trained model.

from gensim.models import KeyedVectors

word2vec_path = 'path/GoogleNews-vectors-negative300.bin.gz'
w2v_model = KeyedVectors.load_word2vec_format(word2vec_path, binary=True)

Answer 3

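I was using the same model in my code, and since I couldn't load it, I asked the author about it. His answer was that the model has to be loaded in binary format: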

gensim.models.KeyedVectors.load_word2vec_format(w2v_path, binary=True)

That worked for me, and I think it should work for you too.
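In case it helps to see why `binary=True` matters: the original word2vec binary layout is a text header line (`vocab_size vector_size`), followed by one record per word consisting of the word, a space, and `vector_size` raw 32-bit floats. Below is a rough standard-library-only sketch that writes and reads such a file. The helper names and the toy vectors are hypothetical, little-endian floats are assumed, and this is not how gensim implements it. It is only meant to show that a text parser cannot make sense of these bytes.

```python
import os
import struct
import tempfile


def write_word2vec_binary(path, vectors):
    """Write {word: [floats]} in the word2vec C binary layout (sketch)."""
    dim = len(next(iter(vectors.values())))
    with open(path, "wb") as f:
        # Header line: vocabulary size and vector dimensionality as text.
        f.write(f"{len(vectors)} {dim}\n".encode("utf-8"))
        for word, vec in vectors.items():
            f.write(word.encode("utf-8") + b" ")
            f.write(struct.pack(f"<{dim}f", *vec))  # raw float32 values
            f.write(b"\n")


def read_word2vec_binary(path):
    """Read a word2vec C binary file into {word: tuple_of_floats}."""
    result = {}
    with open(path, "rb") as f:
        vocab_size, dim = map(int, f.readline().split())
        for _ in range(vocab_size):
            chars = bytearray()
            while True:
                ch = f.read(1)
                if ch == b"":
                    raise EOFError("unexpected end of file")
                if ch == b" ":
                    break
                if ch != b"\n":  # skip newline left over from previous record
                    chars.extend(ch)
            floats = struct.unpack(f"<{dim}f", f.read(4 * dim))
            result[chars.decode("utf-8")] = floats
    return result


# Hypothetical toy data, chosen to be exactly representable as float32.
path = os.path.join(tempfile.mkdtemp(), "toy.bin")
write_word2vec_binary(path, {"haus": [1.0, 0.5, -2.0], "katze": [0.25, 0.0, 4.0]})
vectors = read_word2vec_binary(path)
print(vectors["haus"])
```

For real files you would of course still use `load_word2vec_format(..., binary=True)`, which also handles large vocabularies efficiently.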

Answer 4

I had the same problem. I downloaded GoogleNews-vectors-negative300 from Kaggle, saved the file to my desktop and extracted it. Then I ran this code in Python and it worked fine:

from gensim.models import KeyedVectors
model = KeyedVectors.load_word2vec_format(r'C:/Users/juana/descktop/archive/GoogleNews-vectors-negative300.bin', binary=True)
