Is there a way to load the existing GoogleNews word2vec vectors into gensim and fine-tune them on another corpus?

Question  Votes: 2  Answers: 1

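To fine-tune word2vec embeddings in gensim, the following code used to work with earlier versions: model = Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz', binary=True)

However, I now get a message saying that Word2Vec.load_word2vec_format is deprecated: "DeprecationWarning: Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead." When I instead load the vectors with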

model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz', binary=True)

and then try to fine-tune the model with the train method:

model.train(corpus2, total_examples=len(corpus2), epochs=10)

I get the following error: AttributeError: 'Word2VecKeyedVectors' object has no attribute 'train'
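For reference, the train call takes an iterable of tokenized sentences, and total_examples is the number of those sentences. The sketch below shows one way corpus2 could be shaped. The tokenize helper and the two sample documents are hypothetical and not part of the original question; case is kept as-is because the GoogleNews vocabulary is case-sensitive.

```python
import re


def tokenize(text):
    """Split a string into word tokens (hypothetical helper, keeps case)."""
    return re.findall(r"[A-Za-z0-9']+", text)


# Hypothetical raw documents standing in for the new corpus.
raw_docs = [
    "Fine-tuning word vectors on a new corpus.",
    "Each sentence becomes a list of tokens.",
]

# corpus2 is a list of token lists, so len(corpus2) is the sentence count
# that would be passed as total_examples.
corpus2 = [tokenize(doc) for doc in raw_docs]
total_examples = len(corpus2)
```

A real corpus would normally be streamed from disk and tokenized with a proper tokenizer; the list above only illustrates the expected shape.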

Is there any other way to load the existing GoogleNews W2V vectors into gensim and fine-tune them on another corpus?

python nlp gensim word2vec embedding
1 Answer
0
votes

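The deprecation of Word2Vec.load_word2vec_format is not the real problem here: the message already names its replacement, KeyedVectors.load_word2vec_format.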

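gensim.models.KeyedVectors.load_word2vec_format does not return a trainable model. It returns a KeyedVectors object that holds only the word vectors, and the train method exists only on a full gensim Word2Vec model. Passing the file name to the Word2Vec constructor does not help either, because the constructor's first argument is the training corpus, not a file of saved vectors. One approach that should work in gensim 3.x (an assumption based on the Word2VecKeyedVectors class name in the error) is to create a Word2Vec model, build its vocabulary from the new corpus, and then copy in the GoogleNews vectors for the overlapping words with intersect_word2vec_format:

from gensim.models import Word2Vec

# for Google Colab, if you haven't already downloaded the vectors
! wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"

# corpus2: the new corpus, a list of tokenized sentences (lists of str)
model = Word2Vec(size=300, min_count=1)  # gensim 3.x; the parameter is vector_size in gensim 4
model.build_vocab(corpus2)
# copy GoogleNews vectors for words already in the vocabulary; lockf=1.0 keeps them trainable
model.intersect_word2vec_format("GoogleNews-vectors-negative300.bin.gz", binary=True, lockf=1.0)
model.train(corpus2, total_examples=model.corpus_count, epochs=10)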
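One way to picture reusing pretrained vectors with a new corpus is as an intersection: words that appear in both the new vocabulary and the pretrained file start from the pretrained vector, while the remaining words keep a small random initialization. gensim's intersect_word2vec_format follows this idea, but the sketch below is not gensim's implementation. It is a plain-Python illustration, and seed_from_pretrained, the toy 3-dimensional vectors, and the initialization range are all assumptions made for the example.

```python
import random


def seed_from_pretrained(vocab, pretrained, dim, seed=0):
    """Return {word: vector}, copying pretrained vectors where available.

    Words missing from `pretrained` get a small random vector instead.
    Words that exist only in `pretrained` are not added to the result.
    """
    rng = random.Random(seed)
    vectors = {}
    for word in vocab:
        if word in pretrained:
            # Copy so that later updates do not modify the pretrained source.
            vectors[word] = list(pretrained[word])
        else:
            vectors[word] = [rng.uniform(-0.5, 0.5) / dim for _ in range(dim)]
    return vectors


# Hypothetical 3-dimensional "pretrained" vectors.
pretrained = {"king": [0.1, 0.2, 0.3], "queen": [0.2, 0.1, 0.4], "paris": [0.9, 0.8, 0.7]}
new_vocab = ["king", "queen", "gensim"]

vectors = seed_from_pretrained(new_vocab, pretrained, dim=3)
```

Here "king" and "queen" start from their pretrained values, "gensim" starts from a random vector, and "paris" is dropped because it never occurs in the new vocabulary. The same trade-off applies to fine-tuning in general: only words present in the new corpus are updated.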