Replacing the embedding layer in a Keras model while keeping the rest of the architecture and weights unchanged

Question

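With the help of the gensim word2vec library, I built a Keras Sequential neural network for sentiment analysis on Twitter data. I create the embedding dictionary by training a word2vec model on a larger corpus of tweet texts and then use those embeddings in the Keras Embedding layer. This works fine.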

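However, in order to run the model several times on random datasets and get more robust results, I would like to pretrain the model on other data. From this pretraining I want to keep the model's architecture and weights, but fit a new embedding layer built on the new dataset.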

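In short: how can I load a pretrained Keras model with its architecture and weights, and then replace the embedding layer with a new embedding layer/embedding matrix created on the new dataset?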

This is how the model is created:

`from tensorflow import keras`
`from keras.preprocessing.text import Tokenizer`
`from keras.utils import pad_sequences`
`from keras.models import Sequential`
`from keras import layers`

`maxlen = 100`
`EMBEDDING_DIM = 100`
`embedding_layer = layers.Embedding(len(word_index) + 1,
                                EMBEDDING_DIM,
                                weights=[embedding_matrix],
                                input_length=maxlen,
                                trainable=True)`
`model = Sequential()`
`model.add(embedding_layer)`
`model.add(layers.Flatten())`
`model.add(layers.Dense(10, activation='relu'))`
`model.add(layers.Dense(3, activation='softmax'))`
`model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])`

I have tried loading the pretrained model like this:

`pretrained_model = keras.models.load_model("my_model")`
`weights = pretrained_model.get_weights()`
`model.set_weights(weights)`

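But I run into a problem with the number of weights in each layer. The number of weights in the loaded model and in the new model do not correspond, even though both models were created in the same way. The only difference is the embedding layer.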

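I am not sure whether the save and load functions cause the mismatch, or whether the problem lies in the word2vec embedding dictionary. The embeddings are created with the same vocabulary size.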
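
One possible way around the mismatch is to keep the new model's embedding matrix and copy only the remaining arrays from the pretrained model. The sketch below is not a confirmed fix. It assumes that the embedding matrix is the first array returned by `get_weights()` (which holds for the Sequential model above, where the Embedding layer comes first and has a single weight array) and that `maxlen` and `EMBEDDING_DIM` are identical in both models, so the Dense layers after `Flatten` have matching shapes. The helper name `merge_weights` and its parameter `n_embedding_arrays` are hypothetical.

```python
def merge_weights(pretrained_weights, new_weights, n_embedding_arrays=1):
    """Keep the new model's embedding arrays, take all other arrays
    from the pretrained model.

    Both arguments are lists as returned by Keras `model.get_weights()`.
    Only the `.shape` attribute of each array is inspected.
    """
    if len(pretrained_weights) != len(new_weights):
        raise ValueError(
            "different number of weight arrays: "
            f"{len(pretrained_weights)} vs {len(new_weights)}"
        )

    # Start with the embedding arrays of the new model (assumed to come first).
    merged = list(new_weights[:n_embedding_arrays])

    # Copy every remaining array from the pretrained model, checking shapes.
    for i in range(n_embedding_arrays, len(new_weights)):
        old, new = pretrained_weights[i], new_weights[i]
        if old.shape != new.shape:
            raise ValueError(
                f"shape mismatch at index {i}: {old.shape} vs {new.shape}"
            )
        merged.append(old)
    return merged


# Hypothetical usage with the two models from the question:
# model.set_weights(
#     merge_weights(pretrained_model.get_weights(), model.get_weights())
# )
```

If the shape check raises for an index other than the embedding, the two models differ in more than the embedding layer (for example a different `maxlen`), and the error message shows where.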

This is how the embedding matrix is created:

`model = Word2Vec(sentences=texts, max_final_vocab=1000)`


`tokenizer = Tokenizer(num_words=vocab_size)`
`tokenizer.fit_on_texts(df.iloc[:,7])`
`tokenized = tokenizer.texts_to_sequences(df.iloc[:,7])`
`word_index = tokenizer.word_index`

`EMBEDDING_DIM = 100`
`embedding_matrix = np.zeros((len(word_index) + 1, EMBEDDING_DIM))`
`for word, i in word_index.items():`
`  embedding_vector = embeddings_index.get(word)`
`  if embedding_vector is not None:`
`    embedding_matrix[i] = embedding_vector`
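
A likely source of the differing shapes is the row count `len(word_index) + 1`: the Keras `Tokenizer` stores every word it saw during fitting in `word_index`, and `num_words` is only applied when texts are converted to sequences. The matrix therefore grows with the dataset even when `num_words` is fixed. The sketch below shows one way to build a matrix with a fixed number of rows instead. It is written in plain Python for illustration; the function name `build_embedding_rows` is hypothetical, and `embeddings_index` is assumed to be a dict-like object mapping words to vectors, as in the loop above.

```python
def build_embedding_rows(word_index, embeddings_index, vocab_size, dim):
    """Return `vocab_size` rows of length `dim`.

    Row i holds the vector of the word with tokenizer index i. Row 0 stays
    zero (the Tokenizer never assigns index 0), and words whose index is
    >= vocab_size, or that have no vector, are left as zeros.
    """
    rows = [[0.0] * dim for _ in range(vocab_size)]
    for word, i in word_index.items():
        if i >= vocab_size:
            # Such indices are not produced by a Tokenizer created with
            # num_words=vocab_size, so they need no row.
            continue
        vector = embeddings_index.get(word)
        if vector is not None:
            rows[i] = list(vector)
    return rows


# Toy data (made up for illustration): two word indexes of different size.
small_index = {"good": 1, "bad": 2}
large_index = {"good": 1, "bad": 2, "meh": 3, "ok": 4, "wow": 5}
toy_vectors = {"good": [1.0, 2.0], "wow": [9.0, 9.0]}

rows_small = build_embedding_rows(small_index, toy_vectors, vocab_size=4, dim=2)
rows_large = build_embedding_rows(large_index, toy_vectors, vocab_size=4, dim=2)

# Both matrices have vocab_size rows, regardless of len(word_index).
print(len(rows_small), len(rows_large))
```

With this approach the Embedding layer would be created with `vocab_size` as its first argument rather than `len(word_index) + 1`, and the rows can be converted with `np.asarray(rows)` before being passed as `weights=[...]`.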
