I am loading a language model from the Torch Hub (CamemBERT, a RoBERTa-based French language model) and using it to embed some sentences:
import torch

camembert = torch.hub.load('pytorch/fairseq', 'camembert.v0')
camembert.eval()  # disable dropout (or leave in train mode to finetune)

def embed(sentence):
    tokens = camembert.encode(sentence)
    # Extract all layers' features (layer 0 is the embedding layer)
    all_layers = camembert.extract_features(tokens, return_all_hiddens=True)
    embeddings = all_layers[0]
    return embeddings
# Here we see that the shape of the embedding depends on the number of tokens in the sentence
u = embed("Bonjour, ça va ?")
u.shape # torch.Size([1, 7, 768])
v = embed("Salut, comment vas-tu ?")
v.shape # torch.Size([1, 9, 768])
Now imagine that I want to compute the cosine similarity between the vectors (tensors, in our case) `u` and `v`:
cos = torch.nn.CosineSimilarity(dim=0)
cos(u, v)  # will throw an error since the shape of `u` differs from the shape of `v`
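The mismatch can be reproduced without downloading the model, using random tensors as hypothetical stand-ins for `u` and `v` with the shapes reported above:

```python
import torch

# Hypothetical stand-ins for the CamemBERT outputs, with the shapes above.
u = torch.randn(1, 7, 768)
v = torch.randn(1, 9, 768)

cos = torch.nn.CosineSimilarity(dim=0)
try:
    cos(u, v)
except RuntimeError as err:
    # PyTorch cannot broadcast size 7 against size 9 on the token axis.
    print("shape mismatch:", err)
```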
My question is: what is the best way to always get an embedding of the same shape for a sentence, regardless of its number of tokens?
I am thinking of computing the mean on axis=1, since axis 0 and axis 2 always have the same size:
cos = torch.nn.CosineSimilarity(dim=1)  # dim is now 1
u = u.mean(dim=1)
v = v.mean(dim=1)
cos(u, v).detach().numpy().item()  # works now and gives 0.7269
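The mean-pooling step above can be sketched self-containedly with random tensors in place of the real CamemBERT features (so nothing needs to be downloaded): after averaging over the token axis, both sentences have the same fixed shape and the similarity can be computed.

```python
import torch

# Hypothetical stand-ins for the embeddings of two sentences with
# different token counts (batch, tokens, hidden_size).
u = torch.randn(1, 7, 768)
v = torch.randn(1, 9, 768)

# Mean-pooling over dim=1 removes the token axis: both become [1, 768].
u_pooled = u.mean(dim=1)
v_pooled = v.mean(dim=1)

cos = torch.nn.CosineSimilarity(dim=1)
score = cos(u_pooled, v_pooled)  # tensor of shape [1], value in [-1, 1]
print(u_pooled.shape, score.item())
```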
However, I am afraid that taking the mean harms the embeddings!
I am not an expert, but why not use only the last layer? What is your purpose in keeping all the layers?
With the last layer, the size is a constant [1, 10, 768], which should let you do some computations. I have not yet tried using it to cluster sentences.
Let me know if this helped!
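Another common fixed-size representation (a sketch, not something from the answer above) is to keep only the last layer and take the embedding of the first token, the `<s>` start token that RoBERTa-style models prepend to every sentence; its shape does not depend on the token count. The list of random tensors below is a hypothetical stand-in for the output of `extract_features(..., return_all_hiddens=True)`:

```python
import torch

# Hypothetical stand-in for `camembert.extract_features(tokens,
# return_all_hiddens=True)`: one tensor per layer, each [1, n_tokens, 768].
all_layers = [torch.randn(1, 7, 768) for _ in range(13)]

last_layer = all_layers[-1]         # [1, n_tokens, 768]
sentence_vec = last_layer[:, 0, :]  # [1, 768], independent of n_tokens
print(sentence_vec.shape)
```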