How to get the intermediate layers' output of a pre-trained BERT model in the HuggingFace Transformers library?

Question

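(I am following this PyTorch tutorial on BERT word embeddings, in which the author accesses the intermediate layers of the BERT model.)

I want to access the last 4 layers of the BERT model for a single input token in TensorFlow 2, using HuggingFace's Transformers library. Because each layer outputs a vector of length 768, the last 4 layers concatenated have length 4*768=3072 (per token).

How can I implement this in TF / keras / TF2 to get the intermediate layers of the pre-trained model for an input token? (Later I will try to do the same for every token in a sentence, but for now one token is enough.)

I am using HuggingFace's BERT model: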

!pip install transformers
from transformers import (TFBertModel, BertTokenizer)

bert_model = TFBertModel.from_pretrained("bert-base-uncased")  # Automatically loads the config
bert_tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
sentence_marked = "hello"
tokenized_text = bert_tokenizer.tokenize(sentence_marked)
indexed_tokens = bert_tokenizer.convert_tokens_to_ids(tokenized_text)

print(indexed_tokens)
# prints [7592]

The output is a token id ([7592]), which should be the input to the BERT model.

Answer 1

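The third element of the BERT model's output is a tuple consisting of the output of the embedding layer and the hidden states of the intermediate layers. From the documentation: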

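hidden_states (tuple(tf.Tensor), optional, returned when config.output_hidden_states=True): Tuple of tf.Tensor (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden states of the model at the output of each layer plus the initial embedding outputs.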
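This requires config.output_hidden_states to be True. For bert-base-uncased this was reported to be the default, but in Transformers releases where it defaults to False the flag has to be enabled as described further below. With it enabled, the hidden states of the 12 intermediate layers can be accessed as follows: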

outputs = bert_model(input_ids, attention_mask)
hidden_states = outputs[2][1:]
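The hidden_states tuple now has 12 elements, one per layer from the first to the last, and each element is an array of shape (batch_size, sequence_length, hidden_size). So, for example, to access the hidden state of the third layer for the fifth token of every sample in the batch, use: hidden_states[2][:,4].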
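The indexing described above can be checked without loading a model. The following is a minimal sketch that uses random NumPy arrays as a stand-in for outputs[2]; the batch size, sequence length, and the name all_hidden_states are assumptions chosen for illustration, not values from the original post.

```python
import numpy as np

# Hypothetical sizes for illustration; 768 is the hidden size from the question.
batch_size, sequence_length, hidden_size = 2, 6, 768
rng = np.random.default_rng(0)

# Stand-in for outputs[2]: the embedding output followed by 12 layer outputs.
all_hidden_states = tuple(
    rng.standard_normal((batch_size, sequence_length, hidden_size))
    for _ in range(13)
)

# Drop the embedding output so that index 0 is the first encoder layer.
hidden_states = all_hidden_states[1:]

# Third layer (index 2), fifth token (index 4), for every sample in the batch.
third_layer_fifth_token = hidden_states[2][:, 4]

print(len(hidden_states))
print(third_layer_fifth_token.shape)
```

Because the slice [1:] shifts every index by one, hidden_states[2] is the same array as all_hidden_states[3].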

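Note that if the model you are loading does not return the hidden states by default, you can load its config with the BertConfig class and pass the output_hidden_states=True argument, like this: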

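from transformers import BertConfig, TFBertModel

# Load the config with hidden-state output enabled.
config = BertConfig.from_pretrained("name_or_path_of_model",
                                    output_hidden_states=True)

bert_model = TFBertModel.from_pretrained("name_or_path_of_model",
                                         config=config)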
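To get the 3072-dimensional vector the question asks for, the last four entries of hidden_states can be concatenated along the feature axis. The sketch below again uses random NumPy arrays in place of real model output; the sequence length of 3 and the token position are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the 12 layer outputs: batch of 1, hypothetical sequence of 3 tokens.
hidden_states = tuple(rng.standard_normal((1, 3, 768)) for _ in range(12))

# Concatenate the last four layers along the hidden dimension: (1, 3, 4 * 768).
last_four = np.concatenate(hidden_states[-4:], axis=-1)

# Vector for one token (sample 0, token position 1, chosen arbitrarily).
token_vector = last_four[0, 1]

print(last_four.shape)
print(token_vector.shape)
```

With real TensorFlow tensors, tf.concat(hidden_states[-4:], axis=-1) should play the same role as np.concatenate here.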
