Getting different embedding results with the bert-tiny-chinese model, even in eval mode


Here is my code:


import os
import sys
import numpy as np
import pandas as pd
from transformers import BertTokenizerFast, AutoModel
from time import time
import torch

input_data=[
'I am happy'
]
print(input_data)

tokenizer = BertTokenizerFast.from_pretrained('bert-base-chinese')
bert_model = AutoModel.from_pretrained('ckiplab/bert-tiny-chinese')
#bert_model = AutoModel.from_pretrained('bert-base-chinese')

_ = bert_model.eval()
_ = bert_model.to("cpu")

tokenized_data = tokenizer(input_data,
                           truncation=True,
                           padding='max_length',
                           max_length=128,
                           return_tensors='pt')

with torch.no_grad():
    bert_outputs = bert_model(
        input_ids=tokenized_data['input_ids'],
        token_type_ids=tokenized_data['token_type_ids'],
        attention_mask=tokenized_data['attention_mask']
    )
    embs_ = bert_outputs.pooler_output.cpu().detach().numpy().tolist()
    print(embs_)

When I switch to the bert-base-chinese model instead, there is no problem: I get the same result every time. Is there something about bert-tiny-chinese that could cause this, such as some random operation somewhere?
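One way to narrow this down (my own diagnostic sketch, not part of the original post) is to compare two forward passes on the same loaded model, and then compare against a freshly loaded second copy. With eval() and no_grad(), repeated passes on one model instance should be identical; if the results only differ between separately loaded copies, the variation would come from weights that are re-initialized at load time (transformers normally prints a "newly initialized" warning in that case) rather than from the forward pass itself.

import torch
from transformers import BertTokenizerFast, AutoModel

tokenizer = BertTokenizerFast.from_pretrained('bert-base-chinese')
inputs = tokenizer(['I am happy'], return_tensors='pt')

def embed(model):
    # eval() disables dropout; no_grad() avoids autograd overhead
    model.eval()
    with torch.no_grad():
        out = model(**inputs)
    return out.pooler_output, out.last_hidden_state

# Two passes on the same loaded model: these should match exactly.
model_a = AutoModel.from_pretrained('ckiplab/bert-tiny-chinese')
p1, h1 = embed(model_a)
p2, h2 = embed(model_a)
print('same model, two passes:', torch.allclose(p1, p2))

# A second, freshly loaded copy: if this differs, some weights
# are being (re)initialized at load time.
model_b = AutoModel.from_pretrained('ckiplab/bert-tiny-chinese')
p3, h3 = embed(model_b)
print('pooler_output equal across loads:    ', torch.allclose(p1, p3))
print('last_hidden_state equal across loads:', torch.allclose(h1, h3))

If last_hidden_state matches across loads but pooler_output does not, the pooler layer would be the part being re-initialized, and building the embedding from last_hidden_state (for example the [CLS] token or a mean over tokens) should then be deterministic.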

pytorch huggingface-transformers bert-language-model embedding inference