First, make sure PyTorch and the Transformers library are installed (for example via pip install torch transformers) and that the pretrained model is available; from_pretrained downloads it automatically on first use. Then create a BertTokenizer and a BertModel and load the pretrained weights.
import torch
from transformers import BertTokenizer, BertModel

# Load the pretrained tokenizer and model, then switch to evaluation
# mode so dropout is disabled during inference.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

# Tokenize a sample sentence, adding the [CLS] and [SEP] special tokens,
# and wrap the token ids in a batch dimension.
input_text = "This is a sample input."
input_tokens = tokenizer.encode(input_text, add_special_tokens=True)
input_tensor = torch.tensor([input_tokens])

# Run one untimed forward pass first; it absorbs one-time setup costs
# so the timed run below is more representative.
with torch.no_grad():
    model_output = model(input_tensor)
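As a quick sanity check before timing, you can inspect the shape of the output. For bert-base-uncased the hidden size is 768, so the last hidden state should have shape [1, sequence_length, 768]; note that older transformers releases return a plain tuple, in which case model_output[0] holds the same tensor:

# Last hidden state: [batch_size, seq_len, hidden_size].
# On older transformers versions use model_output[0] instead.
last_hidden = model_output.last_hidden_state
print(last_hidden.shape)  # torch.Size([1, seq_len, 768]) for bert-base-uncased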
import time

# Time a single forward pass.
start_time = time.time()
with torch.no_grad():
    model_output = model(input_tensor)
inference_time = time.time() - start_time

# Report throughput as megabytes of input processed per second.
input_size = input_tensor.element_size() * input_tensor.nelement()
print(f"Inference speed: {input_size / 1024 / 1024 / inference_time:.2f} MB/s")
If the reported figure is close to what you would expect on your hardware, the pretrained BERT model's inference speed in PyTorch is normal.