AutoModelForSeq2SeqLM and AutoModelForCausalLM are both auto classes in the Hugging Face Transformers library for loading pretrained models. The main difference between them is the kind of task, and therefore the kind of architecture, each one targets: sequence-to-sequence (AutoModelForSeq2SeqLM) versus causal generation (AutoModelForCausalLM).
AutoModelForSeq2SeqLM is used for sequence-to-sequence tasks such as translation, summarization, or dialogue, where an encoder reads the input and a decoder produces a separate output sequence whose length is independent of the input's. AutoModelForCausalLM is used for generative tasks such as open-ended text continuation, where a decoder-only model repeatedly predicts the next token, so its output extends the input rather than replacing it.
Here are simple examples of using AutoModelForSeq2SeqLM and AutoModelForCausalLM:
# Example: a sequence-to-sequence model (T5, an encoder-decoder architecture)
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
input_text = "translate English to French: Hello, how are you?"  # T5 selects the task via a text prefix
input_ids = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(input_ids=input_ids)  # the decoder produces the target sequence from scratch
output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(output_text)
# prints the French translation, e.g. "Bonjour, comment ça va ?"
# Example: a causal language model (GPT-2, a decoder-only architecture)
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
input_text = "The quick brown fox"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
# GPT-2 has no pad token; reusing the EOS token silences the generate() warning
outputs = model.generate(input_ids=input_ids, max_length=50, pad_token_id=tokenizer.eos_token_id)
output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(output_text)
# prints the prompt followed by a continuation, e.g. "The quick brown fox jumped over the lazy dog. ..."