I just discovered a handy Hugging Face feature: if you run inference with the Transformers library, you can click "Use in Transformers" in the top-right corner of a model page to generate the basic loading code.
It auto-generates the initialization code.
Then write your own inference script:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def Ask(text):
    inputs = tokenizer(text, return_tensors="pt").to("cuda")
    # Optionally force the flash-attention SDPA backend:
    # with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
    generate_ids = model.generate(inputs.input_ids, max_length=100)
    print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])

tokenizer = AutoTokenizer.from_pretrained("path/to/downloaded/model")
# Without the next line, generation logs:
# ERROR - transformers.tokenization_utils - Using pad_token, but it is not set yet
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("path/to/downloaded/model").to("cuda")
# convert the model to BetterTransformer
model.to_bettertransformer()
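The commented-out sdp_kernel context manager in the script selects which backend PyTorch uses for scaled dot-product attention. The underlying op can be tried standalone on CPU; a minimal sketch, with arbitrary illustrative tensor shapes not tied to any particular model:

```python
import torch
import torch.nn.functional as F

# Arbitrary illustrative shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)

# Fused attention: softmax(q @ k^T / sqrt(head_dim)) @ v
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 2, 4, 8])
```

On CUDA, wrapping this call (or model.generate) in the sdp_kernel context restricts it to the flash-attention implementation, which is what BetterTransformer relies on for speedups.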