Running Qwen1.5 and AWQ-quantizing a Qwen model

码农世界 2024-05-23

1. Download the Qwen1.5 repo:

GitHub - QwenLM/Qwen1.5: Qwen1.5 is the improved version of Qwen, the large language model series developed by Qwen team, Alibaba Cloud.

2. Download the model weights (a scripted download sketch follows after the link):

https://huggingface.co/Qwen
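
If you would rather script the download than fetch it by hand, the huggingface_hub library can pull a checkpoint; this is only a sketch, and the repo id and local directory below are placeholders for whichever Qwen1.5 size you actually use:

from huggingface_hub import snapshot_download

# Sketch: pull one Qwen1.5 checkpoint into a local directory (pip install huggingface_hub first).
# repo_id and local_dir are examples -- substitute the model size and path you actually want.
snapshot_download(
    repo_id="Qwen/Qwen1.5-7B-Chat",
    local_dir="/data/models/Qwen1.5-7B-Chat",
)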

3. Install the required libraries (transformers, etc.) and mind the versions; the Qwen1.5 README asks for transformers >= 4.37.0.

4. Write your own run script, for example the one below (remember to give the model path as a full path):

from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda" # the device to load the model onto
model = AutoModelForCausalLM.from_pretrained(
    "Qwen1.5-7B-Chat",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen1.5-7B-Chat")
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)
generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# keep only the newly generated tokens by stripping the prompt ids from each sequence
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

5. Download AutoAWQ:

https://github.com/casper-hansen/AutoAWQ

6. Install it from source:

git clone https://github.com/casper-hansen/AutoAWQ

cd AutoAWQ

pip install -e .

7. Run the quantization script below, again with full paths:

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
model_path = 'Qwen1.5-7B-Chat'
quant_path = 'Qwen1.5-7B-Chat-AWQ-MALI'
# 4-bit weights, group size 128, GEMM kernels -- the usual AWQ settings
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }
# Load model
model = AutoAWQForCausalLM.from_pretrained(
    model_path, **{"low_cpu_mem_usage": True, "use_cache": False}
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# Quantize
model.quantize(tokenizer, quant_config=quant_config)
# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
print(f'Model is quantized and saved at "{quant_path}"')

8. Point the run script from step 4 at the newly generated quantized model directory and run it again; a sketch follows below.
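
A minimal sketch of that final step, assuming the quantized directory from step 7 is given as a full path and that your transformers install can load AWQ checkpoints (it relies on the autoawq kernels installed above):

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
quant_path = "Qwen1.5-7B-Chat-AWQ-MALI"  # full path to the directory saved in step 7

# transformers reads the AWQ quantization config saved in the directory and loads the 4-bit weights
model = AutoModelForCausalLM.from_pretrained(quant_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(quant_path)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language model."}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(device)
generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512)
generated_ids = [out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)]
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])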
