This repo contains a Jupyter notebook that uses the GPTQ technique to quantize LLMs. The notebook includes an in-depth explanation with examples, which you can follow to quantize an LLM of your choice. For simplicity, I have quantized an open-source language model from Hugging Face called dlite-v2-355m.
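For readers new to quantization, here is a minimal sketch of the basic idea: mapping float weights to low-bit integer codes and back. Note that this is plain round-to-nearest quantization of a single weight group, not GPTQ itself (GPTQ additionally adjusts the remaining weights to compensate for the error each rounding step introduces), and it is not taken from the notebook. The function names `quantize_group` and `dequantize_group` and the sample weights are hypothetical, chosen only for illustration.

```python
def quantize_group(weights, bits=4):
    """Round-to-nearest quantization of one group of float weights.

    Hypothetical helper for illustration; this is NOT the GPTQ algorithm.
    Returns integer codes plus the scale and zero point needed to decode them.
    """
    qmax = 2 ** bits - 1                      # largest code, e.g. 15 for 4 bits
    w_min, w_max = min(weights), max(weights)
    # Guard against a constant group, where the range would be zero.
    scale = (w_max - w_min) / qmax if w_max > w_min else 1.0
    zero_point = round(-w_min / scale)        # integer code that represents 0.0
    codes = [min(qmax, max(0, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_group(codes, scale, zero_point):
    """Map integer codes back to approximate float weights."""
    return [(c - zero_point) * scale for c in codes]


# Made-up sample weights, quantized to 4 bits and restored.
weights = [-0.5, -0.1, 0.0, 0.27, 1.0]
codes, scale, zero_point = quantize_group(weights, bits=4)
restored = dequantize_group(codes, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, zero_point, max_error)
```

Each restored weight differs from the original by a small rounding error. Reducing the effect of that error on the model's outputs is what GPTQ is designed for.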
SujanNeupane42/LLM_Quantization