By Shigeng Wang, Chao Li, Yangyuxuan Kang, Jiawei Fan, Zhonghong Ou and Anbang Yao.
This repository is the official PyTorch implementation of "SliderQuant: Accurate Post-Training Quantization for LLMs", accepted to ICLR 2026.
SliderQuant (Sliding-layer Quantization) is a new learnable post-training quantization framework for LLMs, which consists of two key components:
- Inter-layer sliding quantization couples three types of sliding-window designs to address the varying quantization sensitivity of the shallow, intermediate, and deep layers of a pre-trained LLM.
- Intra-layer sliding quantization quantizes the layers inside the current sliding window in an incremental manner.
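The interplay of the two components can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual algorithm: the window size, stride, and `quantize_layer` hook are hypothetical placeholders.

```python
# Illustrative sketch of sliding-window PTQ scheduling.
# Window/stride values are hypothetical; `quantize_layer` is a
# placeholder for per-layer calibration, not code from this repo.

def sliding_windows(num_layers: int, window: int = 4, stride: int = 2):
    """Inter-layer sliding: overlapping windows over layer indices."""
    windows = []
    start = 0
    while start < num_layers:
        windows.append(list(range(start, min(start + window, num_layers))))
        if start + window >= num_layers:
            break
        start += stride
    return windows

def quantize_layer(idx: int) -> str:
    return f"q{idx}"  # stand-in for quantizing one transformer layer

def slider_quant(num_layers: int):
    """Intra-layer sliding: inside each window, quantize layers
    incrementally; already-quantized layers stay frozen."""
    quantized = {}
    for win in sliding_windows(num_layers):
        for idx in win:
            if idx not in quantized:  # earlier layers remain fixed
                quantized[idx] = quantize_layer(idx)
    return quantized
```

The overlap between consecutive windows is what lets later windows see (and compensate for) the quantization error already committed in earlier layers.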
The following checkpoints are planned for public release on Hugging Face:
| Model | Quantization | Hugging Face |
|---|---|---|
| Llama2-13B | W4A4 | SliderQuant-Llama2-13B-W4A4 |
| Llama2-13B | W2A16 | SliderQuant-Llama2-13B-W2A16 |
| Qwen2.5-14B | W4A4 | SliderQuant-Qwen2.5-14B-W4A4 |
| Qwen2.5-14B | W2A16 | SliderQuant-Qwen2.5-14B-W2A16 |
All checkpoints are available under IntelLabsChina/SliderQuant.
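For reference, the WxAy notation in the table means x-bit weights and y-bit activations. A back-of-the-envelope estimate of the weight-storage savings (parameter count is approximate; scale/zero-point overhead and activations are ignored):

```python
def weight_mem_gib(n_params: float, bits: int) -> float:
    """Approximate weight storage in GiB, ignoring quantization
    metadata such as scales and zero-points."""
    return n_params * bits / 8 / 1024**3

# Llama2-13B has roughly 13e9 parameters.
fp16 = weight_mem_gib(13e9, 16)  # ~24.2 GiB baseline
w4   = weight_mem_gib(13e9, 4)   # ~6.1 GiB at 4-bit weights
w2   = weight_mem_gib(13e9, 2)   # ~3.0 GiB at 2-bit weights
print(f"fp16={fp16:.1f} GiB, w4={w4:.1f} GiB, w2={w2:.1f} GiB")
```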
```bash
git clone https://github.com/genggng/sliderquant
mamba create -n sliderquant python=3.10 -y
mamba activate sliderquant
cd sliderquant
pip install -e .
```
- Create a folder and place the experimental configuration file inside, following this structure:

  ```
  sliderquant/
  ├── log-llama2
  │   └── llama2-w4a4
  │       └── config.yaml
  ```
- Edit `task_list.conf` to specify the `result_dir`:

  ```
  result_dir=configs/llama2-7b-w2a16
  result_dir=${exp_id}
  GPU_NUM=1
  port=29507
  THRESHOLD=0.05
  WAIT_MODE=true
  WAIT_INTERVAL=60
  ```

- Start training:

  ```bash
  ./auto_train_ddp.sh
  ```

- Edit `task_list.conf` to specify the `result_dir`:
  ```
  result_dir=configs/llama2-7b-w2a16
  GPU_NUM=1
  port=29507
  THRESHOLD=0.05
  WAIT_MODE=true
  WAIT_INTERVAL=60
  ```

- Run evaluation:
  ```bash
  ./auto_test_one.sh
  ```

If SliderQuant is useful in your research, please cite:
```bibtex
@inproceedings{wang2026sliderquant,
  title={SliderQuant: Accurate Post-Training Quantization for LLMs},
  author={Wang, Shigeng and Li, Chao and Kang, Yangyuxuan and Fan, Jiawei and Ou, Zhonghong and Yao, Anbang},
  booktitle={International Conference on Learning Representations},
  year={2026}
}
```

SliderQuant builds upon code from:
We are grateful to the authors and maintainers of both projects for making their amazing code public.