QSPARSE provides the open source implementation of the quantization and pruning methods proposed in "Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations". The library was developed to support the experiments reported in our paper and to demonstrate strong performance across them, including image classification, object detection, super resolution, and generative adversarial networks.
Full Precision:

    import torch.nn as nn

    net = nn.Sequential(
        nn.Conv2d(3, 32, 5),
        nn.ReLU(),
        nn.ConvTranspose2d(32, 3, 5, stride=2),
    )

Joint Quantization 8bit and Pruning 50%:

    import torch.nn as nn
    from qsparse import prune, quantize

    net = nn.Sequential(
        quantize(bits=8),  # input quantization
        quantize(prune(nn.Conv2d(3, 32, 5), 0.5), 8),  # weight pruning+quantization
        prune(sparsity=0.5),  # activation pruning
        quantize(bits=8),  # activation quantization
        nn.ReLU(),
        quantize(prune(nn.ConvTranspose2d(32, 3, 5, stride=2), 0.5), 8),
        quantize(bits=8),
    )

As the snippet above shows, our library provides a much simpler and more flexible software interface than existing solutions, e.g. torch.nn.qat. More specifically, our library is layer-agnostic and can work with any PyTorch module as long as its parameters can be accessed from its weight attribute, as is standard practice.
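To make the two operations in the snippet concrete, the following pure-Python sketch shows what 50% magnitude pruning and 8-bit fixed-point quantization do to a list of weights, and how both can be applied to any object that exposes a weight attribute. It is a generic illustration of these techniques under stated assumptions, not QSPARSE's implementation (which works with PyTorch modules). The names `ToyLayer`, `magnitude_prune`, `quantize_fixed_point`, `compress`, and the `decimal_bits` parameter are hypothetical and chosen only for this example.

```python
def magnitude_prune(weights, sparsity):
    """Zero the `sparsity` fraction of entries with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # number of entries to remove
    # Indices ordered by ascending |w|; the first k are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:k])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def quantize_fixed_point(values, bits, decimal_bits):
    """Round to a signed `bits`-bit fixed-point grid with `decimal_bits` fractional bits."""
    scale = 2 ** decimal_bits
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1  # signed integer range
    return [min(max(round(v * scale), lo), hi) / scale for v in values]


class ToyLayer:
    """Hypothetical stand-in for a layer; only its `weight` attribute is used."""

    def __init__(self, weight):
        self.weight = weight


def compress(layer, sparsity, bits, decimal_bits):
    """Prune, then quantize, the `weight` of any object that has one."""
    layer.weight = quantize_fixed_point(
        magnitude_prune(layer.weight, sparsity), bits, decimal_bits
    )
    return layer


layer = compress(ToyLayer([0.9, -0.05, 0.3, -0.7, 0.02, 0.11]), 0.5, 8, 6)
print(layer.weight)  # [0.90625, 0.0, 0.296875, -0.703125, 0.0, 0.0]
```

Because `compress` only touches the `weight` attribute, the same function applies to any layer type, which mirrors the layer-agnostic design described above.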
QSPARSE can be installed from PyPI:

    pip install qsparse

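As a quick sanity check after installation, the Python standard library can confirm that the package is importable. This is a minimal sketch: `is_installed` is a hypothetical helper name, and the import name `qsparse` is taken from the usage snippet above.

```python
import importlib.util


def is_installed(package_name):
    """Return True if `package_name` can be located on the current import path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Prints whether the top-level `qsparse` package can be found.
    print("qsparse importable:", is_installed("qsparse"))
```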
Documentation can be accessed from Read the Docs.
Examples of applying QSPARSE to different tasks are provided at qsparse-examples.
The development environment can be set up as follows (Python >= 3.6 is required):

    git clone https://github.com/mlzxy/qsparse
    cd qsparse
    make dependency
    pre-commit install

Feel free to raise an issue if you have any questions.
If you find this open source release useful, please cite it in your paper:
Zhang, X., Colbert, I., Kreutz-Delgado, K., & Das, S. (2021). Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations. arXiv preprint arXiv:2110.08271.
@article{zhang2021training,
title={Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations},
author={Zhang, Xinyu and Colbert, Ian and Kreutz-Delgado, Ken and Das, Srinjoy},
journal={arXiv preprint arXiv:2110.08271},
year={2021}
}