
cae-admm's Introduction

CAE-ADMM: Implicit Bitrate Optimization via ADMM-based Pruning in Compressive Autoencoders

Haimeng Zhao, Peiyuan Liao

Abstract

We introduce the ADMM-pruned Compressive AutoEncoder (CAE-ADMM), which uses the Alternating Direction Method of Multipliers (ADMM) to optimize the trade-off between distortion and efficiency in lossy image compression. Specifically, ADMM in our method promotes sparsity to implicitly optimize the bitrate, unlike the entropy estimators used in previous research. Experiments on public datasets show that our method outperforms the original CAE and some traditional codecs in terms of SSIM/MS-SSIM metrics, at reasonable inference speed.
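As a rough illustration of the ADMM-based pruning idea, the following sketch alternates a gradient step, a Euclidean projection onto a sparsity constraint, and a dual update. This is not the repository's implementation: the toy objective, projection rule, and hyperparameters are all assumptions chosen for the example.

```python
import numpy as np

def project_sparse(w, k):
    """Project onto the set of vectors with at most k nonzeros
    (keep the k largest magnitudes, zero the rest)."""
    z = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    z[idx] = w[idx]
    return z

def admm_prune(w, grad_f, k, rho=1.0, lr=0.1, steps=200):
    """Illustrative ADMM loop: minimize f(w) subject to w being k-sparse,
    via the augmented Lagrangian f(w) + (rho/2)||w - z + u||^2."""
    z = project_sparse(w, k)
    u = np.zeros_like(w)
    for _ in range(steps):
        # W-step: gradient descent on the augmented Lagrangian
        w = w - lr * (grad_f(w) + rho * (w - z + u))
        # Z-step: projection onto the sparsity constraint
        z = project_sparse(w + u, k)
        # dual update
        u = u + w - z
    return z  # sparse solution

# toy quadratic objective: f(w) = 0.5 * ||w - t||^2, gradient w - t
t = np.array([3.0, -0.5, 0.2, 2.0])
w_sparse = admm_prune(t.copy(), lambda w: w - t, k=2)
# keeps the two largest-magnitude entries, zeros the rest
```

In CAE-ADMM the same alternation is applied to the latent code of the autoencoder during training, so that sparsity (and hence bitrate) is optimized implicitly rather than through an entropy estimator.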

Paper & Citation

arXiv:1901.07196 [cs.CV]

If you use these models in your research, please cite:

@article{zhao2019cae,
  title={CAE-ADMM: Implicit Bitrate Optimization via ADMM-based Pruning in Compressive Autoencoders},
  author={Zhao, Haimeng and Liao, Peiyuan},
  journal={arXiv preprint arXiv:1901.07196},
  year={2019}
}

Model Architecture


The architecture of CAE-ADMM. "Conv k/spP" stands for a convolutional layer with a k × k kernel, a stride of s, and a reflection padding of P; "Conv Down" halves the height and width.
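The size bookkeeping behind these layers follows the standard convolution output formula; a small sketch (the helper name is hypothetical, not from the repo):

```python
def conv_out(size, k, s, p):
    """Spatial output size of a convolution:
    floor((size + 2*p - k) / s) + 1."""
    return (size + 2 * p - k) // s + 1

# a stride-2 layer such as "Conv 5/2p2" (k=5, s=2, P=2)
# halves the spatial dimensions, e.g. 64 -> 32:
assert conv_out(64, k=5, s=2, p=2) == 32
# a stride-1 layer with matching padding preserves them:
assert conv_out(128, k=3, s=1, p=1) == 128
```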

Performance

Comparison of different methods with respect to SSIM and MS-SSIM on the Kodak PhotoCD dataset. Note that Toderici et al. use an RNN structure instead of entropy coding, while CAE-ADMM (ours) replaces entropy coding with a pruning method.

Example

Comparison of the latent code for kodim21 at 0.3 bpp, before and after pruning. For clarity, zero values in the feature map (before normalization) are marked in black.

Acknowledgement

pytorch-msssim: the PyTorch implementation of MS-SSIM is from pytorch-msssim

huffmancoding.py: the Huffman coding implementation is from Deep-Compression-PyTorch

cae-admm's People

Contributors

haimengzhao, liaopeiyuan


cae-admm's Issues

Reproduce the quality result

Hi,
I have trained this model following the settings in your paper (batch size 32, on the BSDS dataset, 500 epochs, the lr decay, etc.), but I found that I cannot obtain the MS-SSIM results reported in your paper. I therefore used a subset of the UCF101 dataset as the training set, which improves performance, but the MS-SSIM result is still not satisfying: for example, I got MS-SSIM 0.951 at about 0.44 bpp. As you mention in your paper, models at different bit rates are obtained by fine-tuning the final layer of the encoder, whereas I trained every model from scratch by modifying the number of channels in the final layer of the encoder. I wonder whether this might cause the performance gap?

Another question about the compute_bpp function: I found that you use the theoretical lower bound of the entropy to represent the code length, which is a reasonable estimate. However, to compare against a traditional compression algorithm like JPEG, which uses Huffman coding, I think we would need the real code length after Huffman coding to calculate bpp for a fair comparison.
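The gap raised here can be illustrated on a toy symbol histogram: the Shannon-entropy lower bound versus the average length of an actual Huffman code. This is a self-contained sketch, not the repository's compute_bpp or huffmancoding.py, and the toy latent histogram is invented for the example.

```python
import heapq
import math
from collections import Counter

def entropy_bits(symbols):
    """Shannon entropy in bits/symbol: the lower bound on code length."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def huffman_lengths(symbols):
    """Code length per symbol from a standard Huffman tree."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # heap entries: (total count, unique tiebreak id, {symbol: depth})
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, a = heapq.heappop(heap)
        c2, _, b = heapq.heappop(heap)
        # merging two subtrees adds one bit to every member's code
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]

# toy histogram of quantized latent symbols (70% zeros after pruning)
quantized = [0] * 70 + [1] * 15 + [-1] * 10 + [2] * 5
counts = Counter(quantized)
lengths = huffman_lengths(quantized)
avg_huffman = sum(counts[s] * lengths[s] for s in counts) / len(quantized)
print(entropy_bits(quantized), avg_huffman)  # ~1.32 vs 1.45 bits/symbol
```

As the issue suggests, the entropy bound (~1.32 bits/symbol here) is always at or below the real Huffman average (1.45 bits/symbol), so entropy-based bpp slightly flatters any method compared against codecs measured by their actual coded length.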

Still another question, about the PSNR results, which are not reported in your paper. In the paper "Lossy Image Compression with Compressive Autoencoders", the trained model reaches a PSNR of 35 dB at 1 bpp, while my trained model only reaches 30.6 dB at a similar bit rate, which is a huge gap. It is true that PSNR has its limitations as an evaluation metric, but it is still an important aspect of evaluating a compression algorithm. Could you share the PSNR results of your trained model? I have built and trained several image compression models and found it really hard to improve PSNR, so I would like to know the reason.

Looking forward to your reply!
Gong
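For reference, the PSNR figures discussed in this issue follow the standard definition, 10·log10(MAX²/MSE). A minimal sketch for 8-bit images (the tiny arrays are toy data, not results from the model):

```python
import numpy as np

def psnr(orig, recon, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    # cast before subtracting to avoid uint8 wraparound
    mse = np.mean((orig.astype(np.float64) - recon.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

orig = np.array([[100, 150], [200, 50]], dtype=np.uint8)
recon = np.array([[102, 148], [199, 52]], dtype=np.uint8)
print(psnr(orig, recon))  # ~43.01 dB
```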

DATASET

What should I do if I want to use my own dataset?
