Lossy Image Compression using Hierarchical VAEs

This repository contains the authors' implementation of several deep hierarchical VAE-based methods for lossy image compression.
The code is under active development, and the API is subject to change.

Features

Progressive coding: our models learn a deep hierarchy of latent variables and compress/decompress images in a coarse-to-fine fashion. This feature comes from the hierarchical nature of ResNet VAEs.

Compression efficiency: our models are competitive in both rate-distortion performance (bpp vs. PSNR) and decoding speed.

Model Name	CPU* Enc.	CPU* Dec.	1080 ti Enc.	1080 ti Dec.	BD-rate*
`qres34m`	0.899s	0.441s	0.213s	0.120s	-3.95
`qarv_base`	0.880s	0.295s	0.211s	0.096s	-6.54

*Time is the latency to encode/decode a 512x768 image, averaged over the 24 Kodak images. Measured with plain PyTorch code (v1.13 + CUDA 11.7), i.e., without mixed precision, TorchScript, ONNX/TensorRT, etc.
*CPU is Intel 10700k.
*BD-rate is w.r.t. VTM 18.0, averaged on three common test sets (Kodak, Tecnick TESTIMAGES, and CLIC 2022 test set).
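
For readers unfamiliar with the BD-rate column, the following is a minimal sketch of how a Bjontegaard delta rate is commonly computed from two rate-distortion curves. It is not the evaluation code of this repository: the function name `bd_rate` is hypothetical, the cubic fit of log-rate against PSNR is the standard textbook variant, and the numbers in the usage lines are synthetic, not measured results.

```python
import numpy as np

def bd_rate(anchor_bpp, anchor_psnr, test_bpp, test_psnr):
    """Bjontegaard delta rate (%) of a test codec relative to an anchor codec.

    Negative values mean the test codec needs fewer bits for the same PSNR.
    """
    log_a = np.log(np.asarray(anchor_bpp, dtype=float))
    log_t = np.log(np.asarray(test_bpp, dtype=float))
    psnr_a = np.asarray(anchor_psnr, dtype=float)
    psnr_t = np.asarray(test_psnr, dtype=float)

    # Fit log-rate as a cubic polynomial of PSNR for each codec.
    poly_a = np.polyfit(psnr_a, log_a, 3)
    poly_t = np.polyfit(psnr_t, log_t, 3)

    # Integrate both fits over the PSNR interval shared by the two curves.
    lo = max(psnr_a.min(), psnr_t.min())
    hi = min(psnr_a.max(), psnr_t.max())
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    area_a = np.polyval(int_a, hi) - np.polyval(int_a, lo)
    area_t = np.polyval(int_t, hi) - np.polyval(int_t, lo)

    # Average log-rate difference, converted back to a percentage.
    avg_log_diff = (area_t - area_a) / (hi - lo)
    return float((np.exp(avg_log_diff) - 1.0) * 100.0)

# Synthetic example: the test codec uses 10% fewer bits at every PSNR point.
anchor_bpp = [0.2, 0.4, 0.8, 1.6]
psnr = [30.0, 33.0, 36.0, 39.0]
test_bpp = [0.9 * r for r in anchor_bpp]
print(f"BD-rate: {bd_rate(anchor_bpp, psnr, test_bpp, psnr):.2f} %")
```

In this synthetic case the two log-rate curves differ by a constant, so the result is -10% up to floating-point error. With real data, each codec contributes at least four (bpp, PSNR) points, and the result is averaged over the images or test sets of interest.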

Implemented Methods - Pre-Trained Models Available

Lossy Image Compression with Quantized Hierarchical VAEs [arXiv]
- Published at WACV 2023, Best Algorithms Paper Award
- [Abstract]: a 12-layer VAE model named QRes-VAE. Good compression performance.
- [Code]: lossy-vae/lvae/models/qres
QARV: Quantization-Aware ResNet VAE for Lossy Image Compression [arXiv]
- Technical report
- [Abstract]: improved version of QRes-VAE. Variable-rate, faster decoding, better performance.
- [Code]: lossy-vae/lvae/models/qarv
An Improved Upper Bound on the Rate-Distortion Function of Images
- [Abstract]: a 15-layer VAE model used to estimate the information rate-distortion function R(D) of images. Shows that -30% BD-rate w.r.t. VTM is theoretically achievable.
- [Code]: lossy-vae/lvae/models/rd

Install

Requirements:

Python
PyTorch >= 1.9 : https://pytorch.org/get-started/locally
tqdm : conda install tqdm
CompressAI : https://github.com/InterDigitalInc/CompressAI
timm >= 0.8.0 : https://github.com/huggingface/pytorch-image-models

Download and Install:

Download the repository.
Modify the dataset paths in lossy-vae/lvae/paths.py.
[Optional] pip install the repository in development mode:

cd /path/to/lossy-vae
python -m pip install -e .

Usage

Detailed usage is provided in each model's folder.

Prepare Datasets for Training and Evaluation

COCO

Download the COCO dataset "2017 Train images [118K/18GB]" from https://cocodataset.org/#download
Unzip the images anywhere, e.g., at /path/to/datasets/coco/train2017
Edit lossy-vae/lvae/paths.py such that

known_datasets['coco-train2017'] = '/path/to/datasets/coco/train2017'

Kodak, Tecnick TESTIMAGES, and CLIC

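python scripts/download-dataset.py --name kodak         --datasets_root /path/to/datasets
python scripts/download-dataset.py --name clic2022-test --datasets_root /path/to/datasets
python scripts/download-dataset.py --name tecnick       --datasets_root /path/to/datasets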

Then, edit lossy-vae/lvae/paths.py such that known_datasets['kodak'] = '/path/to/datasets/kodak', and similarly for other datasets.
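
After editing the paths, a quick sanity check can catch typos before training or evaluation starts. The snippet below is only a sketch under stated assumptions: `count_images` is a hypothetical helper, and the `known_datasets` dictionary here is a local stand-in that mirrors the entries described above rather than an import from lvae/paths.py.

```python
from pathlib import Path

# File extensions treated as images (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}

def count_images(root):
    """Return the number of image files under `root` (0 if the folder is missing)."""
    root = Path(root)
    if not root.is_dir():
        return 0
    return sum(
        1 for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )

# Hypothetical stand-in for the dictionary edited in lvae/paths.py.
known_datasets = {
    "coco-train2017": "/path/to/datasets/coco/train2017",
    "kodak": "/path/to/datasets/kodak",
}

for name, folder in known_datasets.items():
    n = count_images(folder)
    status = f"{n} images" if n > 0 else "missing or empty"
    print(f"{name}: {folder} ({status})")
```

A dataset reported as "missing or empty" usually means the path in the dictionary does not match where the images were unzipped or downloaded.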

License

TBD
