Text + Sketch Compression via Text-to-Image Models

Implementation for Text + Sketch: Image Compression at Ultra Low Rates.

The following scripts loop through a dataset and write the results (reconstructed images, sketches, and captions) to the recon_examples/ folder.

  • eval_PIC.py: uses prompt inversion to transmit a prompt and generate reconstructions.
  • eval_PICS.py: uses prompt inversion + sketch to transmit a compressed sketch and prompt.

For example, python eval_PICS.py --data_root DATA_ROOT will run PICS, where the images are contained in the DATA_ROOT folder. See scripts/PICS.sh for example usage. Prior to running this script, you will need to either (a) train the NTC sketch model or (b) download the pre-trained ones into the models_ntc/ folder. Instructions for both can be found below.

The annotator directory is taken from the ControlNet repo, and the prompt_inversion directory is based on the Hard Prompts Made Easy repo.

Dataloaders

The dataloading assumes a PyTorch ImageFolder layout inside DATA_ROOT. See dataloaders.py for more details.
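
As a rough, non-authoritative illustration of that layout (the actual loading code lives in dataloaders.py and may use different transforms and splits), images sit in subfolders under DATA_ROOT and can be read with torchvision's ImageFolder:

    # Minimal sketch of reading a PyTorch ImageFolder layout; the real logic
    # is in dataloaders.py and may differ. ImageFolder expects at least one
    # subfolder, e.g. DATA_ROOT/images/0001.png, DATA_ROOT/images/0002.png, ...
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    dataset = datasets.ImageFolder(root="DATA_ROOT", transform=transforms.ToTensor())
    loader = DataLoader(dataset, batch_size=1, shuffle=False)

    for img, _ in loader:   # the folder label is unused for compression
        print(img.shape)    # e.g. torch.Size([1, 3, H, W])
        break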

Sketch NTC Models

A training script is provided in train_compressai.py, which is slightly modified from CompressAI's example training script. See scripts/train_sketch.sh for example usage. To generate sketch training data, apply one of the filters in annotator/ to the training images and structure the output folder to fit the CompressAI ImageFolder layout.
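
The snippet below is a minimal sketch of that preprocessing step, assuming the HEDdetector interface from ControlNet's annotator package (a uint8 RGB array in, a single-channel uint8 edge map out); the paths and train/test split are placeholders rather than the repo's exact pipeline:

    # Hedged sketch: turn training images into HED sketches for the NTC model.
    # CompressAI's ImageFolder dataset expects train/ and test/ subfolders.
    import os
    import cv2
    from annotator.hed import HEDdetector  # assumption: ControlNet-style interface

    apply_hed = HEDdetector()
    src_dir = "DATA_ROOT/train"      # original training images (placeholder path)
    dst_dir = "sketch_root/train"    # destination for the sketch dataset
    os.makedirs(dst_dir, exist_ok=True)

    for name in os.listdir(src_dir):
        img_bgr = cv2.imread(os.path.join(src_dir, name))
        if img_bgr is None:          # skip non-image files
            continue
        img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
        sketch = apply_hed(img_rgb)  # HED edge map, same spatial size as the input
        out_name = os.path.splitext(name)[0] + ".png"
        cv2.imwrite(os.path.join(dst_dir, out_name), sketch)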

Pre-trained NTC models for HED sketches, as well as the HED sketches generated from CLIC2021 that were used to train them, can be found here. To download them onto a remote server, run the following, then unpack the archives (a sketch follows the commands):

  • wget https://upenn.box.com/shared/static/g1fzf9ctn0qvdn9exjpp8mkqh7aja4gm -O trained_ntc_models.zip
  • wget https://upenn.box.com/shared/static/b90504o4k4onkicm8aal8fxkhltp2rnb -O HED_training_data.zip
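
A minimal sketch of unpacking the archives is given below; the destination paths and the internal layout of the zips are assumptions, so check that the checkpoints end up directly under models_ntc/ where eval_PICS.py looks for them:

    # Hedged sketch: extract the downloaded archives. If the zips contain a
    # top-level folder, move the .pt files so they sit directly in models_ntc/.
    import zipfile

    with zipfile.ZipFile("trained_ntc_models.zip") as zf:
        zf.extractall("models_ntc/")

    with zipfile.ZipFile("HED_training_data.zip") as zf:
        zf.extractall("data/")   # placeholder destination for the HED training data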

Dependencies

  • pytorch
  • compressai
  • diffusers
  • pytorch-lightning
  • opencv-python
  • einops
  • ftfy
  • sentence-transformers
  • accelerate
  • xformers
  • basicsr
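
As a quick, optional sanity check of the environment (the only assumption here is the mapping from the package names above to their import names), all of the following imports should succeed:

    # Environment sanity check; verifies imports only, not version compatibility.
    import torch                    # pytorch
    import compressai
    import diffusers
    import pytorch_lightning
    import cv2                      # opencv-python
    import einops
    import ftfy
    import sentence_transformers
    import accelerate
    import xformers
    import basicsr

    print(torch.__version__, "CUDA available:", torch.cuda.is_available())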

Notes

  • Since ControlNet was trained on uncompressed HED maps (the sketch) rather than the decompressed ones, setting the sketch rate too low can cause poor reconstructions for many image types.
  • In general, Text + Sketch is better at reconstructing landscape photos than photos of objects. Performance is highly dependent on the pre-trained ControlNet model used (here we use Stable Diffusion), but any improved ControlNet model released in the future can be easily integrated into the Text + Sketch setup.
  • Fine-tuning the models is currently in progress.

Citation

@inproceedings{lei2023text+sketch,
  title={Text + Sketch: Image Compression at Ultra Low Rates},
  author={Lei, Eric and Uslu, Yi\u{g}it Berkay and Hassani, Hamed and Bidokhti, Shirin Saeedi},
  booktitle={ICML 2023 Workshop on Neural Compression: From Information Theory to Applications},
  year={2023}
}

Issues

_pickle.UnpicklingError: invalid load key, 'v'.

When running "python eval_PICS.py --data_root DATA_ROOT", I got the following error message:

Traceback (most recent call last):
  File "D:\FSR\Text-Sketch\eval_PICS.py", line 170, in <module>
    saved = torch.load(f'models_ntc/OneShot_{args_ntc.model_name}_CLIC_HED_{args_ntc.dist_name_model}_lmbda{args_ntc.lmbda}.pt')
  File "D:\python\lib\site-packages\torch\serialization.py", line 795, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "D:\python\lib\site-packages\torch\serialization.py", line 1002, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

How can I solve this problem?
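
One way to narrow this down (a hedged diagnostic sketch, not a confirmed fix): an "invalid load key" from torch.load means the file's first bytes are not a pickle or zip archive, so it is worth checking whether the downloaded .pt file is a valid checkpoint rather than, say, an HTML or text page saved by the download; the path below is a placeholder:

    # Hedged diagnostic: a torch checkpoint is either a zip archive ('PK' header)
    # or a legacy pickle; readable text here suggests the download itself failed.
    import zipfile

    path = "models_ntc/checkpoint.pt"   # placeholder: the checkpoint being loaded
    with open(path, "rb") as f:
        head = f.read(64)
    print("zip archive:", zipfile.is_zipfile(path))
    print("first bytes:", head[:32])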

Error when loading model

Hi,

I want to run the script eval_PICS.py, but I get an error when loading the model at this line:

saved = torch.load(f'models_ntc/OneShot_{args_ntc.model_name}_CLIC_HED_{args_ntc.dist_name_model}_lmbda{args_ntc.lmbda}.pt')
It returns the error shown in a screenshot (not reproduced here).
How can I solve it?
