# GAN-Transform-and-Project
project page | paper
Transforming and Projecting Images into Class-conditional Generative Networks
Minyoung Huh Richard Zhang Jun-Yan Zhu Sylvain Paris Aaron Hertzmann
MIT CSAIL Adobe Research
ECCV 2020 (oral)
Given a pre-trained BigGAN and a target image (left), our method uses gradient-free BasinCMA to transform the image and find a latent vector that closely reconstructs it. Our method (top) fits the input image better than the baseline (bottom), which does not model image transformation and uses gradient-based Adam optimization. Finding an accurate solution to the inversion problem allows us to further fine-tune the model weights to match the target image without losing downstream editing capabilities. For example, our method allows changing the class of the object (top row), which the baseline (bottom) fails to do.
Our method first searches for a transformation to apply to the input target image. We then solve for the latent vector that closely resembles the object in the target image, using our proposed optimization method (also referred to as "projection"). The generative model can then be further fine-tuned to reconstruct the missing details that the original model could not generate. Finally, we can edit the image by altering the latent code or the class vector (e.g., changing the bus to a car).
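The pipeline above amounts to minimizing a reconstruction loss jointly over a latent vector and a transformation parameter. The sketch below illustrates that joint objective with a toy linear "generator" and a brightness-shift "transformation"; all names here are illustrative stand-ins, not the repo's API, and the actual method uses gradient-free BasinCMA rather than plain gradient descent.

```python
import numpy as np

def generator(z, W):
    """Toy linear 'generator' standing in for BigGAN: image = W @ z."""
    return W @ z

def transform(x, t):
    """Toy 'image transformation': a global brightness shift by t."""
    return x + t

def project(x_target, W, steps=500, lr=0.1, seed=0):
    """Jointly optimize latent z and transform parameter t to minimize
    0.5 * ||G(z) - T(x; t)||^2 by gradient descent."""
    z = np.random.default_rng(seed).standard_normal(W.shape[1])
    t = 0.0
    for _ in range(steps):
        residual = generator(z, W) - transform(x_target, t)
        z = z - lr * (W.T @ residual)   # d/dz of the squared error
        t = t - lr * (-residual.sum())  # d/dt of the squared error
    return z, t
```

In the paper, the transformation is spatial/color rather than a scalar shift, and the latent search is done with BasinCMA instead of gradient descent; this sketch only shows the shape of the joint objective.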
## Prerequisites
The code was developed on
- Ubuntu 18.04
- Python 3.7
- PyTorch 1.2.0
We use the BigGAN PyTorch port from HuggingFace's GitHub repository.
## Getting Started
- **Install the Python dependencies**

  First, install the correct PyTorch version for your machine, then install the remaining dependencies via:

  ```bash
  pip install -r requirements.txt
  ```
- **Clone LPIPS**

  The code uses a perceptual similarity loss, so clone the LPIPS repo into this directory:

  ```bash
  git clone https://github.com/richzhang/PerceptualSimilarity
  ```

  The path should be in the following format, or you can set the path manually in `init_paths.py`:

  ```
  ./GAN-Transform-and-Project/PerceptualSimilarity
  ```
- **Download encoder weights (optional)**

  The demo code uses encoder weights to speed up optimization. Download the weights from Google Drive and place them in the `nets/weights` sub-directory. The encoder weights should be at the following path:

  ```
  ./GAN-Transform-and-Project/nets/weights/encoder.ckpt
  ```
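The encoder speeds things up by providing an informed starting point for the latent search, rather than starting from random noise. A toy illustration of why that helps (the linear "generator" and pseudo-inverse "encoder" here are stand-ins, not the repo's models):

```python
import numpy as np

# Toy setup: the 'generator' is a fixed linear map, and the 'encoder'
# is its least-squares inverse (standing in for a learned encoder).
W = np.array([[1., 0.], [0., 1.], [1., 1.], [1., -1.]])

def generate(z):
    return W @ z

def encode(x):
    # Least-squares pseudo-inverse as a stand-in for the trained encoder.
    return np.linalg.lstsq(W, x, rcond=None)[0]

def loss(z, x_target):
    return 0.5 * np.sum((generate(z) - x_target) ** 2)

x_target = generate(np.array([0.7, -0.2]))
z_uninformed = np.zeros(2)      # uninformed initialization
z_encoder = encode(x_target)    # encoder-based initialization
# The encoder initialization starts optimization far closer to the target,
# so fewer optimization steps are needed.
```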
## Demo
We provide several demos for our project. If you do not wish to use the encoder, disable it appropriately in each demo.
- **Jupyter demo**: A Jupyter Lab/Notebook demo is available in `example.ipynb`.
- **Command line**: To run our code from the command line:

  ```bash
  CUDA_VISIBLE_DEVICES=$GPU_ID python demo.py --im=$PATH_TO_IMAGE
  ```

  To see all valid options, run:

  ```bash
  python demo.py --help
  ```
- **Streamlit interactive demo**: We provide an interactive demo using Streamlit. First, install Streamlit:

  ```bash
  pip install streamlit
  ```

  and then run:

  ```bash
  CUDA_VISIBLE_DEVICES=$GPU_ID streamlit run st_interactive.py
  ```

  Navigate to the posted IP address in your favorite browser.
## Development
To get a glimpse of how to extend this work or invert images with your own generative model, take a look at `demo.py`.
- **Optimization**: We currently support `GradientOptimizer`, `CMAOptimizer`, and `BasinCMAOptimizer`. We also have experimental optimization methods using Nevergrad: `NevergradOptimizer` and `NevergradHybridOptimizer`. You can find the optimizers in `optimizer.py`.
- **Transformation**: We support simple spatial affine transformations and various color transformations. A full list of transformation functions is available in `transform_functions.py`; the transformation optimization details are in `transform.py`.
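The `BasinCMAOptimizer` alternates a gradient-free outer search over latent initializations with inner gradient-based refinement. The sketch below captures that outer/inner structure with a much-simplified stand-in for CMA-ES (it only adapts the mean and a scalar step size, not a full covariance matrix); all names and details are illustrative, not the repo's API.

```python
import numpy as np

def basin_cma_sketch(loss, grad, dim, outer_iters=20, pop=8,
                     inner_steps=10, lr=0.1, seed=0):
    """Crude BasinCMA-style loop: sample latent initializations from a
    search distribution, refine each with a few gradient steps, then
    move the distribution toward the best refined samples."""
    rng = np.random.default_rng(seed)
    mean, sigma = np.zeros(dim), 1.0
    best_z, best_val = mean.copy(), loss(mean)
    for _ in range(outer_iters):
        samples = mean + sigma * rng.standard_normal((pop, dim))
        refined, scores = [], []
        for z in samples:
            for _ in range(inner_steps):   # inner gradient refinement
                z = z - lr * grad(z)
            refined.append(z)
            scores.append(loss(z))
        order = np.argsort(scores)
        elite = np.array(refined)[order[:pop // 2]]
        mean = elite.mean(axis=0)          # move the search distribution
        sigma *= 0.9                       # shrink it as search converges
        if scores[order[0]] < best_val:
            best_val = scores[order[0]]
            best_z = refined[order[0]]
    return best_z, best_val
```

Refining samples *before* scoring them is the key design choice: the outer search then ranks basins by the quality reachable after gradient descent, not by the raw starting point.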
## Citation
If you find our work useful, please cite:
```bibtex
@inproceedings{huh2020ganprojection,
  title={Transforming and Projecting Images to Class-conditional Generative Networks},
  author={Huh, Minyoung and Zhang, Richard and Zhu, Jun-Yan
          and Paris, Sylvain and Hertzmann, Aaron},
  booktitle={ECCV},
  year={2020}
}
```