GAN-Transform-and-Project

project page | paper

Transforming and Projecting Images into Class-conditional Generative Networks
Minyoung Huh   Richard Zhang   Jun-Yan Zhu   Sylvain Paris   Aaron Hertzmann
MIT CSAIL   Adobe Research
ECCV 2020 (oral)


Given a pre-trained BigGAN and a target image (left), our method uses gradient-free BasinCMA to transform the image and find a latent vector that closely reconstructs it. Our method (top) fits the input image better than the baseline (bottom), which does not model image transformation and uses gradient-based Adam optimization. Finding an accurate solution to the inversion problem allows us to further fine-tune the model weights to match the target image without losing downstream editing capabilities. For example, our method allows for changing the class of the object (top row), compared to the baseline (bottom).


Our method first searches for a transformation to apply to the input target image. We then solve for the latent vector whose generated image closely resembles the object in the target image, using our proposed optimization method; this step is also referred to as "projection". The generative model can then be further fine-tuned to reconstruct the missing details that the original model could not generate. Finally, we can edit the image by altering the latent code or the class vector (e.g., changing the bus to a car).
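The projection step can be illustrated with a toy example. The sketch below is not the BasinCMA optimizer used in the paper: it swaps in a simple keep-the-best random search, and `toy_generator`, `mse`, and `project` are hypothetical names that do not exist in this repository. The "generator" is a fixed linear map standing in for BigGAN, so the example only shows the general idea of searching for a latent vector without gradients.

```python
import random


def toy_generator(z):
    """Hypothetical stand-in for a GAN generator: maps a 2-D latent to 4 'pixels'."""
    a, b = z
    return [a + b, a - b, 2.0 * a, 0.5 * b]


def mse(x, y):
    """Mean squared error between two equal-length lists of pixel values."""
    return sum((p - q) ** 2 for p, q in zip(x, y)) / len(x)


def project(target, steps=200, population=8, sigma=0.3, seed=0):
    """Gradient-free search for a latent whose output matches `target`.

    This is a simple keep-the-best random search, NOT BasinCMA.
    """
    rng = random.Random(seed)
    best_z = [0.0, 0.0]
    best_loss = mse(toy_generator(best_z), target)
    for _ in range(steps):
        for _ in range(population):
            # Perturb the current best latent and keep the candidate if it improves.
            cand = [v + rng.gauss(0.0, sigma) for v in best_z]
            loss = mse(toy_generator(cand), target)
            if loss < best_loss:
                best_z, best_loss = cand, loss
    return best_z, best_loss


# The target is produced by a known latent so that a perfect fit exists.
target = toy_generator([1.0, -0.5])
initial_loss = mse(toy_generator([0.0, 0.0]), target)
z_hat, final_loss = project(target)
print("initial loss:", initial_loss)
print("final loss:", final_loss, "recovered latent:", z_hat)
```

Because a candidate is accepted only when it lowers the loss, the reconstruction error never increases during the search. The real method additionally optimizes an image transformation and uses a perceptual loss, neither of which is modeled here.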

Prerequisites

The code was developed on

  • Ubuntu 18.04
  • Python 3.7
  • PyTorch 1.2.0

We use the BigGAN PyTorch port from Hugging Face's GitHub repository.

Getting Started

  • Install the Python dependencies
    First, install the correct PyTorch version for your machine, then install the remaining dependencies via

    pip install -r requirements.txt
  • Clone LPIPS

    The code uses a perceptual similarity loss, so the LPIPS repo must be cloned into this directory.

    git clone https://github.com/richzhang/PerceptualSimilarity

    The cloned repository should be at the following path; alternatively, you can set the path manually in init_paths.py

    ./GAN-Transform-and-Project/PerceptualSimilarity
  • Download encoder weights (optional)

    The demo code uses encoder weights to speed up optimization.
    Download the weights from Google Drive and place them in the nets/weights sub-directory.

    The encoder weights should be at the following path

    ./GAN-Transform-and-Project/nets/weights/encoder.ckpt
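The directory layout described above can be checked before running the demos. The following is a minimal sketch under stated assumptions: `check_layout` and `add_lpips_to_path` are hypothetical helpers, not part of this repository, and the real init_paths.py may set up paths differently.

```python
import sys
from pathlib import Path


def check_layout(repo_root="."):
    """Report whether the LPIPS clone and the optional encoder weights are in place."""
    root = Path(repo_root)
    return {
        "lpips": (root / "PerceptualSimilarity").is_dir(),
        "encoder": (root / "nets" / "weights" / "encoder.ckpt").is_file(),
    }


def add_lpips_to_path(repo_root="."):
    """Put the LPIPS clone on sys.path if it exists; return True on success.

    Assumption: this mirrors what setting the path manually might involve.
    """
    lpips_dir = Path(repo_root) / "PerceptualSimilarity"
    if not lpips_dir.is_dir():
        return False
    if str(lpips_dir) not in sys.path:
        sys.path.insert(0, str(lpips_dir))
    return True


# Run from the repository root to see which pieces are present.
print(check_layout("."))
```

A missing `encoder` entry is not an error, since the encoder weights are optional.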

Demo

We provide several demos for our project. If you do not wish to use the encoder, disable it in the demo you are running.

  • Jupyter demo: A JupyterLab/Jupyter Notebook demo is available in example.ipynb

  • Command line: To run our code from the command line

    CUDA_VISIBLE_DEVICES=$GPU_ID python demo.py --im=$PATH_TO_IMAGE

    To see all valid options, run python demo.py --help

  • Streamlit interactive demo:
    We provide an interactive demo using Streamlit.

    First pip install streamlit and then run

    CUDA_VISIBLE_DEVICES=$GPU_ID streamlit run st_interactive.py
    

    Navigate to the IP address printed in the terminal using your favorite browser.
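To run the command-line demo on several images, the single command shown above can be wrapped in a small script. This is a sketch, not part of the repository: `build_command` and `run_batch` are hypothetical names, the list of image suffixes is an assumption, and only the `--im` option documented above is used. By default it is a dry run that only prints the commands.

```python
import os
import subprocess
import sys
from pathlib import Path

# Assumption: which file types to treat as images; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_command(image_path):
    """Build the demo.py invocation for one image, using only the --im option."""
    return [sys.executable, "demo.py", f"--im={image_path}"]


def run_batch(image_dir, gpu_id="0", dry_run=True):
    """Collect (and optionally execute) one demo.py command per image in a folder."""
    # Same effect as prefixing the command with CUDA_VISIBLE_DEVICES=$GPU_ID.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    commands = []
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            cmd = build_command(path)
            commands.append(cmd)
            if not dry_run:
                subprocess.run(cmd, env=env, check=True)
    return commands


# Dry run: print the commands that would be executed for images in the current folder.
for cmd in run_batch(".", dry_run=True):
    print(" ".join(cmd))
```

Pass `dry_run=False` from the repository root to actually launch the demo for each image in sequence.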

Development

To get a glimpse of how to extend the work or invert images on your own generative model, take a look at demo.py

  • Optimization : We currently support GradientOptimizer, CMAOptimizer, and BasinCMAOptimizer. We also have experimental optimization methods using Nevergrad: NevergradOptimizer and NevergradHybridOptimizer. You can find the optimizers in optimizer.py.

  • Transformation : We support simple spatial affine transformations and various color transformations. A full list of transformation functions is available in transform_functions.py. The transformation optimization details are in transform.py.
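For readers unfamiliar with spatial affine transformations, the sketch below shows one common parameterization (uniform scale, rotation, translation) as a 2x3 matrix applied to a point. It is an illustration only: `affine_matrix` and `apply_affine` are hypothetical names, and the parameterization actually used in transform_functions.py may differ.

```python
import math


def affine_matrix(scale=1.0, angle_deg=0.0, tx=0.0, ty=0.0):
    """Build a 2x3 affine matrix from a uniform scale, a rotation, and a translation."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [
        [scale * c, -scale * s, tx],
        [scale * s, scale * c, ty],
    ]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


# Scale by 2 and shift by (1, -1); no rotation.
m = affine_matrix(scale=2.0, tx=1.0, ty=-1.0)
print(apply_affine(m, (1.0, 1.0)))
```

In the transformation search, parameters such as these are the quantities being optimized so that the object in the target image is positioned where the generator can reproduce it.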

Citation

If you found our work useful, please cite using

@inproceedings{huh2020ganprojection,
    title={Transforming and Projecting Images into Class-conditional Generative Networks},
    author={Huh, Minyoung and Zhang, Richard and Zhu, Jun-Yan
            and Paris, Sylvain and Hertzmann, Aaron},
    booktitle={ECCV},
    year={2020}
}
