
DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing

Yujun Shi    Chuhui Xue    Jun Hao Liew    Jiachun Pan   
Hanshu Yan    Wenqing Zhang    Vincent Y. F. Tan    Song Bai




Disclaimer

This is a research project, NOT a commercial product.

News and Update

  • [Oct 23rd] Code and data of DragBench are released! Please check the README under "drag_bench_evaluation" for details.
  • [Oct 16th] Integrated FreeU when dragging generated images.
  • [Oct 3rd] Sped up LoRA training when editing real images (now only around 20 s on an A100!).
  • [Sept 3rd] v0.1.0 Release.
    • Enabled dragging diffusion-generated images.
    • Introduced a new guidance mechanism that greatly improves the quality of dragging results (inspired by MasaCtrl).
    • Enabled dragging images with arbitrary aspect ratios.
    • Added support for DPM-Solver++ (generated images).
  • [July 18th] v0.0.1 Release.
    • Integrated LoRA training into the user interface. No need to use a training script; everything can be conveniently done in the UI!
    • Optimized the user-interface layout.
    • Enabled using a better VAE for eyes and faces (See this).
  • [July 8th] v0.0.0 Release.
    • Implemented basic functionality of DragDiffusion.

Installation

It is recommended to run our code on an NVIDIA GPU under Linux; we have not yet tested other configurations. Currently, our method requires around 14 GB of GPU memory to run. We will continue to optimize memory efficiency.

To install the required libraries, simply run the following command:

conda env create -f environment.yaml
conda activate dragdiff
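After activating the environment, you can optionally confirm that the key packages resolved correctly. The helper below is a minimal sketch; the package names listed are our assumption, and environment.yaml remains the authoritative dependency list.

```python
import importlib.util

def missing_modules(modules):
    """Return the subset of module names that cannot be imported."""
    return [m for m in modules if importlib.util.find_spec(m) is None]

# Hypothetical package list; see environment.yaml for the real dependencies.
missing = missing_modules(["torch", "diffusers", "gradio"])
if missing:
    print("Missing packages:", ", ".join(missing))
```

If anything is reported missing, re-create the environment from environment.yaml rather than installing packages ad hoc.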

Run DragDiffusion

To begin, run the following command to launch the Gradio user interface:

python3 drag_ui.py

You may check the GIF above, which demonstrates the usage of the UI step by step.

Basically, the workflow consists of the following steps:

Case 1: Dragging Input Real Images

1) Train a LoRA

  • Drop your input image into the left-most box.
  • Input a prompt describing the image in the "prompt" field.
  • Click the "Train LoRA" button to train a LoRA on the input image.

2) Perform "drag" editing

  • Draw a mask in the left-most box to specify the editable region.
  • Click handle and target points in the middle box. You may reset all points by clicking "Undo point".
  • Click the "Run" button to run our algorithm. Edited results are displayed in the right-most box.
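For intuition about what "Run" does with the clicked points: conceptually, each handle point is nudged a small step toward its target, and the process repeats until the points arrive. In the actual algorithm this schedule drives a motion-supervision loss on diffusion latent features together with point tracking; the toy sketch below illustrates only the point-update schedule and is not the paper's implementation.

```python
def drag_step(handle, target, step=1.0):
    """Move one 2-D handle point a single small step toward its target."""
    dx, dy = target[0] - handle[0], target[1] - handle[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist <= step:          # close enough: snap to the target
        return target
    return (handle[0] + step * dx / dist, handle[1] + step * dy / dist)

def drag(handles, targets, step=1.0, max_iters=100):
    """Iteratively move all handle points until they reach their targets."""
    for _ in range(max_iters):
        handles = [drag_step(h, t, step) for h, t in zip(handles, targets)]
        if handles == list(targets):
            break
    return handles
```

In DragDiffusion, each such incremental move is realized by optimizing the latent so that features near the handle point shift toward the target, with the mask restricting which regions may change.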

Case 2: Dragging Diffusion-Generated Images

1) Generate an image

  • Fill in the generation parameters (e.g., positive/negative prompt, parameters under Generation Config & FreeU Parameters).
  • Click "Generate Image".

2) Perform "drag" editing on the generated image

  • Draw a mask in the left-most box to specify the editable region.
  • Click handle and target points in the middle box.
  • Click the "Run" button to run our algorithm. Edited results are displayed in the right-most box.
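For intuition on the FreeU parameters mentioned in the generation config: FreeU re-weights the U-Net's features during denoising, roughly amplifying backbone features (the b factors) and damping skip-connection features (the s factors). The sketch below is a toy illustration of the scaling idea only; the real method scales a subset of channels and filters skip features in the Fourier domain.

```python
def freeu_rescale(backbone, skip, b=1.1, s=0.9):
    """Toy FreeU-style re-weighting: boost backbone features, damp skip features.

    The real FreeU scales only part of the channels and applies a
    Fourier-domain filter to the skip features; this keeps just the
    scaling idea, treating features as flat lists of floats.
    """
    return [x * b for x in backbone], [x * s for x in skip]
```

Mild values (b slightly above 1, s slightly below 1) tend to trade off detail and structure; extreme values distort the generation.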

License

Code related to the DragDiffusion algorithm is under Apache 2.0 license.

BibTeX

If you find our repo helpful, please consider leaving a star or citing our paper :)

@article{shi2023dragdiffusion,
  title={DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing},
  author={Shi, Yujun and Xue, Chuhui and Pan, Jiachun and Zhang, Wenqing and Tan, Vincent YF and Bai, Song},
  journal={arXiv preprint arXiv:2306.14435},
  year={2023}
}

Contact

For any questions about this project, please contact Yujun ([email protected]).

Acknowledgement

This work is inspired by the amazing DragGAN. The LoRA training code is modified from an example in diffusers. Image samples are collected from Unsplash, Pexels, and Pixabay. Finally, a huge shout-out to all the amazing open-source diffusion models and libraries.

Common Issues and Solutions

  1. If you struggle to load models from Hugging Face due to internet constraints, please: 1) follow this links and download the model into the directory "local_pretrained_models"; 2) run "drag_ui.py" and select your pretrained model's directory in "Algorithm Parameters -> Base Model Config -> Diffusion Model Path".
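As a quick sanity check before pointing the UI at a local directory, you can verify that it looks like a diffusers-format checkpoint. The helper below assumes the standard diffusers pipeline layout, in which the root directory contains a model_index.json file; it is a heuristic, not part of the project's code.

```python
import os

def looks_like_diffusers_checkpoint(path):
    """Heuristic check: a diffusers pipeline directory has model_index.json
    at its root, alongside subfolders such as unet/ and vae/."""
    return os.path.isfile(os.path.join(path, "model_index.json"))
```

If the check fails, you most likely downloaded a single-file checkpoint rather than the full pipeline directory.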
