ComfyUI-AnimateAnyone-Evolved

An improved AnimateAnyone implementation that lets you use a pose image sequence and a reference image to generate stylized video.
The current goal of this project is to achieve the desired pose2video result at 1+ FPS on GPUs that are equal to or better than the RTX 3080! 🚀

Test2Show-ChunLi.mp4

Currently Supported

  • Please check the example workflows for usage. You can use the Test Inputs to generate exactly the same results that I show here. (I got the Chun-Li image from civitai.)
  • Supports different samplers & schedulers:
    • DDIM
      • 24-frame pose image sequence, steps=20, context_frames=24; takes 835.67 seconds to generate on an RTX 3080 GPU
        DDIM_context_frame_24.mp4
      • 24-frame pose image sequence, steps=20, context_frames=12; takes 425.65 seconds to generate on an RTX 3080 GPU
        DDIM_context_frame_12.mp4
    • DPM++ 2M Karras
      • 24-frame pose image sequence, steps=20, context_frames=12; takes 407.48 seconds to generate on an RTX 3080 GPU
        DPM++_2M_Karras_context_frame_12.mp4
    • LCM
      • 24-frame pose image sequence, steps=20, context_frames=24; takes 606.56 seconds to generate on an RTX 3080 GPU
        LCM_context_frame_24.mp4
      • Note:
        The pre-trained LCM LoRA for SD1.5 does not work well here, since the model was retrained from the SD1.5 checkpoint for a large number of steps. Training a new LCM LoRA is feasible, however.
    • Euler
      • 24-frame pose image sequence, steps=20, context_frames=12; takes 450.66 seconds to generate on an RTX 3080 GPU
        Euler_context_frame_12.mp4
    • Euler Ancestral
    • LMS
    • PNDM
  • Supports adding a LoRA
    • I added this in order to insert the LCM LoRA
  • Supports fairly long pose image sequences
    • Tested on my RTX 3080 GPU: it can handle pose image sequences of 120+ frames with context_frames=24
    • As long as the system can fit the whole pose image sequence inside a single tensor without exhausting GPU memory, the main parameter that determines GPU usage is context_frames, which is independent of the length of the pose image sequence.
  • The current implementation is adapted from Moore-AnimateAnyone.
    • I tried to break it down into as many modules as possible, so the workflow in ComfyUI closely resembles the original pipeline from the AnimateAnyone paper:
      _Example_Workflow_Other_Imgs\AA_pipeline.png
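
The note above about context_frames can be illustrated with a small sketch. This is not the project's actual scheduling code; the function name `context_windows` and the `overlap` parameter are hypothetical and used only for illustration. It assumes the sequence is processed in windows of at most context_frames frames, so the largest batch handled at once stays the same however long the sequence is.

```python
from typing import Iterator, Tuple


def context_windows(num_frames: int, context_frames: int,
                    overlap: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame ranges holding at most `context_frames` frames.

    Hypothetical helper: `overlap` is an illustrative parameter, not a
    documented option of this project.
    """
    if context_frames <= 0:
        raise ValueError("context_frames must be positive")
    if not 0 <= overlap < context_frames:
        raise ValueError("overlap must be in [0, context_frames)")
    stride = context_frames - overlap
    start = 0
    while start < num_frames:
        end = min(start + context_frames, num_frames)
        yield start, end
        if end == num_frames:
            break
        start += stride


def peak_window(num_frames: int, context_frames: int) -> int:
    """Largest number of frames processed together in one window."""
    return max(end - start
               for start, end in context_windows(num_frames, context_frames))


# The peak window size is bounded by context_frames, not by sequence length.
for total in (24, 120, 240):
    windows = list(context_windows(total, 24))
    print(total, len(windows), peak_window(total, 24))
```

Under this assumption a longer sequence only adds more windows (and more time), while the per-window memory footprint is set by context_frames.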

Roadmap

  • Implement the components (Residual CFG) proposed in StreamDiffusion (estimated speed-up: 2X)
    • Result:
      The generated result is not good enough when using the DDIM scheduler together with RCFG, even though it speeds up generation by about 4X.
      In StreamDiffusion, RCFG works with LCM, which could also be the case here, so it is kept in another branch for now.
  • Incorporate the implementations & pre-trained models from Open-AnimateAnyone & AnimateAnyone once they are released
  • Convert the model using stable-fast (estimated speed-up: 2X)
  • Train an LCM LoRA for the denoising UNet (estimated speed-up: 5X)
  • Train a new model using a better dataset to improve result quality (optional, we'll see if there is any need for me to do it ;)
  • Continuous research, always moving towards something better & faster 🚀
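
To put the roadmap estimates next to the 1+ FPS goal, here is a rough back-of-the-envelope sketch using the RTX 3080 timings listed earlier. It assumes, purely for illustration, that the three estimated speed-ups are independent and multiply together; the roadmap does not claim this, and the real gains may differ.

```python
import math

FRAMES = 24  # every benchmark above used a 24-frame pose sequence

# Generation times in seconds on an RTX 3080, as reported above.
TIMINGS_S = {
    "DDIM, context_frames=24": 835.67,
    "DDIM, context_frames=12": 425.65,
    "DPM++ 2M Karras, context_frames=12": 407.48,
    "LCM, context_frames=24": 606.56,
    "Euler, context_frames=12": 450.66,
}

# Estimated speed-ups taken from the roadmap items.
ROADMAP_ESTIMATES = {"Residual CFG": 2.0, "stable-fast": 2.0, "LCM LoRA": 5.0}


def fps(frames: int, seconds: float) -> float:
    """Frames generated per second of wall-clock time."""
    return frames / seconds


def required_speedup(frames: int, seconds: float, target_fps: float = 1.0) -> float:
    """Speed-up factor needed to reach `target_fps` from a measured run."""
    return target_fps * seconds / frames


# ASSUMPTION: the estimates stack multiplicatively. This is not stated in the roadmap.
combined = math.prod(ROADMAP_ESTIMATES.values())

for name, seconds in TIMINGS_S.items():
    need = required_speedup(FRAMES, seconds)
    print(f"{name}: {fps(FRAMES, seconds):.3f} FPS now, "
          f"needs {need:.1f}x for 1 FPS, "
          f"covered by stacked estimate: {combined >= need}")
```

For example, the DPM++ 2M Karras run (407.48 seconds for 24 frames) would need roughly a 17x speed-up to reach 1 FPS.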

Install (You can also use ComfyUI Manager)

  1. Clone this repo into Your_ComfyUI_root_directory\ComfyUI\custom_nodes\ and install the dependent Python packages:
    cd Your_ComfyUI_root_directory\ComfyUI\custom_nodes\
    
    git clone https://github.com/MrForExample/ComfyUI-AnimateAnyone-Evolved.git
    
    # requirements.txt lives inside the cloned repo
    cd ComfyUI-AnimateAnyone-Evolved
    pip install -r requirements.txt
    
    # If you get an error regarding diffusers, then run
    # (the quotes stop the shell from treating >= as a redirect):
    pip install --force-reinstall "diffusers>=0.26.1"
  2. Download pre-trained models:
    ./pretrained_weights/
    |-- denoising_unet.pth
    |-- motion_module.pth
    |-- pose_guider.pth
    |-- reference_unet.pth
    `-- stable-diffusion-v1-5
        |-- feature_extractor
        |   `-- preprocessor_config.json
        |-- model_index.json
        |-- unet
        |   |-- config.json
        |   `-- diffusion_pytorch_model.bin
        `-- v1-inference.yaml
    
    • Download the CLIP image encoder (e.g. sd-image-variations-diffusers) and put it under Your_ComfyUI_root_directory\ComfyUI\models\clip_vision
    • Download the VAE (e.g. sd-vae-ft-mse) and put it under Your_ComfyUI_root_directory\ComfyUI\models\vae
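
After downloading, the layout shown in step 2 can be checked with a short script. This is a minimal sketch, not part of the project: the helper name `missing_weights` is hypothetical, and the directory you pass in is an assumption about where you placed pretrained_weights.

```python
from pathlib import Path
from typing import List, Union

# Relative paths copied from the directory tree in step 2.
EXPECTED_FILES = [
    "denoising_unet.pth",
    "motion_module.pth",
    "pose_guider.pth",
    "reference_unet.pth",
    "stable-diffusion-v1-5/feature_extractor/preprocessor_config.json",
    "stable-diffusion-v1-5/model_index.json",
    "stable-diffusion-v1-5/unet/config.json",
    "stable-diffusion-v1-5/unet/diffusion_pytorch_model.bin",
    "stable-diffusion-v1-5/v1-inference.yaml",
]


def missing_weights(weights_dir: Union[str, Path]) -> List[str]:
    """Return the expected files that are not present under `weights_dir`."""
    root = Path(weights_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Example: check a pretrained_weights folder relative to the current directory.
for rel in missing_weights("./pretrained_weights"):
    print("missing:", rel)
```

An empty result means every file from the tree was found; the script does not verify file contents or the clip_vision and vae folders.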

Contributors

mrforexample, ctankep
