FashionFlow: Leveraging Diffusion Models for Dynamic Fashion Video Synthesis from Static Imagery

This repository contains the official code for 'FashionFlow: Leveraging Diffusion Models for Dynamic Fashion Video Synthesis from Static Imagery'. It includes the pre-trained checkpoint, the dataset, and results.

Abstract: Our study introduces a new image-to-video generator called FashionFlow to generate fashion videos. By utilising a diffusion model, we are able to create short videos from still fashion images. Our approach involves developing and connecting relevant components with the diffusion model, which results in the creation of high-fidelity videos that are aligned with the conditional image. The components include the use of pseudo-3D convolutional layers to generate videos efficiently. VAE and CLIP encoders capture vital characteristics from still images to condition the diffusion model at a global level. Our research demonstrates a successful synthesis of fashion videos featuring models posing from various angles, showcasing the fit and appearance of the garment. Our findings hold great promise for improving and enhancing the shopping experience for the online fashion industry.

Teaser

image

Requirements

  • Python 3.9
  • PyTorch 1.11+
  • TensorBoard
  • OpenCV (cv2)
  • transformers
  • diffusers

Model Specification

The model is implemented in PyTorch and loads pretrained weights for the VAE and CLIP encoders. The latent diffusion model (LDM) stacks a 1D convolutional layer with a 2D convolutional layer, forming a pseudo-3D convolution, and includes attention layers. The exact layers of the LDM are listed in model_structure.txt.
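To give a rough sense of why a pseudo-3D convolution is cheaper than a full 3D convolution, the sketch below counts learnable parameters for both. It is an illustration only, not the repository's code: the helper names, the 3x3 spatial and 3-frame temporal kernel sizes, the 320-channel width, and the ordering (2D spatial convolution followed by 1D temporal convolution) are all assumptions. See model_structure.txt for the real layer configuration.

```python
def conv_params(in_ch, out_ch, kernel_dims, bias=True):
    """Learnable parameters in one convolution layer with the given kernel shape."""
    weights = in_ch * out_ch
    for k in kernel_dims:
        weights *= k
    return weights + (out_ch if bias else 0)


def pseudo3d_params(in_ch, out_ch, k_spatial=3, k_temporal=3):
    """Assumed factorisation: a 2D conv over (H, W), then a 1D conv over frames."""
    spatial = conv_params(in_ch, out_ch, (k_spatial, k_spatial))
    temporal = conv_params(out_ch, out_ch, (k_temporal,))
    return spatial + temporal


def full3d_params(in_ch, out_ch, k_spatial=3, k_temporal=3):
    """A single 3D conv over (T, H, W) with the same receptive field."""
    return conv_params(in_ch, out_ch, (k_temporal, k_spatial, k_spatial))


if __name__ == "__main__":
    channels = 320  # hypothetical channel width, chosen only for illustration
    print("pseudo-3D:", pseudo3d_params(channels, channels))
    print("full 3D:  ", full3d_params(channels, channels))
```

Under these assumed sizes the factorised layer needs fewer than half the parameters of the full 3D layer, which is the general motivation for the pseudo-3D design.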

Installation

Clone this repository:

git clone https://github.com/1702609/FashionFlow
cd ./FashionFlow/

Install PyTorch and other dependencies:

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt

Dataset

Download the Fashion dataset by clicking on this link: [Fashion dataset]

Extract the files and place them in the fashion_dataset directory. The dataset should be organised as follows:

fashion_dataset
 test
  |-- 91-3003CN5S.mp4
  |-- 91BjuE6irxS.mp4
  |-- 91bxAN6BjAS.mp4
  |-- ...
 train
  |-- 81FyMPk-WIS.mp4
  |-- 91+bCFG1jOS.mp4
  |-- 91+PxmDyrgS.mp4
  |-- ...

You can also use your own dataset, provided it follows the same file and folder structure.
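Before training or running inference, it can help to confirm that the dataset directory matches the layout above. The following is a minimal sketch using only the Python standard library. The function name list_split_videos is hypothetical and is not part of this repository; the directory and split names are taken from the layout shown above.

```python
from pathlib import Path


def list_split_videos(root="fashion_dataset", splits=("train", "test")):
    """Return {split: sorted .mp4 file names}, raising if a split folder is missing."""
    root = Path(root)
    found = {}
    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        # Only .mp4 files directly inside the split folder are counted.
        found[split] = sorted(p.name for p in split_dir.glob("*.mp4"))
    return found
```

Calling list_split_videos() from the repository root returns the video names per split, so an empty list or a FileNotFoundError points to a misplaced or incomplete extraction.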

Pre-trained Checkpoint

Download the checkpoint by clicking on this link: [Pre-trained checkpoints]. Extract the files and place them in the checkpoint directory.

Inference

To run inference, execute python inference.py. The results are saved in the result directory.

Train

Before training, the images and videos must be projected into latent space so that training is efficient. Execute python project_latent_space.py; the resulting tensors are saved in the fashion_dataset_tensor directory.

Run python -m torch.distributed.launch --nproc_per_node=<number of GPUs> train.py to train the model. Checkpoints are saved periodically in the checkpoint directory. You can monitor training progress with the tensorboardX logs in video_progress, or view the generated .mp4 files in training_sample.
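For readers adapting the training script: torch.distributed.launch starts one process per GPU and passes each process its GPU index as a --local_rank argument, while the newer torchrun launcher sets a LOCAL_RANK environment variable instead. The sketch below shows one way a script could read either form. It is an assumption about how such a script might be written, not a description of what train.py does, and the helper name parse_local_rank is hypothetical.

```python
import argparse
import os


def parse_local_rank(argv=None):
    """Read the per-process GPU index from --local_rank, else from LOCAL_RANK."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--local_rank", "--local-rank",  # accept both spellings used by launchers
        type=int,
        default=int(os.environ.get("LOCAL_RANK", 0)),
    )
    # Ignore any other arguments the training script may define itself.
    args, _unknown = parser.parse_known_args(argv)
    return args.local_rank
```

The returned index would typically be used to select the device for that process, for example as the argument to torch.cuda.set_device.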

Comparison

image
