Comments (12)

ruizhaocv commented on May 27, 2024

Could you please post the command you ran and your results here? Then we can go through them and find the problem.

from motiondirector.

XiaominLi1997 commented on May 27, 2024

I am seeing the same problem.

  1. given:
    Prompt: A person is riding a bicycle past the Eiffel Tower.
    seed: 2023
    ckpt: ./outputs/train/train_2023-12-02T13-39-36/ (https://huggingface.co/Yhyu13/MotionDirector_LoRA)
    I got the following result; no person appears in the video.
    https://github.com/showlab/MotionDirector/assets/25433111/9e1903e3-d13d-4dfa-a774-9b45d55d364d

  2. given:
    Prompt: A person is riding a bicycle past the Eiffel Tower.
    seed: 7192280
    ckpt: ./outputs/train/train_2023-12-02T13-39-36/ (https://huggingface.co/Yhyu13/MotionDirector_LoRA)
    I got the following result, which is unclear.
    https://github.com/showlab/MotionDirector/assets/25433111/e2728118-33d1-4aa3-9e8b-9d6ff9b7a66d

ruizhaocv commented on May 27, 2024

Hi Xiaomin. Thanks for the feedback. How about the other checkpoints, such as the ones at https://github.com/showlab/MotionDirector#motiondirector-trained-on-a-single-video? Generally, using the same seed as listed in the README should reproduce the result shown there.

XiaominLi1997 commented on May 27, 2024

> Hi Xiaomin. Thanks for the feedback. How about the other checkpoints, such as the ones at https://github.com/showlab/MotionDirector#motiondirector-trained-on-a-single-video? Generally, using the same seed as listed in the README should reproduce the result shown there.

Yep, the results from the checkpoints trained on a single video match the ones shown. Thanks.

ruizhaocv commented on May 27, 2024

Nice. Maybe I mixed up the checkpoints for the bicycle-riding example. I will check that.

XiaominLi1997 commented on May 27, 2024

> Nice. Maybe I mixed up the checkpoints for the bicycle-riding example. I will check that.

Hi, I found a new problem when training on a single video (training prompt: "A person is skateboarding.").

I used the same seed=6668889 and prompt="A panda is skateboarding." for both training and inference.

  1. Sampling a video during training with ckpt-300:
    the result is pretty good.
    https://github.com/showlab/MotionDirector/assets/25433111/a35240a6-4b65-41d8-906c-b15f1f300741

  2. Sampling a video during inference with ckpt-300:
    the result is bad.
    A_panda_is_skateboarding_6668889.mp4

Could you please check the inference code, or help find the cause (maybe the hyperparameters)? My co-worker and I both ran into this problem.

Inference hyper-parameters I used:
"args": [ "--model", "/15764332239/pretrained_models/text-to-video-ms-1.7b", "--prompt", "A panda is skateboarding.", "--checkpoint_folder", "./outputs/train/skateboard-single-video", "--checkpoint_index", "300", "--noise_prior", "0.5", "--seed", "6668889" ],

Training hyperparameters:
`pretrained_model_path: "/15764332239/pretrained_models/text-to-video-ms-1.7b"
output_dir: "./outputs/train"
dataset_types:
  - 'single_video'
cache_latents: True
cached_latent_dir: null
use_unet_lora: True
lora_unet_dropout: 0.1
save_pretrained_model: False
lora_rank: 32
train_data:
  width: 384
  height: 384
  use_bucketing: True
  sample_start_idx: 1
  fps: 8
  frame_step: 1
  n_sample_frames: 16
  single_video_path: "./test_data/skateboarding-front/708-75070.avi"
  single_video_prompt: "A person is skateboarding."
validation_data:
  prompt:
    - "A panda is skateboarding."
    - "A tiger is skateboarding."
  sample_preview: True
  num_frames: 16
  width: 384
  height: 384
  num_inference_steps: 30
  guidance_scale: 12
  spatial_scale: 0
  noise_prior: 0.5
learning_rate: 5e-4
adam_weight_decay: 1e-2
max_train_steps: 300
checkpointing_steps: 50
validation_steps: 50
seed: 6668889
mixed_precision: "fp16"
gradient_checkpointing: False
text_encoder_gradient_checkpointing: False
enable_xformers_memory_efficient_attention: True
enable_torch_2_attn: True`

ruizhaocv commented on May 27, 2024

What does inference with checkpoint_index=150 look like?

XiaominLi1997 commented on May 27, 2024

> What does inference with checkpoint_index=150 look like?

Looks great! Why does this happen?

Inference with checkpoint_index=150:

A_panda_is_skateboarding_6668889.mp4

Sample during training with checkpoint_index=150:

150_A.panda.is.skateboarding.mp4

The two results above are different.

ruizhaocv commented on May 27, 2024

For faster convergence, we set a large learning rate, which may cause instability in the late training steps. If you want a more stable but slower training, you can try to reduce the learning rate. Enjoy exploring the optimal hyperparameters for your own training task.
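One way to act on this is to run inference on every saved checkpoint with the same seed and prompt, then pick the step that looks best. The sketch below only builds the argument lists; it assumes a checkpoint exists at every multiple of `checkpointing_steps` (50 up to 300 in the config posted above) and reuses the flags quoted earlier in this thread. The helper names `checkpoint_indices` and `inference_args` are hypothetical, not part of the repository.

```python
def checkpoint_indices(max_train_steps, checkpointing_steps):
    """Return the step indices at which a checkpoint is assumed to be saved."""
    return list(range(checkpointing_steps, max_train_steps + 1, checkpointing_steps))


def inference_args(model, checkpoint_folder, index, prompt, seed, noise_prior=0.5):
    """Build the argument list for one inference run (flags as quoted in this thread)."""
    return [
        "--model", model,
        "--prompt", prompt,
        "--checkpoint_folder", checkpoint_folder,
        "--checkpoint_index", str(index),
        "--noise_prior", str(noise_prior),
        "--seed", str(seed),
    ]


# Values taken from the training config and inference arguments posted above.
indices = checkpoint_indices(max_train_steps=300, checkpointing_steps=50)
runs = [
    inference_args(
        model="/15764332239/pretrained_models/text-to-video-ms-1.7b",
        checkpoint_folder="./outputs/train/skateboard-single-video",
        index=index,
        prompt="A panda is skateboarding.",
        seed=6668889,
    )
    for index in indices
]

# Print one argument list per checkpoint so the runs can be compared side by side.
for args in runs:
    print(" ".join(args))
```

Keeping the seed and prompt fixed across the sweep means any difference between outputs comes from the checkpoint alone.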

ruizhaocv commented on May 27, 2024

Fixing the seed for inference ensures that repeated inference runs generate the same results. However, using the same random seed does not mean you will get exactly the same results in the inference stage and the training stage, because the random state changes every time it is used during training.
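The effect can be sketched with Python's standard `random` module standing in for the framework's random generator. This is an analogy under stated assumptions, not the repository's code: `sample_noise` is a hypothetical stand-in for drawing the initial noise, and the 150 draws stand in for whatever randomness the training steps consume before a validation sample is generated.

```python
import random


def sample_noise(rng, n=4):
    """Draw n Gaussian values, standing in for the initial latent noise."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


SEED = 6668889

# Inference: the generator is seeded and used immediately.
inference_noise = sample_noise(random.Random(SEED))

# Training: the same seed is set once at the start, but the generator is
# consumed by the training steps before the validation sample is drawn.
training_rng = random.Random(SEED)
for _ in range(150):  # hypothetical: one random draw per training step
    training_rng.gauss(0.0, 1.0)
validation_noise = sample_noise(training_rng)

# Seeding again from scratch reproduces the inference noise exactly.
repeat_noise = sample_noise(random.Random(SEED))

print(inference_noise == repeat_noise)      # same seed, same state: identical
print(inference_noise == validation_noise)  # same seed, advanced state: different
```

So a fixed seed makes inference repeatable, but a sample drawn midway through training starts from a different random state even though the seed value is the same.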

XiaominLi1997 commented on May 27, 2024

> Fixing the seed for inference ensures that repeated inference runs generate the same results. However, using the same random seed does not mean you will get exactly the same results in the inference stage and the training stage, because the random state changes every time it is used during training.

Thanks, I mistakenly took the seed shown below for the validation seed. It is actually used in training.

image

Thanks again for your nice reply.

ruizhaocv commented on May 27, 2024

Thanks for pointing this out. I have deleted this confusing item.
