openmotionlab / motiongpt

[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs

Home Page: https://motion-gpt.github.io

License: MIT License

Languages: Python 98.52%, CSS 0.97%, Shell 0.51%
Topics: 3d-generation, chatgpt, gpt, language-model, motion, motion-generation, motiongpt, multi-modal, text-driven, text-to-motion

motiongpt's People

Contributors

52penguin, baitian752, billl-jiang, chaiyuntian, chenfengye, eltociear, ntamotsu


motiongpt's Issues

Questions about testing results

Thank you for your great work! I have tried to reproduce the results but encountered some issues.

Following the instructions, I evaluated the provided checkpoint downloaded from Hugging Face.

I run the following commands:

python -m test --cfg configs/config_h3d_stage3.yaml --task t2m
python -m test --cfg configs/config_h3d_stage3.yaml --task m2t

The evaluation results are not consistent with the results reported in the paper. The attachments are the log and metrics.

t2m results:
[metrics screenshots attached]
log_2023-10-04-19-56-23_test.log

Would you happen to have any idea about what's wrong with the configuration?

Stage1 training crashes on eval - continued

Hi,
thanks for a very interesting paper and the supporting code.
I'm trying to run training, but it fails.
[error screenshot attached]

I tried to recreate the dataset a few times, but it didn't help.
The dataset preparation looks OK; here are the results:
[screenshot attached]

Questions on fps

Hi. Thank you for the great work. I am still confused about how fps affects model performance. I see that the motion dataset used for training is at 20 fps. Does this work well with lower-fps motion (say 15 fps) or higher-fps motion (say 60 fps)?
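
One practical workaround (not from the authors) is to resample the motion to the 20 fps the tokenizer was trained on before feeding it to the model. A minimal sketch with linear interpolation, assuming the motion is a (num_frames, num_features) array; the function name and the 263-dim feature layout are only illustrative:

    import numpy as np

    def resample_motion(motion: np.ndarray, src_fps: float, dst_fps: float = 20.0) -> np.ndarray:
        """Linearly resample a (num_frames, num_features) motion array to dst_fps."""
        num_src = motion.shape[0]
        duration = (num_src - 1) / src_fps            # clip length in seconds
        num_dst = int(round(duration * dst_fps)) + 1  # frame count at the target rate
        src_t = np.arange(num_src) / src_fps          # original timestamps
        dst_t = np.linspace(0.0, duration, num_dst)   # target timestamps
        # Interpolate every feature channel independently.
        return np.stack(
            [np.interp(dst_t, src_t, motion[:, c]) for c in range(motion.shape[1])],
            axis=1)

    # e.g. a 60 fps clip is reduced to roughly a third of its frames at 20 fps
    motion_60fps = np.random.randn(300, 263)
    motion_20fps = resample_motion(motion_60fps, src_fps=60.0)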

question on mT5 pretrain

def create_sentinel_ids(self, mask_indices):
    # From https://github.com/huggingface/transformers/blob/main/examples/flax/language-modeling/run_t5_mlm_flax.py
    # mask_indices: (batch, seq_len) 0/1 array marking the positions to be masked
    start_indices = mask_indices - np.roll(mask_indices, 1, axis=-1) * mask_indices
    start_indices[:, 0] = mask_indices[:, 0]
    # Number the masked spans 1, 2, ... at their start positions
    sentinel_ids = np.where(start_indices != 0,
                            np.cumsum(start_indices, axis=-1),
                            start_indices)
    # Span k is mapped to vocabulary id len(tokenizer) - k, i.e. tokens at the end of the tokenizer
    sentinel_ids = np.where(sentinel_ids != 0,
                            (len(self.tokenizer) - sentinel_ids), 0)
    # Non-start positions inside a masked span become -1
    sentinel_ids -= mask_indices - start_indices
    return sentinel_ids

In the code, you replace masked positions with sentinel_ids, which point to the last positions of the tokenizer. But before doing this, you had already added the motion tokens to the end of the tokenizer. Was this done on purpose?
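
To make the question concrete, here is a standalone toy run of that function (my own illustration, not the repo's training code), with len(self.tokenizer) replaced by a plain T5 tokenizer length of 32100:

    import numpy as np

    def sentinel_ids_for(mask_indices: np.ndarray, vocab_size: int) -> np.ndarray:
        """Standalone copy of create_sentinel_ids with len(tokenizer) replaced by vocab_size."""
        start_indices = mask_indices - np.roll(mask_indices, 1, axis=-1) * mask_indices
        start_indices[:, 0] = mask_indices[:, 0]
        sentinel_ids = np.where(start_indices != 0, np.cumsum(start_indices, axis=-1), start_indices)
        sentinel_ids = np.where(sentinel_ids != 0, vocab_size - sentinel_ids, 0)
        sentinel_ids -= mask_indices - start_indices
        return sentinel_ids

    mask = np.array([[0, 1, 1, 0, 1, 0]])           # two masked spans
    out = sentinel_ids_for(mask, vocab_size=32100)
    # out == [[0, 32099, -1, 0, 32098, 0]]
    # 32099 and 32098 are the last ids of the tokenizer. If 512 motion tokens are appended
    # after that, len(tokenizer) becomes 32612 and the sentinels would point at motion token
    # ids rather than T5's <extra_id_*> tokens, which is exactly the concern raised above.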

Help me run python app.py

Hi, I am running the Gradio demo using python app.py and am encountering this error.
Please help me figure out how to fix it.

Global seed set to 1234
Traceback (most recent call last):
  File "/Users/namhuiju/opt/anaconda3/envs/mgpt/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 261, in hf_raise_for_status
    response.raise_for_status()
  File "/Users/namhuiju/opt/anaconda3/envs/mgpt/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/deps/whisper-large-v2/resolve/main/preprocessor_config.json

AttributeError: 'Joints' object has no attribute 'joinst'

Hi Author,
I am new to human motion.
Hopefully you can give me some suggestions.
While rendering with custom prompts, I got this issue:

AttributeError: 'Joints' object has no attribute 'joinst'

    if jointstype == "mmm":
        self.kinematic_tree = mmm_kinematic_tree
        self.joints = mmm_joints
        self.joinst.append("")  # the error happens here
    elif jointstype == "humanml3d":
        self.kinematic_tree = humanml3d_kinematic_tree
        self.joints = humanml3d_joints

How should I solve this issue?
Thanks
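
Judging only from the snippet, 'joinst' looks like a typo for 'joints'; a guess at the intended line, not verified against the repo's actual code:

    if jointstype == "mmm":
        self.kinematic_tree = mmm_kinematic_tree
        self.joints = mmm_joints
        self.joints.append("")  # assuming 'joinst' was meant to be 'joints'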

ValueError when running demo.py and app.py

Hi there, I'm trying to get the demo up and running, but encounter the following error after following the provided instructions and adding any missing files.

Global seed set to 1234
Traceback (most recent call last):
  File "/home/msegado/MotionGPT/app.py", line 31, in
    datamodule = build_data(cfg, phase="test")
  File "/home/msegado/MotionGPT/mGPT/data/build_data.py", line 10, in build_data
    return instantiate_from_config(data_config)
  File "/home/msegado/MotionGPT/mGPT/config.py", line 42, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "/home/msegado/MotionGPT/mGPT/data/HumanML3D.py", line 76, in __init__
    self._sample_set = self.get_sample_set(overrides={"split": "test", "tiny": True})
  File "/home/msegado/MotionGPT/mGPT/data/__init__.py", line 20, in get_sample_set
    return self.DatasetEval(**sample_params)
  File "/home/msegado/MotionGPT/mGPT/data/humanml/dataset_t2m_eval.py", line 24, in __init__
    super().__init__(data_root, split, mean, std, max_motion_length,
  File "/home/msegado/MotionGPT/mGPT/data/humanml/dataset_t2m.py", line 152, in __init__
    name_list, length_list = zip(
ValueError: not enough values to unpack (expected 2, got 0)

I get the same error when running both "python demo.py --cfg ./configs/config_h3d_stage3.yaml --example ./demos/t2m.txt" and "python app.py"

Any suggestions? Thanks!

ValueError: not enough values to unpack (expected 2, got 0)

Hi, I get this issue even after unzipping the texts.zip file.
Traceback (most recent call last):
  File "/home/zzmarybloody/MotionGPT/demo.py", line 237, in
    main()
  File "/home/zzmarybloody/MotionGPT/demo.py", line 147, in main
    datamodule = build_data(cfg)
  File "/home/zzmarybloody/MotionGPT/mGPT/data/build_data.py", line 10, in build_data
    return instantiate_from_config(data_config)
  File "/home/zzmarybloody/MotionGPT/mGPT/config.py", line 42, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "/home/zzmarybloody/MotionGPT/mGPT/data/HumanML3D.py", line 76, in __init__
    self._sample_set = self.get_sample_set(overrides={"split": "test", "tiny": True})
  File "/home/zzmarybloody/MotionGPT/mGPT/data/__init__.py", line 20, in get_sample_set
    return self.DatasetEval(**sample_params)
  File "/home/zzmarybloody/MotionGPT/mGPT/data/humanml/dataset_t2m_eval.py", line 24, in __init__
    super().__init__(data_root, split, mean, std, max_motion_length,
  File "/home/zzmarybloody/MotionGPT/mGPT/data/humanml/dataset_t2m.py", line 165, in __init__
    name_list, length_list = zip(
ValueError: not enough values to unpack (expected 2, got 0)

The number of GPUs

Hi, I see that you use 8 GPUs in the main paper, but the appendix states 64 GPUs. So how many GPUs were used during training?

Help with motion tokens and motion files

Hi. I have a couple of questions regarding how motion tokens are fed in during inference and training. I have an array of SMPL parameters (pose, beta, etc.).

  • Do I have to convert it into a .ply file or a video like in the demo, and does it accept only that format? Can it take raw arrays or files in other formats? I don't have access to Blender, so I can't use it to generate these files.

  • Do the motion tokens have to live in a shared environment space? Meaning, if I have two different motion files for "a person running", do they have to map to the exact same motion tokens, or can they be translated a bit (i.e. different x/y coordinates)? (See the sketch below.)
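
On the second point, HumanML3D-style features are (as far as I can tell) already expressed relative to the root, so a global x/z offset should not change the tokens much. A rough sketch of that kind of normalization, assuming a (frames, 22, 3) joint array with the pelvis at index 0; names and shapes here are my own illustration:

    import numpy as np

    def root_relative(joints: np.ndarray) -> np.ndarray:
        """Remove global translation by subtracting the root's x/z trajectory."""
        out = joints.copy()
        root_xz = joints[:, 0:1, [0, 2]]   # per-frame root position on the ground plane
        out[..., [0, 2]] -= root_xz        # every frame now starts from the origin
        return out

    # Two copies of "a person running" that differ only by a world-space offset
    run_a = np.random.randn(200, 22, 3)
    run_b = run_a + np.array([3.0, 0.0, -1.5])   # shifted in x and z
    assert np.allclose(root_relative(run_a), root_relative(run_b))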

Can't find the paper

The arxiv link points to "Executing your Commands via Motion Diffusion in Latent Space".

Where the token to enter T5 comes from?

Thank you for this interesting work. While reading the paper, I got confused. In Figure 2, the tokens come from the VQ-VAE's codebook (the yellow tokens input to T5). But in Sec. 3.2, you say "we combine the original text vocabulary V_{t} with motion vocabulary V_{m}, which is order-preserving to our motion codebook Z." Does this mean that the tokens entering T5 do not come from the VQ-VAE codebook (i.e., V_{m} is different from Z)?
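
For what it's worth, "combining V_t with V_m" is usually implemented by appending one new text token per codebook entry, in codebook order, so V_m mirrors Z one-to-one while the embeddings for those tokens are learned inside the language model. A sketch with Hugging Face Transformers; the token names and the codebook size of 512 are assumptions, not the repo's exact code:

    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("t5-base")
    model = T5ForConditionalGeneration.from_pretrained("t5-base")

    codebook_size = 512                            # assumed size of the VQ-VAE codebook Z
    motion_tokens = [f"<motion_id_{k}>" for k in range(codebook_size)]
    tokenizer.add_tokens(motion_tokens)            # order-preserving: token k <-> codebook entry k
    model.resize_token_embeddings(len(tokenizer))  # new learnable embeddings for V_m

    # A tokenized motion is then just a string of these tokens, e.g. codebook indices [5, 17, 42] ->
    motion_string = "".join(f"<motion_id_{k}>" for k in [5, 17, 42])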

TypeError: Audio.__init__() got an unexpected keyword argument 'source'

Traceback (most recent call last):
  File "/root/autodl-tmp/MotionGPT-main/app.py", line 512, in
    aud = gr.Audio(source="microphone",
  File "/root/miniconda3/envs/mgpt/lib/python3.10/site-packages/gradio/component_meta.py", line 146, in wrapper
    return fn(self, **kwargs)
TypeError: Audio.__init__() got an unexpected keyword argument 'source'
Hello, I want to know how to solve this problem. Thanks!
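
This looks like a Gradio version mismatch rather than a MotionGPT bug: Gradio 4.x removed the source= keyword in favour of sources=. Two possible workarounds, not verified against the repo's pinned requirements:

    import gradio as gr

    # Option 1: adapt the call in app.py to the Gradio 4.x API
    aud = gr.Audio(sources=["microphone"])

    # Option 2: install a 3.x release where source="microphone" still exists,
    # e.g.  pip install "gradio<4"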

gradio app has error: "ValueError: Need to enable queue to use generator."

Hi Author,
While running app.py, I entered a prompt and got the following error.
Any suggestion is appreciated. Thanks!

Traceback (most recent call last):
  File "/opt/conda/envs/mgpt/lib/python3.10/site-packages/gradio/routes.py", line 508, in predict
    output = await route_utils.call_process_api(
  File "/opt/conda/envs/mgpt/lib/python3.10/site-packages/gradio/route_utils.py", line 219, in call_process_api
    output = await app.get_blocks().process_api(
  File "/opt/conda/envs/mgpt/lib/python3.10/site-packages/gradio/blocks.py", line 1437, in process_api
    result = await self.call_function(
  File "/opt/conda/envs/mgpt/lib/python3.10/site-packages/gradio/blocks.py", line 1117, in call_function
    raise ValueError("Need to enable queue to use generators.")
ValueError: Need to enable queue to use generators.
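
This error usually means the app streams outputs from a generator but the queue was never enabled. A hedged guess at the fix, assuming the Blocks object in app.py is called demo:

    # wherever app.py launches the interface:
    demo.queue().launch()   # enable the queue so generator outputs can stream
    # instead of plain demo.launch()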

smpl to mixamo rig

Hi,
can you explain how to retarget the SMPL output to a Mixamo rig?
I tried, but got strange animations.

demo is not ready

Hi Author,
Thanks for your excellent work.
It seems the demo is not ready; I ran into some issues, such as a missing parameter.
Hopefully it will be resolved soon. Thanks.

Questions about results reproduction

Hello, thank you for releasing this amazing work.
I would like to reproduce the results on the HumanML3D dataset and have some questions about it:

  • About the hyperparameters, in the paper you use the number of iterations, while the configs in the code use END_EPOCH. Do they mean the same thing for you, or is it that nb_iterations = END_EPOCH * nb_batches ?
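
A quick back-of-the-envelope of that relationship, with purely illustrative numbers (substitute your own dataset size and config values):

    import math

    num_train_samples = 23000   # size of the training split (illustrative)
    batch_size = 256            # TRAIN.BATCH_SIZE in the config (illustrative)
    end_epoch = 100             # END_EPOCH in the config (illustrative)

    nb_batches = math.ceil(num_train_samples / batch_size)   # iterations per epoch -> 90
    nb_iterations = end_epoch * nb_batches                    # total optimizer steps -> 9000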

I also have a question about real-time application of the method: is it possible to use it for motion captioning in real time? For example, while a motion is being played, the generated description stays roughly in sync with it, with no big lag between the two.

Why is T5 used instead of GPT?

It seems GPT-style models like LLaMA 2 are more popular,
but the paper still uses T5.
Compared to GPT, does using T5 have any special advantages?

Motion tokens

Hello. I read through the GitHub website and had a couple of questions:

  • How are you getting the motion tokens in the first place? What 3D model is being used, and does it know which joints are which?
  • How do you feed the tokens into GPT? I assume there are a lot of motion tokens, so how does this work given the limited context length? (See the rough numbers below.)
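
On the second question, a rough back-of-the-envelope with my own assumed numbers (the temporal downsampling factor of 4 is an assumption, not a quote from the paper):

    fps = 20                 # frame rate of the HumanML3D motions
    clip_seconds = 10        # a fairly long clip
    downsample = 4           # assumed temporal downsampling of the motion tokenizer

    num_frames = fps * clip_seconds               # 200 frames
    num_motion_tokens = num_frames // downsample  # ~50 motion tokens
    # ~50 motion tokens plus the text prompt fits comfortably within T5's usual 512-token limit.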

Stage1 training crashes on eval

Command line used:
python -m train --cfg configs/config_h3d_stage1.yaml --nodebug

Error:
torch._C._LinAlgError: linalg.svd: (Batch element 0): The algorithm failed to converge because the input matrix contained non-finite values.

[screenshot attached]

Running demo.py hits missing parameters (additional details for issue #17)

Thanks for the excellent work!
I have reviewed issue #17 and noticed that others have faced similar problems.
Issue description:
As mentioned there, running demo.py requires certain parameters like 'render' and 'frame_rate', but I couldn't locate them in the parameter table. I'm unsure how to resolve this and would appreciate your assistance. Thanks.
[screenshot attached]

FileNotFoundError: [Errno 2] No such file or directory:

python demo.py --cfg ./configs/config_h3d_stage3.yaml --example ./demos/t2m.txt

FileNotFoundError: [Errno 2] No such file or directory: 'deps/t2m/t2m\VQVAEV3_CB1024_CMT_H1024_NRES3\meta\mean.npy'

But I do have this file.
[screenshot attached]

So in my file path, the t2m folder is repeated three times.

I followed the Quick Start guide exactly; is it just me?

Train crashes when debugging in PyCharm

I'm trying to run train with the params "--cfg configs/config_h3d_stage1.yaml --nodebug" in PyCharm in order to debug why it's not working, but I'm getting "Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)".

What exactly is raw motion data?

I am reading through the paper, but I am confused about what you mean by raw motion data. It does not seem to be clarified anywhere. Is this full 3D meshes, joint keypoints, or something else?

Error when rendering as a 3D human ("slow" visualisation) in Gradio demo

Hi, I am running the Gradio demo using python app.py. It works fine with the "fast" visualisation mode (i.e. skeletal keypoints), but whenever I change to the "slow" visualisation mode to see the full human rendering, I encounter this peculiar error (it's a long error message, but this is the last line; I can provide the full message if needed):

raise NoSuchDisplayException(f'Cannot connect to "{name}"')
pyglet.canvas.xlib.NoSuchDisplayException: Cannot connect to "None"

I don't understand what is happening. I searched Stack Overflow; some posts said it was a VS Code issue when opening a new window, but here all rendering happens on the localhost server, so what's the problem?

If anyone has faced this issue and knows a workaround, please let me know. Thanks in advance!
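
A common workaround for this class of error on a headless server (assuming the "slow" mode renders through pyrender/pyglet, which the traceback suggests) is to force an offscreen OpenGL backend before any rendering module is imported, or to run the app under a virtual display:

    import os

    # Must be set before pyrender / OpenGL are imported anywhere in the process
    os.environ["PYOPENGL_PLATFORM"] = "egl"   # or "osmesa" if EGL is unavailable

    # Alternatively, launch under a virtual X display:
    #   xvfb-run -a python app.py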

VQ-VAE or VQ-VAE-2?

Great work! I would like to ask whether the technology used in the paper's motion tokenizer is VQ-VAE or VQ-VAE-2. It looks like VQ-VAE; why not VQ-VAE-2? In addition, I do not understand the conversion process from the codebook to motion tokens; can you explain it? Looking forward to your reply.
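
On the codebook-to-motion-token question, the usual VQ-VAE recipe (a generic sketch, not this repo's exact code) is to snap each encoder output to its nearest codebook entry and use that entry's index as the discrete motion token:

    import torch

    def quantize(z_e: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
        """z_e: (T, d) encoder features; codebook: (K, d). Returns (T,) codebook indices."""
        dists = torch.cdist(z_e, codebook)   # distance from every timestep to every entry
        return dists.argmin(dim=-1)          # nearest entry per timestep

    codebook = torch.randn(512, 256)         # K=512 entries of dimension 256 (illustrative)
    z_e = torch.randn(50, 256)               # 50 downsampled timesteps from the encoder
    motion_tokens = quantize(z_e, codebook)  # e.g. tensor([311, 42, ...]) -> "<motion_id_311><motion_id_42>..."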

How to finetune?

Do you have to re-train the whole model with extra data? Or is there an easy way to fine-tune?

Eagerly awaiting the release

Looking forward to your release! Can't wait to see your amazing work.

Missing t2m/VQVAEV3_CB1024_CMT_H1024_NRES3

Hi, thanks for releasing the easily understandable code 😸
While I can run the web UI app successfully, there are some artifacts, probably due to the missing files that I replaced.

--
Update:
Changing VQVAEV3_CB1024_CMT_H1024_NRES3 to Decomp_SP001_SM001_H512 works well for generating motion!
[result screenshot attached]

--
Example (using mean/std from the HumanML3D repo), prompt:
"Can you show me that a person does three straight jumping jacks?"
[result screenshot attached]

  • dis_data_root = pjoin(cfg.DATASET.HUMANML3D.MEAN_STD_PATH, 't2m', "VQVAEV3_CB1024_CMT_H1024_NRES3", "meta")
    • The provided data didn't include VQVAEV3_CB1024_CMT_H1024_NRES3 in deps/t2m/t2m
    • Replaced by Mean.npy and Std.npy in HumanML3D repo
  • configs/webui.yaml
    • Test.CHECKPOINTS: ...ckpt to ...tar
  • configs/lm/default.yaml
    • params.model_path: ../memData/deps/flan-t5-base to google/flan-t5-base
  • configs/assets.yaml
    • model.whisper_path: deps/whisper-large-v2 to openai/whisper-large-v2
  • HumanML dataset in datasets/humanml3d
    • Use a single 012314.npy from HumanML3D repo as dataset

Thanks again 😄

Motion detection to text

Wonderful work! I am wondering if the model can be used on real-life videos of human motions and actions to caption them into text.

[Query] Rendering locally in Blender

Hi, I usually run code on a remote server via SSH and was wondering if there is a way to render the outputs in my locally installed Blender. I was checking the rendering scripts in this repo and found that a path to Blender is required, but how can I point the remote server to my local machine? Can anyone please help me with this?

Thanks in advance!
