This project forked from vita-group/comp4d

"Comp4D: LLM-Guided Compositional 4D Scene Generation", Dejia Xu*, Hanwen Liang*, Neel P. Bhatt, Hezhen Hu, Hanxue Liang, Konstantinos N. Plataniotis, and Zhangyang Wang

Home Page: https://vita-group.github.io/Comp4D/

Shell 0.31% C++ 0.10% Python 91.31% C 0.04% Cuda 0.81% Jupyter Notebook 7.43%

Comp4D: LLM-Guided Compositional 4D Scene Generation

The official implementation of the paper "Comp4D: LLM-Guided Compositional 4D Scene Generation".

[Project Page] | [Video (narrated)] | [Video (results)] | [Paper] | [Arxiv]

News

  • 2024.4.1: Released code!
  • 2024.3.25: Released on arXiv!

Overview

[Figure: overview]

As shown in the figure above, we introduce Compositional 4D Scene Generation. Previous works concentrate on object-centric 4D generation with limited movement. In comparison, our work extends the boundaries to the more demanding task of compositional 4D scene generation. We integrate GPT-4 to decompose the scene and design proper trajectories, resulting in larger-scale movements and more realistic object interactions.

Setup

conda env create -f environment.yml
conda activate Comp4D
pip install -r requirements.txt

# 3D Gaussian Splatting modules, skip if you already installed them
# a modified gaussian splatting (+ depth, alpha rendering)
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization
pip install ./simple-knn

Example Case

Prompt Case

"a butterfly flies towards the flower"

Compositional Scene training

python train_comp.py --configs arguments/comp_butterfly_flower_zs.py --expname butterflyflower_exp --cfg_override 100.0 --image_weight_override 0.02 --nn_weight 1000 --with_reg --loss_dx_weight_override 0.005

We provide a quick overview of some important arguments:

  • --expname: Experiment name, used for the output path.
  • --configs: Configuration of scene training, including the prompt, object identities, object scales, and trajectories. You can also use VideoCrafter in place of Zeroscope as the video-based diffusion model.
  • --image_weight: Weight of the SDS loss from the image-based diffusion model (the command above sets it via --image_weight_override).
  • --nn_weight: Weight of the k-NN based rigidity loss.
  • --loss_dx_weight: Weight of the acceleration regularization loss (the command above sets it via --loss_dx_weight_override).
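
The list above names a k-NN based rigidity loss and an acceleration regularization loss without defining them. The following is a minimal sketch of one common way such terms are formulated, not the repository's implementation. It assumes the rigidity term penalizes changes in distance between each point and its nearest neighbors, and the acceleration term penalizes the second finite difference of a point's position over time. All function names are hypothetical.

```python
import math


def knn_indices(points, k):
    """For each point, return the indices of its k nearest other points (brute force)."""
    result = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        result.append(order[:k])
    return result


def rigidity_loss(canonical, deformed, k=2):
    """Mean squared change in distance to the k nearest canonical neighbors.

    Assumption: neighbors are found once in the canonical (undeformed) point set.
    """
    total, count = 0.0, 0
    for i, nbrs in enumerate(knn_indices(canonical, k)):
        for j in nbrs:
            d_before = math.dist(canonical[i], canonical[j])
            d_after = math.dist(deformed[i], deformed[j])
            total += (d_after - d_before) ** 2
            count += 1
    return total / count if count else 0.0


def acceleration_loss(positions_over_time):
    """Mean squared second finite difference of one point's position across frames."""
    total, count = 0.0, 0
    frames = positions_over_time
    for prev, cur, nxt in zip(frames, frames[1:], frames[2:]):
        acc = [n - 2 * c + p for p, c, n in zip(prev, cur, nxt)]
        total += sum(a * a for a in acc)
        count += 1
    return total / count if count else 0.0
```

Under these assumptions, a rigid translation of the whole point set gives a rigidity loss of zero, and constant-velocity motion gives an acceleration loss of zero. A weighted sum such as `1000 * rigidity_loss(...) + 0.005 * acceleration_loss(...)` would mirror the values passed in the example command, though how the repository combines the terms is not stated here.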

Rendering

python render_comp_video.py --skip_train --configs arguments/comp_butterfly_flower_zs.py --skip_test --model_path output_demo/date/butterflyflower_exp_date/ --iteration 3000
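
When sweeping over several prompts or weights, it can help to assemble the training and rendering commands above from a script. The sketch below only rebuilds those two command lines as argument lists, using the flags exactly as they appear above. The helper names and default values are assumptions taken from the example commands, and the model path must still be supplied by hand because the output directory contains a date.

```python
import shlex


def build_train_command(config, expname, cfg=100.0, image_weight=0.02,
                        nn_weight=1000, loss_dx_weight=0.005, with_reg=True):
    """Return the train_comp.py invocation as an argument list."""
    cmd = [
        "python", "train_comp.py",
        "--configs", config,
        "--expname", expname,
        "--cfg_override", str(cfg),
        "--image_weight_override", str(image_weight),
        "--nn_weight", str(nn_weight),
    ]
    if with_reg:
        cmd.append("--with_reg")
    cmd += ["--loss_dx_weight_override", str(loss_dx_weight)]
    return cmd


def build_render_command(config, model_path, iteration=3000):
    """Return the render_comp_video.py invocation as an argument list."""
    return [
        "python", "render_comp_video.py",
        "--skip_train",
        "--configs", config,
        "--skip_test",
        "--model_path", model_path,
        "--iteration", str(iteration),
    ]


if __name__ == "__main__":
    config = "arguments/comp_butterfly_flower_zs.py"
    # Print the commands; pass the lists to subprocess.run(cmd, check=True) to execute them.
    print(shlex.join(build_train_command(config, "butterflyflower_exp")))
    print(shlex.join(build_render_command(config, "output_demo/date/butterflyflower_exp_date/")))
```

Keeping the commands as lists avoids shell quoting problems if an experiment name or path ever contains spaces.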

Static Assets Preparation

We release a set of pre-generated static assets in the data/ directory. During training, we keep the static 3D Gaussians fixed and only optimize the deformation modules. We followed the first two stages of 4D-fy to generate the static 3D objects, then converted them to point clouds (in data/), which are used to initialize the 3D Gaussians. Thanks to the authors for sharing their awesome work!

Example Case

# cd /path_to_4dfy/

## Stage 1
# python launch.py --config configs/fourdfy_stage_1_low_vram.yaml --train --gpu 0 exp_root_dir=output/ seed=0 system.prompt_processor.prompt="a flower"

## Stage 2
# ckpt=output/fourdfy_stage_1_low_vram/a_flower@timestamp/ckpts/last.ckpt
# python launch.py --config configs/fourdfy_stage_2_low_vram.yaml --train --gpu 0 exp_root_dir=output/ seed=0 system.prompt_processor.prompt="a flower" system.weights=$ckpt

## Post-Process. Convert to mesh file.
# python launch.py --config output/fourdfy_stage_2_low_vram/a_flower@timestamp/configs/parsed.yaml --export --gpu 0 \
#   resume=output/fourdfy_stage_2_low_vram/a_flower@timestamp/ckpts/last.ckpt system.exporter_type=mesh-exporter \
#   system.exporter.context_type=cuda system.exporter.fmt=obj
## saved to output/fourdfy_stage_2_low_vram/a_flower@timestamp/save/iterations-export/

## Convert to point cloud.
# cd /path_to_Comp4D/
# python mesh2ply_8w.py /path_to_4dfy/output/fourdfy_stage_2_low_vram/a_flower@timestamp/save/iterations-export/model.obj data/a_flower.ply
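
To give a sense of what a mesh-to-point-cloud conversion involves, here is a minimal, dependency-free sketch. It is not the repository's mesh2ply_8w.py, whose internals are not described here. It assumes area-weighted uniform sampling on the mesh surface, handles only positive OBJ vertex indices, and writes positions only (no colors or normals). All function names are hypothetical, and the number of points is a free parameter.

```python
import random


def parse_obj(text):
    """Parse vertices and faces from Wavefront OBJ text, fan-triangulating polygons."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            vertices.append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "f":
            # OBJ indices are 1-based and may look like "3/1/2"; keep the vertex index.
            idx = [int(tok.split("/")[0]) - 1 for tok in parts[1:]]
            for a in range(1, len(idx) - 1):
                faces.append((idx[0], idx[a], idx[a + 1]))
    return vertices, faces


def triangle_area(a, b, c):
    """Area of a 3D triangle: half the norm of the cross product of two edges."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * sum(x * x for x in cross) ** 0.5


def sample_surface(vertices, faces, n, seed=0):
    """Sample n points uniformly on the surface, picking triangles by area."""
    rng = random.Random(seed)
    areas = [triangle_area(*(vertices[i] for i in f)) for f in faces]
    points = []
    for f in rng.choices(faces, weights=areas, k=n):
        a, b, c = (vertices[i] for i in f)
        u, v = rng.random(), rng.random()
        if u + v > 1.0:
            # Reflect back into the triangle to keep the distribution uniform.
            u, v = 1.0 - u, 1.0 - v
        points.append(tuple(a[i] + u * (b[i] - a[i]) + v * (c[i] - a[i]) for i in range(3)))
    return points


def to_ascii_ply(points):
    """Serialize xyz points as an ASCII PLY string."""
    header = [
        "ply", "format ascii 1.0", f"element vertex {len(points)}",
        "property float x", "property float y", "property float z", "end_header",
    ]
    body = [f"{x:.6f} {y:.6f} {z:.6f}" for x, y, z in points]
    return "\n".join(header + body) + "\n"
```

A typical use would read model.obj, call `parse_obj`, `sample_surface`, and `to_ascii_ply` in turn, and write the result to a .ply file under data/.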

Acknowledgement

This work is built on many amazing research works and open-source projects. Thanks to all the authors for sharing!

Citation

If you find this repository/work helpful in your research, please consider citing the paper and starring the repo ⭐.

@article{xu2024comp4d,
  title={Comp4D: LLM-Guided Compositional 4D Scene Generation},
  author={Xu, Dejia and Liang, Hanwen and Bhatt, Neel P and Hu, Hezhen and Liang, Hanxue and Plataniotis, Konstantinos N and Wang, Zhangyang},
  journal={arXiv preprint arXiv:2403.16993},
  year={2024}
}
