
polyffusion's Introduction

Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls

@inproceedings{polyffusion2023,
    author = {Lejun Min and Junyan Jiang and Gus Xia and Jingwei Zhao},
    title = {Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls},
    booktitle = {Proceedings of the 24th International Society for Music Information Retrieval Conference, {ISMIR}},
    year = {2023}
}

Installation

pip install -r requirements.txt
pip install -e polyffusion
pip install -e polyffusion/chord_extractor
pip install -e polyffusion/mir_eval

Some Clarifications

  • The abbreviation "sdf" means Stable Diffusion, and "ldm" means Latent Diffusion; in this codebase they refer to the same model. Note that we borrow only the cross-attention conditioning mechanism from Latent Diffusion and do not use its encoder and decoder, which are left for future experiments.
  • prmat2c in the code is the piano-roll image representation.

Training

Preparations

  • The extracted features of the POP909 dataset can be accessed here. Please put them under /data/ after extraction.

  • The pre-trained models needed for training can be accessed here. Please put them under /pretrained/ after extraction.

Modifications

  • You can modify the parameters in the corresponding *.yaml files under /polyffusion/params/, or create your own.

Commands

python polyffusion/main.py --model [model] --output_dir [output_dir]

The models can be selected from /polyffusion/params/[model].yaml. Here are some cases:

  • sdf_chd8bar: conditioned on latent chord representations encoded by a pre-trained chord encoder.
  • sdf_txt: conditioned on latent texture representations encoded by a pre-trained texture encoder.
  • sdf_chdvnl: conditioned on vanilla chord representations.
  • sdf_txtvnl: conditioned on vanilla texture representations.
  • ddpm: vanilla diffusion model from DDPM without conditioning.
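
Since each model name corresponds to a YAML file in the params directory, the available choices can also be listed programmatically. The following is a minimal sketch, assuming the directory holds one [model].yaml file per model; the helper name list_models is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_models(params_dir):
    """Return the sorted stems of all *.yaml files in params_dir.

    Hypothetical helper: assumes one [model].yaml file per model.
    """
    return sorted(p.stem for p in Path(params_dir).glob("*.yaml"))
```

Calling list_models("polyffusion/params") from the repository root would then return the names that can be passed to --model, provided the directory layout matches the description above.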

Examples:

python polyffusion/main.py --model sdf_chd8bar --output_dir result/sdf_chd8bar

Trained Checkpoints

If you'd like to test our trained checkpoints, please access the folder here. We suggest putting them under /result/ after extraction for inference.

Inference

To see the help message, run

python polyffusion/inference_sdf.py --help

Examples:

# unconditional generation of length 10x8 bars
python polyffusion/inference_sdf.py --chkpt_path=/path/to/checkpoint --uncond_scale=0. --length=10

# conditional generation using DDIM sampler (default guidance scale = 1)
python polyffusion/inference_sdf.py --chkpt_path=/path/to/checkpoint --ddim --ddim_steps=50 --ddim_eta=0.0 --ddim_discretize=uniform

# conditional generation with guidance scale = 5, conditional chord progressions chosen from a song from POP909 validation set.
python polyffusion/inference_sdf.py --chkpt_path=/path/to/checkpoint --uncond_scale=5.

# conditional iterative inpainting (i.e. autoregressive generation) (default guidance scale = 1)
python polyffusion/inference_sdf.py --chkpt_path=/path/to/checkpoint --autoreg

# unconditional melody generation given accompaniment
python polyffusion/inference_sdf.py --chkpt_path=/path/to/checkpoint --uncond_scale=0. --inpaint_from_midi=/path/to/accompaniment.mid --inpaint_type=above

# accompaniment generation given melody, conditioned on chord progressions of another midi file (default guidance scale = 1)
python polyffusion/inference_sdf.py --chkpt_path=/path/to/checkpoint --inpaint_from_midi=/path/to/melody.mid --inpaint_type=below --from_midi=/path/to/cond_midi.mid
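
The guidance scale mentioned in the comments above can be read through the usual classifier-free guidance formula, in which the final noise prediction is the unconditional prediction plus the scale times the difference between the conditional and unconditional predictions. The sketch below assumes that --uncond_scale follows this formulation, which is consistent with 0 giving unconditional generation and 1 giving plain conditional generation. The function name and the toy values are illustrative and not taken from the repository.

```python
def guided_noise(eps_cond, eps_uncond, scale):
    """Classifier-free guidance on two noise predictions (plain lists).

    Assumed formula: eps = eps_uncond + scale * (eps_cond - eps_uncond).
    """
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


# Toy noise predictions standing in for the model outputs.
eps_cond = [1.0, 2.0]
eps_uncond = [0.0, 4.0]

print(guided_noise(eps_cond, eps_uncond, 0.0))  # equals eps_uncond
print(guided_noise(eps_cond, eps_uncond, 1.0))  # equals eps_cond
print(guided_noise(eps_cond, eps_uncond, 5.0))  # pushed further toward the condition
```

Under this reading, scales above 1 extrapolate beyond the conditional prediction, which typically strengthens adherence to the condition at some cost in diversity.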

polyffusion's People

Contributors

aik2mlj, wu-ming233

polyffusion's Issues

Inference errors, e.g. "Missing key model_name"

Hi, when running the first suggested example for inference,...

python polyffusion/inference_sdf.py --chkpt_path=/path/to/checkpoint --uncond_scale=0. --length=10

I downloaded the checkpoint, placed it in result/ as directed, and ran this:

python polyffusion/inference_sdf.py --chkpt_path=result/sdf+pop909wm_mix16_chd8bar/01-11_102022/chkpts --uncond_scale=0. --length=10

which gives the error:

Traceback (most recent call last):
  File "/home/shawley/diffusion/polyffusion/polyffusion/inference_sdf.py", line 526, in <module>
    raise FileNotFoundError(
FileNotFoundError: params.yaml or params.json not found in result/sdf+pop909wm_mix16_chd8bar, please specify custom_params_path then.

After this I tried

python polyffusion/inference_sdf.py --chkpt_path=result/sdf+pop909wm_mix16_chd8bar/01-11_102022/chkpts --uncond_scale=0. --length=10  --custom_params_path=result/sdf+pop909wm_mix16_chd8bar/01-11_102022/params.yaml 

which gives this error:

Traceback (most recent call last):
  File "/home/shawley/diffusion/polyffusion/polyffusion/inference_sdf.py", line 533, in <module>
    model_label = params.model_name
  File "/home/shawley/envs/hs/lib/python3.10/site-packages/omegaconf/dictconfig.py", line 355, in __getattr__
    self._format_and_raise(
  File "/home/shawley/envs/hs/lib/python3.10/site-packages/omegaconf/base.py", line 231, in _format_and_raise
    format_and_raise(
  File "/home/shawley/envs/hs/lib/python3.10/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
    _raise(ex, cause)
  File "/home/shawley/envs/hs/lib/python3.10/site-packages/omegaconf/_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
  File "/home/shawley/envs/hs/lib/python3.10/site-packages/omegaconf/dictconfig.py", line 351, in __getattr__
    return self._get_impl(
  File "/home/shawley/envs/hs/lib/python3.10/site-packages/omegaconf/dictconfig.py", line 442, in _get_impl
    node = self._get_child(
  File "/home/shawley/envs/hs/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 73, in _get_child
    child = self._get_node(
  File "/home/shawley/envs/hs/lib/python3.10/site-packages/omegaconf/dictconfig.py", line 480, in _get_node
    raise ConfigKeyError(f"Missing key {key!s}")
omegaconf.errors.ConfigAttributeError: Missing key model_name
    full_key: model_name
    object_type=dict

The inference instructions in the README didn't include a model name. ...Is there a way to make this work? Thanks.
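
The first error message reports a search in result/sdf+pop909wm_mix16_chd8bar, which is two directory levels above the --chkpt_path that was passed. One possible reading, offered as an assumption rather than a confirmed description of inference_sdf.py, is that the script expects --chkpt_path to point at a checkpoint file inside chkpts/ and looks for params.yaml two levels above it. The sketch below only illustrates that path arithmetic; guessed_params_dir and the file name weights.pt are hypothetical.

```python
from pathlib import PurePosixPath


def guessed_params_dir(chkpt_path):
    """Assumed rule: params.yaml is searched two levels above chkpt_path."""
    return str(PurePosixPath(chkpt_path).parent.parent)


run_dir = "result/sdf+pop909wm_mix16_chd8bar/01-11_102022"

# Passing the chkpts directory itself, as in the issue, lands one level too high.
print(guessed_params_dir(run_dir + "/chkpts"))

# Passing a file inside chkpts (hypothetical name) lands on the run directory.
print(guessed_params_dir(run_dir + "/chkpts/weights.pt"))
```

If this reading holds, pointing --chkpt_path at the checkpoint file rather than its folder would let the script find the params.yaml saved alongside the run.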

Error: "File exists:" after creating log directory

Thank you for sharing your code. After completing all the installation, I tried running what I thought was the appropriate command for training, but I get an error in that it seems to try to create the same directory twice, and generates an error on the second time. Is this normal? Do you have any suggestions for fixing it?

Thanks!

$ python polyffusion/main.py --model sdf_chdvnl  --output_dir /runs/shawley/polyffusion
Creating new log folder as /runs/shawley/polyffusion/24-02-14_074902
load train valid set with: {}
Dataloader ready: batch_size=16, num_workers=4, pin_memory=True, train_segments=57543, val_segments=6522 {}
Total parameters: 44686850
model_name: sdf_chdvnl
batch_size: 16
max_epoch: 100
learning_rate: 5.0e-05
max_grad_norm: 10
fp16: true
num_workers: 4
pin_memory: true
in_channels: 2
out_channels: 2
channels: 64
attention_levels:
- 2
- 3
n_res_blocks: 2
channel_multipliers:
- 1
- 2
- 4
- 4
n_heads: 4
tf_layers: 1
d_cond: 1152
linear_start: 0.00085
linear_end: 0.012
n_steps: 1000
latent_scaling_factor: 0.18215
img_h: 128
img_w: 128
cond_type: chord
cond_mode: mix
use_enc: false
chd_n_step: 32
chd_input_dim: 36
chd_z_input_dim: 512
chd_hidden_dim: 512
chd_z_dim: 512

Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/4
Creating new log folder as /runs/shawley/polyffusion/24-02-14_074907
Creating new log folder as /runs/shawley/polyffusion/24-02-14_074907
Traceback (most recent call last):
  File "/home/shawley/diffusion/polyffusion/polyffusion/main.py", line 36, in <module>
    config = LDM_TrainConfig(
  File "/home/shawley/diffusion/polyffusion/polyffusion/train/train_ldm.py", line 31, in __init__
    super().__init__(params, None, output_dir)
  File "/home/shawley/diffusion/polyffusion/polyffusion/train/__init__.py", line 40, in __init__
    os.makedirs(output_dir)
  File "/usr/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/runs/shawley/polyffusion/24-02-14_074907'
Creating new log folder as /runs/shawley/polyffusion/24-02-14_074907
Traceback (most recent call last):
  File "/home/shawley/diffusion/polyffusion/polyffusion/main.py", line 36, in <module>
    config = LDM_TrainConfig(
  File "/home/shawley/diffusion/polyffusion/polyffusion/train/train_ldm.py", line 31, in __init__
    super().__init__(params, None, output_dir)
  File "/home/shawley/diffusion/polyffusion/polyffusion/train/__init__.py", line 40, in __init__
    os.makedirs(output_dir)
  File "/usr/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/runs/shawley/polyffusion/24-02-14_074907'
load train valid set with: {}
[rank: 2] Child process with PID 57408 terminated with code 1. Forcefully terminating all other processes to avoid zombies 🧟
Killed

(Before running, the directory /runs/shawley/polyffusion/ is completely empty)

PS I get a similar "File exists" error when running the command copied from the README:

$ python polyffusion/main.py --model sdf_chd8bar --output_dir result/sdf_chd8bar
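
The traceback shows os.makedirs(output_dir) being called by more than one distributed process with the same timestamped folder name. A possible workaround, not confirmed by the authors, is to make the directory creation idempotent with the standard exist_ok=True argument. The sketch below demonstrates the behaviour in a temporary directory; the folder name mirrors the log and is otherwise arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    log_dir = os.path.join(root, "24-02-14_074907")

    # First call creates the folder, as the first process would.
    os.makedirs(log_dir)

    # A second plain call reproduces the error from the traceback.
    try:
        os.makedirs(log_dir)
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    # With exist_ok=True the repeated call is a no-op instead of an error.
    os.makedirs(log_dir, exist_ok=True)
    still_exists = os.path.isdir(log_dir)

print(second_call_failed, still_exists)
```

This only prevents the crash. Whether all processes should share one log folder is a separate design question for the training code.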

DataLoader worker is killed by signal

The training script polyffusion/main.py does not run as expected.
I trained on the POP909 dataset as described in the README,
but the error "DataLoader worker is killed by signal" occurs.
I reproduced it several times, and it always seems to occur at iteration 1034.

Epoch 0:  29% 1034/3597 [51:02<2:06:30,  2.96s/it]   
Traceback (most recent call last):
  File "polyffusion/polyffusion/main.py", line 71, in <module>
    config.train()
  File "polyffusion/polyffusion/train/__init__.py", line 49, in train
    learner.train(max_epoch=self.params.max_epoch)
  File "polyffusion/polyffusion/learner.py", line 137, in train
    losses, scheduled_params = self.train_step(batch)
  File "polyffusion/polyffusion/learner.py", line 200, in train_step
    loss_dict = self.model.get_loss_dict(batch, self.step)
  File "polyffusion/polyffusion/models/model_sdf.py", line 189, in get_loss_dict
    cond = self._encode_chord(chord)
  File "polyffusion/polyffusion/models/model_sdf.py", line 95, in _encode_chord
    z = self.chord_enc(chord).mean
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)


  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 2978) is killed by signal: Killed. 

@aik2mlj
