MONAI Generative Models makes it easy to train, evaluate, and deploy generative models and related applications.

License: Apache License 2.0


MONAI Generative Models

Prototyping repository for generative models to be integrated into MONAI core, MONAI tutorials, and MONAI model zoo.

Features

  • Network architectures: Diffusion Model, Autoencoder-KL, VQ-VAE, Autoregressive transformers, (Multi-scale) Patch-GAN discriminator.
  • Diffusion Model Noise Schedulers: DDPM, DDIM, and PNDM.
  • Losses: Adversarial losses, Spectral losses, and Perceptual losses (for 2D and 3D data using LPIPS, RadImageNet, and 3DMedicalNet pre-trained models).
  • Metrics: Multi-Scale Structural Similarity Index Measure (MS-SSIM) and Fréchet inception distance (FID).
  • Diffusion Models, Latent Diffusion Models, and VQ-VAE + Transformer Inferers classes (compatible with MONAI style) containing methods to train, sample synthetic images, and obtain the likelihood of input data.
  • MONAI-compatible trainer engine (based on Ignite) to train models with reconstruction and adversarial components.
  • Tutorials including:
    • How to train VQ-VAEs, VQ-GANs, VQ-VAE + Transformers, AutoencoderKLs, Diffusion Models, and Latent Diffusion Models on 2D and 3D data.
    • Training diffusion models for conditional image generation with classifier-free guidance.
    • Comparison of different diffusion model schedulers.
    • Diffusion models with different parameterizations (e.g., v-prediction and epsilon parameterization).
    • Anomaly Detection using VQ-VAE + Transformers and Diffusion Models.
    • Inpainting with diffusion models (using the RePaint method).
    • Super-resolution with Latent Diffusion Models (using Noise Conditioning Augmentation).

Roadmap

Our short-term goals are available in the Milestones section of the repository.

In the longer term, we aim to integrate the generative models into the MONAI core repository, supporting tasks such as image synthesis, anomaly detection, MRI reconstruction, and domain transfer.

Installation

To install the current release of MONAI Generative Models, you can run:

pip install monai-generative

To install the current main branch of the repository, run:

pip install git+https://github.com/Project-MONAI/GenerativeModels.git

Requires Python >= 3.8.

Contributing

For guidance on making a contribution to MONAI, see the contributing guidelines.

Community

Join the conversation on Twitter @ProjectMONAI or join our Slack channel.

Citation

If you use MONAI Generative in your research, please cite us! The citation can be exported from the paper.


Contributors

aamir-m-khan, ashayp31, danieltudosiu, ericspod, guopengf, jessyd, juliawolleb, kumoliu, marksgraham, matanat, mingxin-zheng, nic-ma, oesllelucena, pedroferreiradacosta, sanches-pedro, stijnvwijn, vacmar01, virginiafdez, warvito, ycremar, yiheng-wang-nv


Issues

PatchDiscriminator has a different architecture from original paper and VQGAN/LDM implementation

Comparing the code of PatchDiscriminator with its original implementation (https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/blob/e2c7618a2f2bf4ee012f43f96d1f62fd3c3bec89/models/networks.py#L539) and the VQGAN implementation (https://github.com/CompVis/taming-transformers/blob/3ba01b241669f5ade541ce990f7650a3b8f65318/taming/modules/discriminator/model.py#L17), it looks like our implementation diverges from the original in a few points.

Using these parameters (to simulate the VQGAN network):

    spatial_dims: 2
    num_channels: 64
    num_layers_d: 3
    in_channels: 1
    out_channels: 1
    kernel_size: 4
    activation: "LEAKYRELU"
    norm: "BATCH"
    bias: False
    padding: 1
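
For reference, a minimal sketch that instantiates the network with these parameters (assuming they map one-to-one onto PatchDiscriminator's constructor arguments, which may vary between releases); printing it yields the second module tree below:

from generative.networks.nets import PatchDiscriminator

# Instantiate the discriminator with the configuration above and inspect it.
discriminator = PatchDiscriminator(
    spatial_dims=2,
    num_channels=64,
    num_layers_d=3,
    in_channels=1,
    out_channels=1,
    kernel_size=4,
    activation="LEAKYRELU",
    norm="BATCH",
    bias=False,
    padding=1,
)
print(discriminator)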

We get this from the original implementation:

Discriminator(
  (main): Sequential(
    (0): Conv2d(1, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.2, inplace=True)
    (2): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (4): LeakyReLU(negative_slope=0.2, inplace=True)
    (5): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (6): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): LeakyReLU(negative_slope=0.2, inplace=True)
    (8): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(1, 1), bias=False)
    (9): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (10): LeakyReLU(negative_slope=0.2, inplace=True)
    (11): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(1, 1))
  )
)

and this is what we get from our implementation:

PatchDiscriminator(
  (0): Convolution(
    (conv): Conv2d(1, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (adn): ADN(
      (N): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (D): Dropout(p=0.0, inplace=False)
      (A): LeakyReLU(negative_slope=0.01)
    )
  )
  (1): Convolution(
    (conv): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (adn): ADN(
      (N): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (D): Dropout(p=0.0, inplace=False)
      (A): LeakyReLU(negative_slope=0.01)
    )
  )
  (2): Convolution(
    (conv): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (adn): ADN(
      (N): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (D): Dropout(p=0.0, inplace=False)
      (A): LeakyReLU(negative_slope=0.01)
    )
  )
  (final_conv): Convolution(
    (conv): Conv2d(512, 1, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (adn): ADN(
      (N): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (D): Dropout(p=0.0, inplace=False)
      (A): LeakyReLU(negative_slope=0.01)
    )
  )
)

There are three issues in our model:

  1. Our implementation skips the first convolution and activation ((0): Conv2d(1, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1)) followed by (1): LeakyReLU(negative_slope=0.2, inplace=True)), so it starts with 128 channels instead of 64. This is caused by the line

     output_channels = num_channels * 2

  2. In the (2): Convolution, we are using stride 2 instead of stride 1 (as shown in https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/blob/e2c7618a2f2bf4ee012f43f96d1f62fd3c3bec89/models/networks.py#L574).

  3. In the original, final_conv has no norm/dropout/activation (ADN) block; ours should use conv_only=True.

Should we create a common interface to support second-stage models?

@ericspod Should we create a common interface between VQ-VAE and AEKL to make them easy to interchange when obtaining latent representations (example proposed here: #13 (comment))?

It would make these two models different from the MONAI models (such as https://github.com/Project-MONAI/MONAI/blob/dev/monai/networks/nets/varautoencoder.py and https://github.com/Project-MONAI/MONAI/blob/dev/monai/networks/nets/autoencoder.py) and future generative models.
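
One possible shape for such an interface, sketched here with hypothetical names (encode/decode are placeholders, not the repository's API):

from abc import ABC, abstractmethod

import torch

class SecondStageModel(ABC):
    # Hypothetical shared interface so VQ-VAE and AutoencoderKL can be swapped
    # when a downstream model (e.g., a diffusion model or transformer) only
    # needs latent representations.

    @abstractmethod
    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # Map an image batch to its latent representation.
        ...

    @abstractmethod
    def decode(self, z: torch.Tensor) -> torch.Tensor:
        # Map latents back to image space.
        ...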

Create AutoencoderKL tutorial using 2D data and torch

Create a simple tutorial using AutoencoderKL on the 2D MedNIST data. To make training faster, do not use all classes from MedNIST (a couple of them will be enough to later train the conditioned diffusion models). In this tutorial, use pure torch for the training and validation steps (no frameworks like Ignite or Lightning).
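
A minimal sketch of such a pure-torch training step, reusing the reconstruction and KL terms that appear elsewhere in the repository's examples (the model is assumed to return (reconstruction, z_mu, z_sigma)):

import torch
import torch.nn.functional as F

def train_step(model, images, optimizer, kl_weight=1e-6):
    # One optimisation step: MSE reconstruction loss plus a KL penalty on the
    # latent distribution, as in the AutoencoderKL examples.
    optimizer.zero_grad(set_to_none=True)
    reconstruction, z_mu, z_sigma = model(images)
    recon_loss = F.mse_loss(reconstruction.float(), images.float())
    kl = 0.5 * torch.sum(z_mu.pow(2) + z_sigma.pow(2) - torch.log(z_sigma.pow(2)) - 1, dim=[1, 2, 3])
    loss = recon_loss + kl_weight * torch.sum(kl) / kl.shape[0]
    loss.backward()
    optimizer.step()
    return loss.item()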

Add 3D Feature Loss

Add a 3D feature loss based on MedicalNet as an alternative to the 2D and 2.5D perceptual losses.

Add VQ-VAE network

Create the VQ-VAE network for 2D and 3D cases with a Vector Quantisation component using exponential moving averages to update the dictionary. Add the relevant unit tests and documentation.
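
For illustration, a minimal sketch of the EMA dictionary update (following van den Oord et al.; names are illustrative, not the repository's API):

import torch

def ema_update(codebook, cluster_size, embed_avg, flat_inputs, encodings, decay=0.99, eps=1e-5):
    # codebook: (K, D) embeddings; flat_inputs: (B, D) encoder outputs;
    # encodings: (B, K) float one-hot assignments of inputs to codebook entries.
    cluster_size.mul_(decay).add_(encodings.sum(0), alpha=1 - decay)
    embed_avg.mul_(decay).add_(encodings.t() @ flat_inputs, alpha=1 - decay)
    # Laplace smoothing avoids division by zero for unused codes.
    n = cluster_size.sum()
    smoothed = (cluster_size + eps) / (n + codebook.shape[0] * eps) * n
    codebook.copy_(embed_avg / smoothed.unsqueeze(1))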

Add AutoencoderKL

Create an AutoencoderKL for 2D and 3D, including unit tests and docstrings.

Add Transformer network

Add the transformer network and components to make it compatible with the VQ-VAE network. Create the components necessary to generate samples and obtain the likelihood of input data from the model. Add the relevant unit tests and documentation.
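
For instance, the likelihood component could look like this sketch (assuming the transformer returns per-position logits over the VQ-VAE codebook and sequences start with a begin-of-sequence token; names are illustrative):

import torch
import torch.nn.functional as F

def sequence_log_likelihood(model, tokens):
    # tokens: (B, L) integer codebook indices, first entry a BOS token.
    logits = model(tokens[:, :-1])                  # (B, L-1, vocab_size)
    log_probs = F.log_softmax(logits, dim=-1)
    targets = tokens[:, 1:].unsqueeze(-1)           # next-token targets
    return log_probs.gather(-1, targets).squeeze(-1).sum(dim=1)  # (B,) log-likelihoods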

Add installer

Add setup.py and the files necessary to install this package with its prototypes.

VQGAN tutorial uses all features in the discriminator loss instead of just the last layer

To compute the discriminator loss, we are using all of the discriminator's feature maps in

logits_fake = discriminator(reconstruction.contiguous().detach())

and

logits_real = discriminator(images.contiguous().detach())

but it should use just the last feature map, as the generator loss does:

logits_fake = discriminator(reconstruction.contiguous().float())[-1]
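
Applying the same indexing on the discriminator side would give (a sketch of the proposed fix):

# Use only the last output (the logits) from the multi-output discriminator.
logits_fake = discriminator(reconstruction.contiguous().detach())[-1]
logits_real = discriminator(images.contiguous().detach())[-1]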

Improve slow and overlapping network unit tests

Currently the network unit tests take a significant amount of time. This can be improved by removing redundancy and by using smaller networks in the tests.

For example, for the AutoencoderKL, two of the linked test cases build similar networks with the same components, so the second case does not increase coverage.

Another example: for the VQVAE, the parameterisation

[1, 3],  # Batch size

tests the network forward pass with either a single image or 3 examples in the minibatch. These two test cases do not exercise different methods of the network or different conditions, and they do not test anything added by the VQVAE class.

Add Perceptual loss

Add a Perceptual Similarity metric to be used as a loss. It should be compatible with 2D and 2.5D approaches.
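
For illustration, a minimal sketch of the 2.5D idea (not the repository's implementation): treat the slices along each spatial axis as a batch of 2D images and average a 2D perceptual loss over the three axes.

import torch

def fake_3d_perceptual_loss(perceptual_2d, x, y):
    # x, y: (N, C, D, H, W) volumes; perceptual_2d: any 2D perceptual loss.
    losses = []
    for axis in (2, 3, 4):
        xs = x.movedim(axis, 1).flatten(0, 1)  # slices as a batch: (N * S, C, H', W')
        ys = y.movedim(axis, 1).flatten(0, 1)
        losses.append(perceptual_2d(xs, ys))
    return torch.stack(losses).mean()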

Fix TorchScript error in the latent diffusion UNet network

Try to include TorchScript tests for the latent diffusion UNet networks. For this, it might be necessary to remove the use of TimestepEmbedSequential and TimestepBlock by creating blocks exclusive to the case where the time embedding is included.
A solution similar to the Hugging Face diffusers one might be useful:
https://github.com/huggingface/diffusers/blob/2d35f6733a2d698e8917896071444a5923993ae7/src/diffusers/models/unet_blocks.py#L461
https://github.com/huggingface/diffusers/blob/2d35f6733a2d698e8917896071444a5923993ae7/src/diffusers/models/unet_blocks.py#L379
https://github.com/huggingface/diffusers/blob/2d35f6733a2d698e8917896071444a5923993ae7/src/diffusers/models/unet_blocks.py#L576

Add DDPM Scheduler

Add the variance scheduler proposed in the DDPM paper (with linear and cosine options), in a code style similar to Hugging Face's.
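
A sketch of the two beta schedules (the cosine variant follows the improved-DDPM formulation of Nichol & Dhariwal; names are illustrative):

import math
import torch

def linear_beta_schedule(num_timesteps, beta_start=1e-4, beta_end=0.02):
    # Linearly spaced variances, as in the original DDPM paper.
    return torch.linspace(beta_start, beta_end, num_timesteps)

def cosine_beta_schedule(num_timesteps, s=0.008):
    # Betas derived from a cosine alpha-bar curve.
    t = torch.arange(num_timesteps + 1) / num_timesteps
    alphas_cumprod = torch.cos((t + s) / (1 + s) * math.pi / 2) ** 2
    alphas_cumprod = alphas_cumprod / alphas_cumprod[0]
    betas = 1 - alphas_cumprod[1:] / alphas_cumprod[:-1]
    return betas.clamp(max=0.999)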

DataLoader Usage

Adding persistent_workers=True to the arguments of any DataLoader object will speed up training, since worker processes won't have to be recreated at each epoch. This helped a lot on Windows and may be less helpful elsewhere; it should be tested with the notebooks and scripts. ThreadDataLoader may also provide some additional improvement.
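
For example (with a placeholder dataset; persistent_workers only takes effect when num_workers > 0):

import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(128, 1, 64, 64))  # placeholder data
loader = DataLoader(
    dataset,
    batch_size=16,
    num_workers=4,            # persistent_workers requires num_workers > 0
    persistent_workers=True,  # keep worker processes alive across epochs
)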

Add MSSIM metric

Add Mean Structural Similarity as a metric, including unit tests and documentation.

Fix perceptual loss 3D

When calling the perceptual loss with 3D images I saw the following error:

TypeError                                 Traceback (most recent call last)
Cell In [12], line 25
22 reconstruction, z_mu, z_sigma = model(images)
24 mse_loss = F.mse_loss(reconstruction.float(), images.float())
---> 25 p_loss = perceptual_loss(reconstruction.float(), images.float())
27 kl_loss = 0.5 * torch.sum(z_mu.pow(2) + z_sigma.pow(2) - torch.log(z_sigma.pow(2)) - 1, dim=[1, 2, 3])
28 kl_loss = torch.sum(kl_loss) / kl_loss.shape[0]

File ~/miniconda3/envs/genmodels/lib/python3.9/site-packages/torch/nn/modules/module.py:1190, in Module._call_impl(self, *input, **kwargs)
1186 # If we don't have any hooks, we want to skip the rest of the logic in
1187 # this function, and just call forward.
1188 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1189         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1190     return forward_call(*input, **kwargs)
1191 # Do not call functions when jit is used
1192 full_backward_hooks, non_full_backward_hooks = [], []

File /mnt_homes/home4T7/jdafflon/GenerativeModels/generative/losses/perceptual.py:124, in PerceptualLoss.forward(self, input, target)
121     loss = self.perceptual_function(input, target)
122 elif self.spatial_dims == 3 and self.is_fake_3d:
123     # Compute 2.5D approach
--> 124     loss_sagittal = self._calculate_axis_loss(input, target, spatial_axis=2)
125     loss_coronal = self._calculate_axis_loss(input, target, spatial_axis=3)
126     loss_axial = self._calculate_axis_loss(input, target, spatial_axis=4)

File /mnt_homes/home4T7/jdafflon/GenerativeModels/generative/losses/perceptual.py:96, in PerceptualLoss._calculate_axis_loss(self, input, target, spatial_axis)
87 input_slices = batchify_axis(
88     x=input,
89     fake_3d_perm=(
(...)
93     + tuple(preserved_axes),
94 )
95 indices = torch.randperm(input_slices.shape[0])[: int(input_slices.shape[0] * self.fake_3d_ratio)]
---> 96 input_slices = input_slices[indices]
97 target_slices = batchify_axis(
98     x=target,
99     fake_3d_perm=(
(...)
103     + tuple(preserved_axes),
104 )
105 target_slices = target_slices[indices]

File ~/miniconda3/envs/genmodels/lib/python3.9/site-packages/monai/data/meta_tensor.py:274, in MetaTensor.torch_function(cls, func, types, args, kwargs)
272 else:
273     unpack = False
--> 274 ret = MetaTensor.update_meta(ret, func, args, kwargs)
275 return ret[0] if unpack else ret

File ~/miniconda3/envs/genmodels/lib/python3.9/site-packages/monai/data/meta_tensor.py:218, in MetaTensor.update_meta(rets, func, args, kwargs)
214 # if using e.g., batch[:, -1] or batch[..., -1], then the
215 # first element will be slice(None, None, None) and Ellipsis,
216 # respectively. Don't need to do anything with the metadata.
217 if batch_idx not in (slice(None, None, None), Ellipsis, None) and idx == 0:
--> 218     ret_meta = decollate_batch(args[0], detach=False)[batch_idx]
219     if isinstance(ret_meta, list):  # e.g. batch[0:2], re-collate
220         ret_meta = list_data_collate(ret_meta)

TypeError: only integer tensors of a single element can be converted to an index
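
One plausible workaround (an assumption, not necessarily the repository's fix) is to drop the MONAI metadata before the fancy indexing, so MetaTensor.update_meta is never invoked:

import torch
from monai.data import MetaTensor

def as_plain_tensor(x: torch.Tensor) -> torch.Tensor:
    # Strip MONAI metadata so plain torch indexing semantics apply.
    return x.as_tensor() if isinstance(x, MetaTensor) else x

def subsample_slices(input_slices, fake_3d_ratio):
    # Mirrors the slice subsampling in generative/losses/perceptual.py, but on
    # a plain tensor, avoiding the MetaTensor.update_meta code path above.
    input_slices = as_plain_tensor(input_slices)
    indices = torch.randperm(input_slices.shape[0])[: int(input_slices.shape[0] * fake_3d_ratio)]
    return input_slices[indices]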
