deepregnet / deepreg

Medical image registration using deep learning

License: Apache License 2.0

Python 99.83% Dockerfile 0.17%

image-registration medical-image-registration image-fusion deep-learning deep-neural-networks neural-network convolutional-neural-networks tensorflow2 deepreg

deepreg's Introduction

DeepReg

DeepReg is a freely available, community-supported open-source toolkit for research and education in medical image registration using deep learning.

TensorFlow 2-based for efficient training and rapid deployment;
Implementing major unsupervised and weakly-supervised algorithms, with their combinations and variants;
Focusing on growing and diverse clinical applications, with all DeepReg Demos using openly accessible data;
Simple built-in command line tools requiring minimal programming and scripting;
Open, permissive and research-and-education-driven, under the Apache 2.0 license.

Getting Started

Contributing

Get involved, and help make DeepReg better! We want your help - Really.

Being a contributor doesn't just mean writing code. Equally important to the open-source process is writing or proof-reading documentation, suggesting or implementing tests, or giving feedback about the project. You might see the errors and assumptions that have been glossed over. If you can write any code at all, you can contribute code to open-source. We are constantly trying out new skills, making mistakes, and learning from those mistakes. That's how we all improve, and we are happy to help others learn with us.

Code of Conduct

This project is released with a Code of Conduct. By participating in this project, you agree to abide by its terms.

Where Should I Start?

For guidance on making a contribution to DeepReg, see our Contribution Guidelines.

Have a registration application with openly accessible data? Consider contributing a DeepReg Demo.

MICCAI 2020 Educational Challenge

Our MICCAI Educational Challenge submission on DeepReg is an Award Winner!

Check it out here - you can also

Overview Video

Members of the DeepReg dev team presented "The Road to DeepReg" at the Centre for Medical Image Computing (CMIC) seminar series at University College London on the 4th of November 2020. You can access the talk here.

Citing DeepReg

DeepReg is research software, made by a team of academic researchers. Citations and use of our software help us justify the effort which has gone into, and will keep going into, maintaining and growing this project.

If you have used DeepReg in your research, please consider citing us:

Fu et al., (2020). DeepReg: a deep learning toolkit for medical image registration. Journal of Open Source Software, 5(55), 2705, https://doi.org/10.21105/joss.02705

Or with BibTeX:

@article{Fu2020,
  doi = {10.21105/joss.02705},
  url = {https://doi.org/10.21105/joss.02705},
  year = {2020},
  publisher = {The Open Journal},
  volume = {5},
  number = {55},
  pages = {2705},
  author = {Yunguan Fu and Nina Montaña Brown and Shaheer U. Saeed and Adrià Casamitjana and Zachary M. C. Baum and Rémi Delaunay and Qianye Yang and Alexander Grimwood and Zhe Min and Stefano B. Blumberg and Juan Eugenio Iglesias and Dean C. Barratt and Ester Bonmati and Daniel C. Alexander and Matthew J. Clarkson and Tom Vercauteren and Yipeng Hu},
  title = {DeepReg: a deep learning toolkit for medical image registration},
  journal = {Journal of Open Source Software}
}

deepreg's Issues

Tutorial - Get started with demos

Overview on demos
add links to individual demos
add demo.md in /tutorials
add link to the wiki Tutorial Index

Refactoring from click to argparse for command line interface

Currently all the click commands exist outside of the main() functions as global vars which are then used in the command line interface, which makes them not PEP8 compliant and probably a bit nightmarish to test further along the line.

Refactor (/remove) reliance on click by converting any CLI interaction to the argparse module.

Scripts which use click:

deepreg/gen_tfrecord.py
deepreg/predict.py
deepreg/train.py
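
As a rough illustration of the refactor proposed above, here is a minimal sketch of an argparse-based entry point whose parser is built inside functions rather than at module level. It is not the actual DeepReg code: the function names and the `--config_path` and `--log_dir` arguments are assumptions chosen for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create the parser inside a function so no CLI state lives at module level."""
    parser = argparse.ArgumentParser(description="Hypothetical training entry point.")
    # Argument names below are illustrative assumptions, not DeepReg's real flags.
    parser.add_argument("--config_path", required=True, help="Path to a YAML config file.")
    parser.add_argument("--log_dir", default="logs", help="Directory for logs and checkpoints.")
    return parser


def main(argv=None) -> argparse.Namespace:
    """Parse arguments; passing argv explicitly makes the function easy to unit test."""
    args = build_parser().parse_args(argv)
    # A real script would launch training here; this sketch only returns the parsed values.
    return args
```

A test can then call `main(["--config_path", "config.yaml"])` directly, without patching `sys.argv` or importing global click objects.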

Create a command line app to generate config files from argparse.

Issue description

Currently, the package depends on .yaml files / config paths to run registrations. It would be useful to create a ~~guide and~~ CLI app for users to generate their own configuration files to customise trainings.

Type of issue
Please delete options that are not relevant.

Bug report
New feature request
Documentation update
Test request
Linting request

For bug report or feature request, steps to reproduce the issue, including (but not limited to):

N/A

What's the expected result?

A CLI is provided which can be called to generate the necessary config files required with their own parameters
~~2. A guide informing on what information is needed in these config files and why they are required.~~ This is covered in issue #36.

What's the actual result?

Users currently have to design yaml files manually by inspecting yaml files provided in deepreg/config. It's not entirely clear what these parameters are or why they are necessary, what expected range should be required or how wrong values could break the training - I would assume the guesswork involved could prove quite frustrating at the end point.
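
One possible shape for such a generator is sketched below, under stated assumptions: the option names, the `train` section and its keys are hypothetical and do not reflect DeepReg's real config schema, and the tiny `to_yaml` helper only handles nested dicts of plain scalars (a real tool would more likely use a YAML library). The point is that range checks happen at generation time instead of failing later during training.

```python
import argparse


def to_yaml(mapping: dict, indent: int = 0) -> str:
    """Serialise a nested dict of plain scalars as YAML text (no lists, no quoting)."""
    lines = []
    pad = " " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 2))
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


def build_config(argv=None) -> dict:
    """Turn command line options into a nested config dict, rejecting bad values early."""
    parser = argparse.ArgumentParser(description="Hypothetical config generator.")
    # Option names and config keys are illustrative assumptions, not DeepReg's schema.
    parser.add_argument("--learning_rate", type=float, default=1e-4, help="Must be > 0.")
    parser.add_argument("--epochs", type=int, default=100, help="Must be >= 1.")
    parser.add_argument("--batch_size", type=int, default=2, help="Must be >= 1.")
    args = parser.parse_args(argv)
    if args.learning_rate <= 0:
        parser.error("--learning_rate must be positive")
    if args.epochs < 1 or args.batch_size < 1:
        parser.error("--epochs and --batch_size must be at least 1")
    return {
        "train": {
            "learning_rate": args.learning_rate,
            "epochs": args.epochs,
            "batch_size": args.batch_size,
        }
    }
```

For example, `print(to_yaml(build_config(["--epochs", "5"])))` prints a `train:` section with the three keys indented beneath it.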

Additional details / screenshot

Missing docstrings - metric

deepreg/model/metric.py

Missing module docstring
Missing class and method docstrings: MeanWrapper
Missing class and method docstrings: MeanDiceScore
Missing class and method docstrings: MeanCentroidDistance
Missing class and method docstrings: MeanForegroundProportion

Restructure README to a quickstart guide with pointers to the relevant documentation

see also #6
comment what needs to be included
mirror the two "get started"
links to tutorials
links to demos

Test + docs: classical registration toy examples for integration testing purposes

Such a feature can be a nice simple integration test to ensure for example that the gradients are useable.

A basic toy example could involve translating an image and registering it with itself.

If useful, this quick and dirty notebook could be a source of inspiration:
https://colab.research.google.com/drive/1OmU-jTvWmRRVerc23nsyNz_p8jU3S7wX?usp=sharing
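
A minimal sketch of the toy example described above, assuming NumPy only. Note that it swaps in an exhaustive search over integer shifts instead of a gradient-based optimiser, so it checks the resampling and cost plumbing but not the gradients themselves; the function names are hypothetical.

```python
import numpy as np


def make_blob(size: int = 32) -> np.ndarray:
    """Smooth 2D Gaussian blob used as a synthetic image."""
    y, x = np.mgrid[0:size, 0:size]
    centre = size / 2
    return np.exp(-((x - centre) ** 2 + (y - centre) ** 2) / (2 * 4.0 ** 2))


def register_translation(fixed: np.ndarray, moving: np.ndarray, max_shift: int = 5):
    """Exhaustively search integer shifts and keep the one with the lowest MSE."""
    best_shift, best_cost = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            warped = np.roll(moving, shift=(dy, dx), axis=(0, 1))
            cost = float(np.mean((fixed - warped) ** 2))
            if cost < best_cost:
                best_shift, best_cost = (dy, dx), cost
    return best_shift, best_cost


fixed = make_blob()
moving = np.roll(fixed, shift=(3, -2), axis=(0, 1))  # apply a known translation
shift, cost = register_translation(fixed, moving)    # should undo it: (-3, 2)
```

Because the moving image is an exact circular shift of the fixed image, the recovered shift is the inverse of the applied one and the residual cost is zero, which makes the assertion in an integration test straightforward.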

Demo - MR Brain (learn2reg t4)

Inter-subject neural image registration
data: https://drive.google.com/drive/folders/1Plum9c41tHPmA84-L0K0RrIztictFuNg

This demo tests:
1 - unsupervised inter-subject
2 - weakly supervised inter-subject
3 - unsupervised inter-subject with weak supervision

Adding affine transformation option

Missing docstrings - optimizer

deepreg/model/optimizer.py

Missing module docstring
Missing function docstring: get_optimizer

Add linter to CI to ensure PEP8 improvement before merge

We can add a pre-commit check which verifies the code format before committing.

Mutual information loss

Adding (normalised) mutual information as an image dissimilarity loss
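
For reference, a histogram-based sketch of normalised mutual information in NumPy, using the common definition NMI = (H(A) + H(B)) / H(A, B). This is only an illustration of the quantity: a hard histogram is not differentiable, so a training loss would need a differentiable estimate (for example Parzen windowing), and as a dissimilarity one would minimise the negative of this value. The function names and the default bin count are assumptions.

```python
import numpy as np


def entropy(p: np.ndarray) -> float:
    """Shannon entropy (in nats) of a probability array, ignoring empty bins."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


def normalised_mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 32) -> float:
    """NMI = (H(A) + H(B)) / H(A, B), estimated from a joint intensity histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1)  # marginal over the second image
    p_b = p_ab.sum(axis=0)  # marginal over the first image
    return (entropy(p_a) + entropy(p_b)) / entropy(p_ab)
```

With this definition the value is 2 when an image is compared with itself and approaches 1 for statistically independent images.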

Consider the relevance of custom regularised interpolation gradient

This is related to #28 and #26

It seems like autodiff and linear interpolation don't play very nicely together (at least for classical registration approaches). It would be interesting to evaluate how regularised gradients can help and how they would impact learning-based registration as well.

Making even a toy example using autodiff and linear interpolation converge seems indeed complicated:
https://colab.research.google.com/drive/1OmU-jTvWmRRVerc23nsyNz_p8jU3S7wX?usp=sharing

The toy example simply involves registering two slightly offset crops of a 2D image. This is using MSE, translation and BFGS optimiser.

The reason for the convergence issue is, I think, related to resampling artifacts. deepreg/pytorch/jax currently offer only linear or nearest-neighbour resampling, e.g.:
https://pytorch.org/docs/master/nn.functional.html#grid-sample

One option discussed in #26 could be to port the niftyreg-based resampling that had been integrated in niftynet:
https://github.com/NifTK/NiftyNet/tree/dev/niftynet/contrib/niftyreg_image_resampling

Another option is to continue to use linear interpolation but implement gradient regularisation. When autograd is used with SSD and linear resampling, you indeed get non-negligible local peaks in the cost function and its gradient when the translation parameters are integers.
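
The effect at integer translations can be reproduced in a few lines. The sketch below is a simplified 1D stand-in for the 2D notebook example (random signals, `np.interp` for linear resampling, finite differences instead of autograd); all names are hypothetical. It compares the slope of the SSD cost just to the left and just to the right of t = 0.

```python
import numpy as np


def ssd_cost(fixed: np.ndarray, moving: np.ndarray, t: float) -> float:
    """SSD between fixed and moving resampled at x + t with linear interpolation."""
    x = np.arange(len(fixed), dtype=float)
    inner = slice(2, -2)  # skip the borders so np.interp never clamps for |t| < 2
    warped = np.interp(x[inner] + t, x, moving)
    return float(np.sum((fixed[inner] - warped) ** 2))


rng = np.random.default_rng(0)
moving = rng.normal(size=64)
fixed = rng.normal(size=64)

eps = 1e-6
# One-sided finite differences around the integer translation t = 0.
left_slope = (ssd_cost(fixed, moving, 0.0) - ssd_cost(fixed, moving, -eps)) / eps
right_slope = (ssd_cost(fixed, moving, eps) - ssd_cost(fixed, moving, 0.0)) / eps
```

Between integers the resampled signal is linear in t, so the cost is piecewise quadratic; the two one-sided slopes generally disagree at integer t, which is the kink that a gradient-based optimiser runs into.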

Obviously, it's not ideal that autograd apparently can't be used directly for classical registration.

In ITK and other classical registration toolkits that do not use autodiff, we always relied on a chain rule for the continuous formulation that eventually implied using the gradient of the moving image (e.g. -2(F - M∘T) ∇M∘T), thereby approximating the gradient of the actual cost function + resampling scheme. This actually has a smoothing/regularisation effect on the gradient which helps avoid local minima at pixel locations.

Note that further regularisation can be obtained by computing ∇M∘T using a convolution with the gradient of a Gaussian kernel (see e.g. https://itk.org/Doxygen/html/classitk_1_1DiscreteGaussianDerivativeImageFilter.html)
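
A minimal 1D sketch of such a Gaussian-derivative gradient, assuming NumPy only. This is not the ITK filter itself: the kernel is simply a sampled Gaussian derivative, rescaled so that a unit-slope ramp yields a gradient of exactly 1, and the function names, `sigma` default and truncation radius are assumptions.

```python
import numpy as np


def gaussian_derivative_kernel(sigma: float, radius: int) -> np.ndarray:
    """Sampled first derivative of a Gaussian, scaled for unit response on a unit ramp."""
    k = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-(k ** 2) / (2 * sigma ** 2))
    dg = -k * g  # derivative of the Gaussian up to a constant factor
    return dg / np.sum(-k * dg)


def smoothed_gradient(signal: np.ndarray, sigma: float = 2.0):
    """Return the regularised gradient and the border width affected by truncation."""
    radius = int(np.ceil(4 * sigma))
    kernel = gaussian_derivative_kernel(sigma, radius)
    return np.convolve(signal, kernel, mode="same"), radius


ramp = 3.0 * np.arange(50, dtype=float)
gradient, border = smoothed_gradient(ramp)
```

Away from the borders the gradient of the ramp is its slope, while for noisy or piecewise-linear signals the same kernel averages the local slope over a neighbourhood of width set by `sigma`.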

One could think of implementing a custom interpolation layer that would return such a regularised gradient.

I guess one other option is to change the cost function to use a proper integral rather than simply resampling and computing the MSE on pixels, but that doesn't sound super convenient either. This is briefly discussed in Modersitzki's classical textbook (e.g. chapter 6.2):
https://archive.siam.org/books/fa06/Modersitzki_FAIR_2009_FA06.pdf

Adding a bit of noise on the coordinates of the grid points (i.e. some form of dithering) is another form of regularisation which seems to help.

In terms of further references on this issue, the literature seems sparse. I saw the paper below earlier but it does not discuss the gradient question:
https://doi.org/10.1109/TIP.2012.2224356

Add Travis CI

Demo - CT Abdominal (learn2reg t3)

Abdominal Organ CT Registration
data: https://learn2reg.grand-challenge.org/Dataset/
https://drive.google.com/file/d/1aWyS_mQ5n7X2bTk9etHrn5di2-EZEzyO/view

This demo can test:
1 - unpaired inter-subject reg with/without labels
2 - weakly supervised

Tutorial - Configuration options

The detailed information on config files and their key-value pairs
add configurations.md in /tutorials
add link to the wiki Tutorial Index

Argparse broken for train and predict scripts due to incorrect argument in adding --gpu_allow_growth flag.

Issue description

The argparse implementation for the flag '--gpu_allow_growth' in deepreg/train and deepreg/predict returns an error. The expected behaviour is described in the README, where gpu_allow_growth should be an optional flag which instructs TF whether or not to allocate the whole GPU memory. Calling the command line applications returns an error indicating that one of the arguments is not valid. We need to a) fix this error and b) reflect the functionality outlined in the README at #52 for argparse, where --gpu_allow_growth is a flag with no input.
Additionally, the docs for argparse commands have been updated in the README at #52; it would be nice to update the docs in the command line interfaces so that calling --help is actually helpful and informative to the end user.

Type of issue
Please delete options that are not relevant.

For bug report or feature request, steps to reproduce the issue, including (but not limited to):

OS: MacOS High Sierra 10.13.6
Environment to reproduce: pip install -e . in an empty venv as instructed in the README.
What commands run / sample script used: call command line applications like predict --help or train --help.

What's the expected result?

Calling predict/train --help should print a list of arguments, descriptions of the arguments and a description of the command line application.
We get informative docs when we call --help such that users can easily navigate CLI.

What's the actual result?

Calling the above commands results in: TypeError: __init__() got an unexpected keyword argument 'show_default', meaning that show_default is not an argument we can pass to the argparse function.
The arg descriptions do not match the updated README ones at #52 and as such could be confusing to end user.

Additional details / screenshot

Should be a relatively easy fix by removing the line "show_default = True" in the argument --gpu_allow_growth for both train and predict scripts (line 145 in train, line 232 in predict).
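
For illustration, a sketch of how the flag could be declared once `show_default` (a click option, not an argparse keyword) is removed. The parser description and help text are assumptions; `action="store_true"` and `ArgumentDefaultsHelpFormatter` are standard argparse features.

```python
import argparse

parser = argparse.ArgumentParser(
    description="Hypothetical train entry point.",
    # argparse has no show_default keyword; this formatter appends defaults to help text.
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
parser.add_argument(
    "--gpu_allow_growth",
    action="store_true",  # a flag with no input: present means True, absent means False
    help="Allocate GPU memory on demand instead of reserving it all up front.",
)

args_with_flag = parser.parse_args(["--gpu_allow_growth"])
args_without_flag = parser.parse_args([])
```

With the formatter class set, `--help` lists the flag together with its description and default, which covers the second half of the request above.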

Demo minimum requirement

v0.1

Each demo will have an independent folder directly under 'Demos'. For simplicity, avoid sub-folders and additional functions/classes.

Open accessible data

Each demo should have a 'data.py' script
Preferably, data are hosted in reliable and efficient online storage (please do not store data in this repo).

Training

Each demo should have a 'train.py' script
This is accompanied by a config file in the same folder;

Predicting

Each demo should have a 'predict.py' script
Ideally,

A pre-trained model will be available for download (please do not store it in this repo);
Provide at least one numerical metric (Dice, distance error, etc.) and one visualisation to show the efficacy of the registration (optimum performance is not discussed here);

A 'readme.md' file

Briefly describe the clinical application and the need for registration
Acknowledge data source.

Tutorial - add new loss

Detailed steps on how to add a new loss
add add_loss.md in /tutorials
add link to the wiki Tutorial Index

Support multiple configuration files

For now, all the configurations are put into the same file.
We should split them into data and model configs so that, when launching experiments, we can pass multiple configs and merge them into one.
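
One way the merge could work is a recursive dictionary update in which later configs override earlier ones. The sketch below assumes the configs have already been loaded into plain dicts; `merge_configs` and the example keys are hypothetical.

```python
from copy import deepcopy


def merge_configs(*configs: dict) -> dict:
    """Recursively merge dicts; later configs override earlier ones on conflicts."""
    merged: dict = {}
    for config in configs:
        for key, value in config.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are sections: merge them key by key.
                merged[key] = merge_configs(merged[key], value)
            else:
                # Scalars, lists or new sections: the later value wins.
                merged[key] = deepcopy(value)
    return merged


# Example keys are illustrative assumptions, not DeepReg's schema.
data_config = {"dataset": {"dir": "data", "format": "nifti"}}
model_config = {"train": {"epochs": 10}, "dataset": {"format": "h5"}}
merged_config = merge_configs(data_config, model_config)
```

Merging section by section, rather than replacing whole top-level keys, lets a small override file change a single value without repeating the rest of the section.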

Tutorial - add new data loader

Detailed steps on how to add a new data loader
add add_loader.md in /tutorials
add link to the wiki Tutorial Index

Tutorial - data sampling methods explained

Issue description

The options for sampling groups, image pairs and labels

Type of Issue

Please delete options that are not relevant.

Documentation update

What's the expected result?

A clearly described method that is consistent with the data loaders

What's the actual result?

Additional details / screenshot

Modify output tests to pytest style and parallelise to make CI more efficient.

Issue description

CI times out at 30 minutes which causes CI to fail for h5 data loaders. The total expected time it should take to finish all tests is around 40-50 minutes so an extension to the timeout is likely to fix the problem

Multi-folder support

Issue description

Additional support so that training data can be sampled from multiple folders.

Type of Issue

Please delete options that are not relevant.

New feature request
Documentation update
Test request
Linting request

What's the expected result?

multiple folders can be specified in yaml config file;
the training data from all these folders will be considered (as though they are coming from the same single folder)
options in config files that can specify training, validation, test dataset folders
optionally, the multi-folder support can be extended to validation and test folders

What's the actual result?

Additional details / screenshot

Adding h5 data loaders

For more information, see #5

Data Loader Support

To facilitate the user experience, we plan to prepare some default data loaders for different use scenarios. Currently, Nifti and H5 formats are supported. For different types of use cases and image formats, a customised data loader is needed (add a link to the tutorial).

Data Format

There are some prerequisites on the data:

Data must be split into train / val / test beforehand and stored in different directories, although val and test data are optional.
Each image or label is in 3D. An image has shape (width, height, depth); a label has shape (width, height, depth) or (width, height, depth, num_labels).
The data do not have to be of the same shape; all will be resized to the same shape before being fed in. To prevent unexpected effects, it is recommended that all images are pre-processed to the desired shape.

Supported scenarios

Unpaired images (e.g. single-modality inter-subject registration)

Case 1-1 multiple independent images.
Case 1-2 multiple independent images and corresponding labels.

Grouped unpaired images (e.g. single-modality intra-subject registration)

Case 2-1 multiple subjects each with multiple images.
Case 2-2 multiple subjects each with multiple images and corresponding labels.

Paired images (e.g. two-modality intra-subject registration)

Case 3-1 multiple paired images.
Case 3-2 multiple paired images and corresponding labels.

Sampling during training

Sampling for multiple labels

In any case when corresponding labels are available and there are multiple types of labels, e.g. the segmentation of different organs in a CT image, two options are available:

During one epoch, each image is sampled only once and, when there are multiple labels, one label is randomly sampled at a time. (Default)
During one epoch, each image is paired with each available label. So, if an image has four types of labels, it will be sampled four times, each time with a different label.
When using multiple labels, it is the user's responsibility to ensure the labels are ordered, such that each label_idx in (width, height, depth, label_idx) corresponds to the same type of landmark or ROI across all labels.
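
The two label-sampling options can be sketched as follows. This is only an illustration of the behaviour described above, not the data loader implementation; the function name and the `sample_all` switch are assumptions.

```python
import random


def sample_label_indices(num_labels: int, sample_all: bool, rng: random.Random) -> list:
    """Return the label indices used for one image within one epoch."""
    if sample_all:
        # Second option: pair the image with every available label.
        return list(range(num_labels))
    # First option (default): pick a single label at random.
    return [rng.randrange(num_labels)]


rng = random.Random(0)
one_label = sample_label_indices(4, sample_all=False, rng=rng)
all_labels = sample_label_indices(4, sample_all=True, rng=rng)
```

With four label types, the default yields one (image, label) sample per image per epoch, while the second option yields four.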

Sampling for multiple subjects each with multiple images

When multiple subjects each with multiple images are available, multiple different sampling methods are supported:

Inter-subject: one image is sampled from subject A as the moving image, and another image is sampled from a different subject B as the fixed image.
Intra-subject, two images are sampled from the same subject. In this case, we can specify:
a) moving image always has a smaller index, e.g. at an earlier time;
b) moving image always has a larger index, e.g. at a later time; or
c) no constraint on the order.

For the first two options, the intra-subject images will be sorted by name in ascending order to represent ordered sequential images, such as time-series data.
*Multiple label sampling is also supported once an image pair is sampled. In case there are no consistent label types defined between subjects, an option is available to turn off the label contribution to the loss for those inter-subject image pairs.
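
A rough sketch of the inter-subject and intra-subject pair sampling described above. The function names and the option strings "forward", "backward" and "unconstrained" are assumptions made for illustration; sorting by name stands in for the time ordering.

```python
import random


def sample_intra_subject_pair(images: list, order: str, rng: random.Random) -> tuple:
    """Pick (moving, fixed) from one subject's images.

    order: "forward" (moving has the smaller index), "backward" (moving has the
    larger index) or "unconstrained" (either direction).
    """
    if order not in ("forward", "backward", "unconstrained"):
        raise ValueError(f"Unknown order: {order!r}")
    ordered = sorted(images)  # ascending sort by name represents sequential data
    i, j = sorted(rng.sample(range(len(ordered)), 2))
    swap = order == "backward" or (order == "unconstrained" and rng.random() < 0.5)
    return (ordered[j], ordered[i]) if swap else (ordered[i], ordered[j])


def sample_inter_subject_pair(subjects: dict, rng: random.Random) -> tuple:
    """Pick the moving image from one subject and the fixed image from another."""
    subject_a, subject_b = rng.sample(sorted(subjects), 2)
    return rng.choice(subjects[subject_a]), rng.choice(subjects[subject_b])
```

For example, with `subjects = {"s1": ["a1", "a2", "a3"], "s2": ["b1", "b2"]}`, calling `sample_intra_subject_pair(subjects["s1"], "forward", random.Random(0))` always returns a pair whose moving image sorts before the fixed image.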

Examples (folder structure and filename requirement)

In the following, we take the train directory as an example to list how the files should be stored.

Nifti Data Format

Assuming each .nii.gz file contains only one tensor, which is either image or label.

Unpaired data

This is the simplest case. Data are assumed to be stored under train/images and train/labels directories.

Nifti Case 1-1 Images only

We only have images without any labels and all images are considered to be independent samples. So all data should be stored under train/images, e.g.:

train
- images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...

(It is also ok if the data are further grouped into different directories under images as we will directly scan all nifti files under train/images.)

Nifti Case 1-2 Images with labels

In this case, we have both images and labels. So all images should be stored under train/images and all labels should be stored under train/labels. The corresponding image file name and label file name should be exactly the same, e.g.:

train
- images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- labels
  - subject1.nii.gz
  - subject2.nii.gz
  - ...

Grouped unpaired images

Nifti Case 2-1 Images only

We have images without any labels, but images are grouped under different subjects/groups, e.g. time-series observations for each subject/group. For instance, the data set can be the CT scans of multiple patients (subjects/groups) where each patient has multiple scans acquired at different time points. So all data should be stored under train/images and the leaf directories (directories that do not have sub-directories) must represent different subjects/groups, e.g.:

train
- images
  - subject1
    - obs1.nii.gz
    - obs2.nii.gz
    - ...
  - subject2
    - obs1.nii.gz
    - obs2.nii.gz
    - ...
  - ...

(It is also ok if the data are grouped into different directories, but the leaf directories will be considered as different subjects/groups.)
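
To make the leaf-directory rule concrete, here is a small sketch that groups files by the directory that directly contains them. It assumes that image files only live in leaf directories, as in the layout above; the function name is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def group_by_leaf_directory(image_dir: Path) -> dict:
    """Group .nii.gz files by their parent directory, treated as one subject/group."""
    groups = defaultdict(list)
    for path in sorted(image_dir.rglob("*.nii.gz")):
        # The directory holding the file identifies the subject/group.
        group_name = path.parent.relative_to(image_dir).as_posix()
        groups[group_name].append(path.name)
    return dict(groups)
```

For the layout above, `group_by_leaf_directory(Path("train/images"))` returns a dict keyed by "subject1", "subject2" and so on; with deeper nesting the keys become relative paths such as "site1/subject1".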

Nifti Case 2-2 Images with labels

We have both images and labels. So all images should be stored under train/images and all labels should be stored under train/labels. The leaf directories will be considered as different subjects/groups and the corresponding image file name and label file name should be exactly the same, e.g.:

train
- images
  - subject1
    - obs1.nii.gz
    - obs2.nii.gz
    - ...
  - ...
- labels
  - subject1
    - obs1.nii.gz
    - obs2.nii.gz
    - ...
  - ...

Paired images

In this case, images are paired, for example, to represent multimodal moving and fixed image pairs to register. Data are assumed to be stored under train/moving_images, train/fixed_images, train/moving_labels, and train/fixed_labels directories.

Nifti Case 3-1 Images only

We only have paired images without any labels. So all data should be stored under train/moving_images, train/fixed_images and the images corresponding to the same subject should have exactly the same name, e.g.:

train
- moving_images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- fixed_images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...

(It is ok if the data are further grouped into different directories under train/moving_images and train/fixed_images as we will directly scan all nifti files under them.)

Nifti Case 3-2 Images with labels

We have both images and labels. So all data should be stored under train/moving_images, train/fixed_images, train/moving_labels, and train/fixed_labels. The images and labels corresponding to the same subjects/groups should have exactly the same names, e.g.:

train
- moving_images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- fixed_images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- moving_labels
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- fixed_labels
  - subject1.nii.gz
  - subject2.nii.gz
  - ...

H5 Data Format

Each .h5 file is similar to a dictionary, having multiple key-value pairs. Hierarchical multi-level h5 indexing is not used. Each value is either image or label.

Unpaired images

H5 Case 1-1 Images only

Each key corresponds to one image, e.g. {"subject1": data1, "subject2": data2, ...}. All data should be stored under train/images, in either a single h5 file or multiple h5 files, e.g.:

train
- images
  - part1.h5
  - part2.h5
  - ...

H5 Case 1-2 Images with labels

Each key corresponds to one subject. Data can be stored in two h5 files (one for images and one for labels); the keys in the two files should be the same.

train
- images
  - data.h5 (keys = ["subject1", "subject2", ...])
- labels
  - data.h5 (keys = ["subject1", "subject2", ...])

Grouped unpaired images

H5 Case 2-1 Images only

Similar to case 1-1 above, but the keys, in this case, have to share the same format, such as subject%d-%d where %d represents a number. For instance, subject3-2 corresponds to the second observation of subject 3. Otherwise, the file structure is the same as case 1-1, e.g.:

train
- images
  - part1.h5 (keys = ["subject1-1", "subject1-2", "subject2-1", ...])
  - part2.h5
  - ...
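
The subject%d-%d key convention can be validated and grouped with a regular expression. The sketch below works on a plain list of key strings (for instance the keys read from an h5 file); the names `KEY_PATTERN` and `group_keys` are hypothetical.

```python
import re
from collections import defaultdict

# Matches keys such as "subject3-2": subject number, then observation number.
KEY_PATTERN = re.compile(r"^subject(\d+)-(\d+)$")


def group_keys(keys) -> dict:
    """Group keys into {subject_id: [observation_ids]}, rejecting malformed keys."""
    groups = defaultdict(list)
    for key in keys:
        match = KEY_PATTERN.match(key)
        if match is None:
            raise ValueError(f"Key {key!r} does not follow the subject%d-%d format")
        subject_id, observation_id = map(int, match.groups())
        groups[subject_id].append(observation_id)
    return {subject: sorted(observations) for subject, observations in groups.items()}
```

For instance, `group_keys(["subject1-1", "subject1-2", "subject2-1"])` groups two observations under subject 1 and one under subject 2.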

H5 Case 2-2 Images with labels

Similar to case 1-2 and 2-1 above, the keys have to share the same format like subject%d-%d and the keys for images and labels should be consistent.

train
- images
  - part1.h5 (keys = ["subject1-1", "subject1-2", ...])
  - part2.h5 (keys = ["subject101-1", "subject101-2", ...])
  - ...
- labels
  - part1.h5 (keys = ["subject1-1", "subject1-2", ...])
  - part2.h5 (keys = ["subject101-1", "subject101-2", ...])
  - ...

Paired images

In this case, data are paired. Data are assumed to be stored under train/moving_images, train/fixed_images, train/moving_labels, and train/fixed_labels directories.

H5 Case 3-1 Images only

We only have paired images without any labels. So all data should be stored under train/moving_images, train/fixed_images and the keys corresponding to the same subject should be the same, e.g.:

train
- moving_images
  - part1.h5 (keys = ["subject1", "subject2", ...])
  - part2.h5
  - ...
- fixed_images
  - part1.h5 (keys = ["subject1", "subject2", ...])
  - part2.h5
  - ...

H5 Case 3-2 Images with labels

We have both images and labels. So all data should be stored under train/moving_images, train/fixed_images, train/moving_labels, and train/fixed_labels. The keys corresponding to the same subject should be the same, e.g.:

train
- moving_images
  - data.h5 (keys = ["subject1", "subject2", ...])
- fixed_images
  - data.h5 (keys = ["subject1", "subject2", ...])
- moving_labels
  - data.h5 (keys = ["subject1", "subject2", ...])
- fixed_labels
  - data.h5 (keys = ["subject1", "subject2", ...])

Listing the scope of the code in the main README

It's good practice to explain in the README what the code is about before going into how to use it

Community guidelines

https://joss.readthedocs.io/en/latest/review_checklist.html

Community guidelines
There should be clear guidelines for third-parties wishing to:

Contribute to the software - see #19
Report issues or problems with the software - create bug/issue template.
Seek support - create code of conduct.

Create contribution guidelines

Demo - Longitudinal MR Prostate

tbc

Tutorial - predefined data loader

Detailed information on how to use predefined data loaders
add predefined_loader.md in /tutorials
add link to the wiki Tutorial Index

The current format is summarised as follows:

Data Loader Support

Data Format

There are some prerequisites on the data:

Data must be split into train / val / test before and stored in different directories. Although val or test data are optional.
Each image or label is in 3D. Image has shape (width, height, depth); label has shape (width, height, depth) or (width, height, depth, num_labels).
The data do not have to be of the same shape - All will be resized to the same shape before feed-in. In order to prevent unexpected effects, it may be recommended that all images are pre-processed to the desirable shape.

Supported scenarios

Unpaired images (e.g. single-modality inter-subject registration)

Case 1-1 multiple independent images.
Case 1-2 multiple independent images and corresponding labels.

Grouped unpaired images (e.g. single-modality intra-subject registration)

Case 2-1 multiple subjects each with multiple images.
Case 2-2 multiple subjects each with multiple images and corresponding labels.

Paired images (e.g. two-modality intra-subject registration)

Case 3-1 multiple paired images.
Case 3-2 multiple paired images and corresponding labels.

Examples (folder structure and filename requirement)

In the following, we take train directory as an example to list how the files should be stored.

Nifti Data Format

Assuming each .nii.gz file contains only one tensor, which is either image or label.

Unpaired data

This is the simplest case. Data are assumed to be stored under train/images and train/labels directories.

Nifti Case 1-1 Images only

We only have images without any labels and all images are considered to be independent samples. So all data should be stored under train/images, e.g.:

train
- images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...

(It is also ok if the data are further grouped into different directories under images as we will directly scan all nifti files under train/images.)

Nifti Case 1-2 Images with labels

train
- images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- labels
  - subject1.nii.gz
  - subject2.nii.gz
  - ...

Grouped unpaired images

Nifti Case 2-1 Images only

train
- images
  - subject1
    - obs1.nii.gz
    - obs2.nii.gz
    - ...
  - subject2
    - obs1.nii.gz
    - obs2.nii.gz
    - ...
  - ...

(It is also ok if the data are grouped into different directories, but the leaf directories will be considered as different subjects/groups.)

Nifti Case 2-2 Images with labels

train
- images
  - subject1
    - obs1.nii.gz
    - obs2.nii.gz
    - ...
  - ...
- labels
  - subject1
    - obs1.nii.gz
    - obs2.nii.gz
    - ...
  - ...

Paired images

Nifti Case 3-1 Images only

train
- moving_images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- fixed_images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...

(It is ok if the data are further grouped into different directories under train/moving_images and train/fixed_images as we will directly scan all nifti files under them.)

Nifti Case 3-2 Images with labels

train
- moving_images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- fixed_images
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- moving_labels
  - subject1.nii.gz
  - subject2.nii.gz
  - ...
- fixed_labels
  - subject1.nii.gz
  - subject2.nii.gz
  - ...

H5 Data Format

Each .h5 file is similar to a dictionary, having multiple key-value pairs. Hierarchical multi-level h5 indexing is not used. Each value is either image or label.

Unpaired images

H5 Case 1-1 Images only

Each key corresponds to one image, e.g. {"subject1": data1, "subject2": data1, ...}. All data should be stored under train/images, it can be a single h5 file or multiple h5 files e.g.:

train
- images
  - part1.h5
  - part2.h5
  - ...

H5 Case 1-2 Images with labels

Each key corresponds to one subject. Data can be stored in two single h5 files (one for image and one for label), the keys in the files should be the same.

train
- images
  - data.h5 (keys = ["subject1", "subject2", ...])
- labels
  - data.h5 (keys = ["subject1", "subject2", ...])

Grouped unpaired images

H5 Case 2-1 Images only

train
- images
  - part1.h5 (keys = ["subject1-1", "subject1-2", "subject2-1", ...])
  - part2.h5
  - ...

H5 Case 2-2 Images with labels

Similar to case 1-2 and 2-1 above, the keys have to share the same format like subject%d-%d and the keys for images and labels should be consistent.

train
- images
  - part1.h5 (keys = ["subject1-1", "subject1-2", ...])
  - part2.h5 (keys = ["subject101-1", "subject101-2", ...])
  - ...
- labels
  - part1.h5 (keys = ["subject1-1", "subject1-2", ...])
  - part2.h5 (keys = ["subject101-1", "subject101-2", ...])
  - ...

Paired images

In this case, data are paired. Data are assumed to be stored under train/moving_images, train/fixed_images, train/moving_labels, and train/fixed_labels directories.

H5 Case 3-1 Images only

We only have paired images without any labels. So all data should be stored under train/moving_images, train/fixed_images and the keys corresponding to the same subject should be the same, e.g.:

train
- moving_images
  - part1.h5 (keys = ["subject1", "subject2", ...])
  - part2.h5
  - ...
- fixed_images
  - part1.h5 (keys = ["subject1", "subject2", ...])
  - part2.h5
  - ...

H5 Case 3-2 Images with labels

train
- moving_images
  - data.h5 (keys = ["subject1", "subject2", ...])
- fixed_images
  - data.h5 (keys = ["subject1", "subject2", ...])
- moving_labels
  - data.h5 (keys = ["subject1", "subject2", ...])
- fixed_labels
  - data.h5 (keys = ["subject1", "subject2", ...])
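
For the paired case with labels, one sample is assembled by looking up the same key in all four files. The sketch below illustrates this with plain dictionaries standing in for the four data.h5 files; `paired_samples` is a hypothetical helper and not part of DeepReg.

```python
def paired_samples(moving_images, fixed_images, moving_labels, fixed_labels):
    """Yield (key, moving image, fixed image, moving label, fixed label)."""
    sources = (moving_images, fixed_images, moving_labels, fixed_labels)
    reference = set(moving_images)
    for source in sources[1:]:
        if set(source) != reference:
            raise KeyError("all four files must share the same keys")
    for key in sorted(reference):
        yield (key,) + tuple(source[key] for source in sources)


# Plain dicts stand in for the four data.h5 files under train/.
moving_images = {"subject1": "mi1", "subject2": "mi2"}
fixed_images = {"subject1": "fi1", "subject2": "fi2"}
moving_labels = {"subject1": "ml1", "subject2": "ml2"}
fixed_labels = {"subject1": "fl1", "subject2": "fl2"}

samples = list(
    paired_samples(moving_images, fixed_images, moving_labels, fixed_labels)
)
print(samples[0])
```

The check fails on the first mismatch rather than silently skipping subjects, which makes an incomplete dataset visible early.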

JOSS paper

with a paper format
https://joss.theoj.org/

Tutorial - Setup

Detailed information on dependencies, installation, etc.
add setup.md in /tutorials
add link to the wiki Tutorial Index

Linting Codebase to PEP8 - linting deepreg/model/layer_util.py

Editing repo without changing functionality to set up a consistent coding style.

Generate pylintrc
Pre-lint codebase prior to setting up auto lint/pylint in CI

Related issue #10

Tutorial - get started with registration using deep learning

Describe the basics in training registration:

unsupervised learning
weakly supervised learning
combining unsupervised loss and weak supervision
conditional segmentation

with data loader and links to demos

Remove nilearn dependency

Issue description

Remove nilearn dependency

Type of issue

Improvement of code

Test + docs: layer_utils.py

Issue description

We do not have any unit tests for layer utils. These are base functions that are relied on throughout the package. This is a good place to start with unit tests, as we will rely on these functions working as expected for further testing down the line.

Type of issue
Please delete options that are not relevant.

For bug report or feature request, steps to reproduce the issue, including (but not limited to):

N/A

What's the expected result?

Unit tests which prove the functionality of the following functions:

What's the actual result?

Unit test(s), pytest style, for each function that assert inputs, outputs for a couple of cases each.

Additional details / screenshot

move the demo data to a separate online storage

Demo - MR-US Prostate (label-reg)

MR-to-ultrasound registration
data and previous demo: https://github.com/YipengHu/label-reg

This demo tests:
1 - weak supervision
2 - conditional segmentation
and possibly
3 - different data formats?

Tutorial - experiment design

Examples of random-split and cross-validation
add experiment.md in /tutorials
add link to the wiki Tutorial Index
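
A possible starting point for such examples is sketched below in plain Python. It assumes that splitting is done at the subject level; `random_split` and `k_fold` are hypothetical helpers written for this illustration and are not DeepReg functions.

```python
import random


def random_split(subjects, test_fraction=0.2, seed=0):
    """Shuffle subjects with a fixed seed and hold out a test fraction."""
    shuffled = list(subjects)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(subjects, k=5):
    """Yield (train, validation) lists; each subject is validated exactly once."""
    subjects = list(subjects)
    for fold in range(k):
        val = [s for i, s in enumerate(subjects) if i % k == fold]
        train = [s for i, s in enumerate(subjects) if i % k != fold]
        yield train, val


subjects = [f"subject{i}" for i in range(1, 11)]

train, test = random_split(subjects)
print(len(train), len(test))

for fold, (fold_train, fold_val) in enumerate(k_fold(subjects, k=5)):
    print(fold, fold_val)
```

Fixing the seed keeps the random split reproducible across runs, and splitting by subject rather than by image avoids the same subject appearing in both training and test data.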

Porting the niftyreg 3d resampler

Discuss the feasibility and pros/cons of using the higher-order, potentially faster and memory-saving resampler:
https://github.com/NifTK/NiftyNet/tree/dev/niftynet/contrib/niftyreg_image_resampling

Please comment below.

Demo - CT Lung (learn2reg t2)

CT Lung Registration
data: https://zenodo.org/record/3835682#.XuVJ-0BFzx8
https://learn2reg.grand-challenge.org/Dataset/

This demo can test:
1 - unpaired inter-subject reg with/without labels
2 - unpaired intra-subject reg with/without labels
and potentially
3 - paired intra-subject reg with/without labels

Update README: broken link to contributing docs in README, outdated instructions for package usage, commit style guide.

Issue description

The link to contributions doc is broken - ideally update this so it is not broken anymore.
The instructions for utilisation of the package using argparse could be simplified
Our commit messages are a bit all over the place, making it hard to get the gist of updates from commits. Implement a commit style guide in CONTRIBUTIONS.
The issue template does not enforce md style of "type of issue". Fix this.
For setup, we have discarded requirements.txt so need to remove this from the readme to avoid confusion.

Type of issue

Please delete options that are not relevant.

For bug report or feature request, steps to reproduce the issue, including (but not limited to):

N/A

What's the expected result?

Link no longer broken - direct to master file of CONTRIBUTIONS.md
Reduced, simpler instructions for argparse
A written commit style guide, for clearer guidance and a more uniform development history.
Appropriate md style for title in "Type of issue".
Remove confusing "pip install requirements.txt" when requirements are held in the setup file in the next instruction.

What's the actual result?

404 error when clicking the link from the README to CONTRIBUTIONS.md
N/A
Commit instructions a bit vague - be more explicit in wanted style to facilitate dev.
Literal "# # Type of Issue" returned in md formatting.
We have an instruction for "install requirements.txt" but no txt file in the repo, which is confusing and should be addressed.

Additional details / screenshot

N/A

README missing CLI information and broken links to docs to use CLI

Issue description

New merge from issue #36 #74 has added several broken links to the README
The information provided for the CLI is therefore not accessible as the information was removed from the README and the link from README to the docs is broken.
It is not clear what exactly the CLI interfaces "do": which arguments are necessary? Stating the expected type of an argument is fine, but that alone won't help the end user understand from the README how to use it, e.g. could a value be any string such as "a", or must it be a full path?
Q. What is the reason behind removing the gentfrecords application from the README?

Type of Issue

Please delete options that are not relevant.

Bug report
Documentation update

For bug report or feature request, steps to reproduce the issue, including (but not limited to):

Clicking on the link "How to configure deepreg options" leads to a 404.

What's the expected result?

The links pointing to full docs
Move the link to the CLI so that it is clear where more information on the CLI can be accessed: "to get more information click here" directly under the code examples.
Re-introduce argument descriptors (?). Otherwise, get rid of CLI app on README as it is a confusing feature now.

What's the actual result?

Clicking on the broken links on README on master leads to a 404 error.

Additional details / screenshot

Automated code formatting

It's good practice to follow some code formatting standards. A good means of doing so is to use an automated code formatter such as black.

See e.g. how it's done in MONAI: https://github.com/Project-MONAI/MONAI/blob/master/CONTRIBUTING.md#automatic-code-formatting

Config file in predict automatically taken from ckpt path - assuming too much of a strict naming convention/folder structure?

Issue description

In deepreg/predict, no config file is passed explicitly. Instead, the config is picked up from the ckpt path, assuming a file structure like:

config = config_parser.load("/".join(ckpt_path.split("/")[:-2]) + "/config.yaml")
found at:
https://github.com/ucl-candi/DeepReg/blob/5010a17f529d5b3c7cd6d7b4d5c73bdc3a2e389f/deepreg/predict.py#L176

This seems like a fragile approach that might benefit from being modified so that users can pass their own config file. Or is there some reason I don't understand for the design choice?

First I would like to understand the above, and if it needs correcting we can assign/create a better way to access the config.
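
To make the concern concrete, the sketch below reproduces the quoted path expression as a standalone function and adds one possible fallback design. The checkpoint paths and the helper names `default_config_path` and `resolve_config_path` are hypothetical; this is an illustration of the idea, not the change that was made in DeepReg.

```python
def default_config_path(ckpt_path):
    """Reproduce the quoted expression: go up two levels, add config.yaml."""
    return "/".join(ckpt_path.split("/")[:-2]) + "/config.yaml"


def resolve_config_path(ckpt_path, config_path=None):
    """Prefer an explicit config path; otherwise fall back to the default."""
    if config_path:
        return config_path
    return default_config_path(ckpt_path)


ckpt = "logs/run1/save/weights-epoch2.ckpt"  # hypothetical checkpoint path

print(default_config_path(ckpt))                    # logs/run1/config.yaml
print(resolve_config_path(ckpt, "my/config.yaml"))  # my/config.yaml
print(default_config_path("weights.ckpt"))          # /config.yaml
```

The last call shows the fragility: a checkpoint path with fewer than two parent directories silently resolves to a config file at the filesystem root, and a path that uses backslash separators behaves the same way because it is split only on "/".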

Type of issue
Please delete options that are not relevant.

Bug report
New feature request
Documentation update
Test request
Linting request

For bug report or feature request, steps to reproduce the issue, including (but not limited to):

N/A

What's the expected result?

Users pass config files related to their checkpoints manually such that the config file isn't taken from ckpt path.

What's the actual result?

Currently, the tests for this pass when running predict and train but I'm not entirely sure about the functionality of taking the path for config from ckpt string (it seems a bit strict).