
Multi-Planar U-Net

Implementation of the Multi-Planar U-Net as described in:

Mathias Perslev, Erik Dam, Akshay Pai, and Christian Igel. One Network To Segment Them All: A General, Lightweight System for Accurate 3D Medical Image Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI), 2019

Pre-print version: https://arxiv.org/abs/1911.01764

Published version: https://doi.org/10.1007/978-3-030-32245-8_4


Other publications

The Multi-Planar U-Net as implemented here was also used in the following context(s):

  • The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge, described in https://arxiv.org/abs/2004.14003. Data supporting our team's contribution may be found here (hyperparameter files, parameter files, test-set predictions etc.).

Quick Start

Installation

# From GitHub
git clone https://github.com/perslev/MultiPlanarUNet
pip install -e MultiPlanarUNet

This package is still frequently updated, and it is therefore recommended to install it with pip using the -e ('editable') flag so that it can be updated with recent changes from GitHub without re-installing:

cd MultiPlanarUNet
git pull

However, the package is also occasionally updated on PyPi for install with:

# Note: renamed MultiPlanarUNet -> mpunet in version 0.2.4
pip install mpunet

Usage

usage: mp [script] [script args...]

Multi-Planar UNet (0.1.0)
-------------------------
Available scripts:
- cv_experiment
- cv_split
- init_project
- predict
- predict_3D
- summary
- train
- train_fusion
...

Overview

This package implements fully autonomous deep-learning-based segmentation of any 3D medical image. It uses a fixed set of hyperparameters and a fixed model topology, eliminating the need for hyperparameter tuning experiments. No manual involvement is required beyond supplying the training data.

The system has been evaluated on a wide range of segmentation tasks covering various organs, pathologies, tissue types, and imaging modalities. The model obtained a top-5 position at the 2018 Medical Segmentation Decathlon (http://medicaldecathlon.com/) despite its simplicity and computational efficiency.

This software may be used as-is and does not require deep learning expertise to get started. It may also serve as a strong baseline method for general purpose semantic segmentation of medical images.

Method

The base model is a slightly modified 2D U-Net (https://arxiv.org/abs/1505.04597) trained under a multi-planar framework. Specifically, the 2D model is fed images sampled across multiple views onto the image volume simultaneously:

Multi-Planar Animation

At test time, the model predicts along each of the views and recreates a set of full segmentation volumes. These volumes are fused into one using a learned function that weights each class from each view individually to maximise performance.
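The following is a minimal sketch of this fusion step (not the package's actual implementation; the array shapes and names are illustrative): each view contributes a softmax probability volume, a learned scalar weight is applied per (view, class) pair, and the weighted volumes are summed before taking the argmax.

import numpy as np

def fuse_views(probs_per_view, weights):
    """Sketch of per-class, per-view weighted fusion.

    probs_per_view: list of arrays, each [H, W, D, n_classes] (softmax outputs)
    weights:        array [n_views, n_classes] of learned fusion weights
    """
    probs = np.stack(probs_per_view, axis=0)             # [n_views, H, W, D, C]
    weighted = probs * weights[:, None, None, None, :]   # scale each view/class
    return weighted.sum(axis=0).argmax(axis=-1)          # fused integer label map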

Usage

Project initialization, model training, evaluation, prediction etc. can be performed using the scripts located in MultiPlanarUNet.bin. The script named mp.py serves as an entry point to all other scripts, and it is used as follows:

# Invoke the help menu
mp --help

# Launch the train script
mp train [arguments passed to 'train'...]

# Invoke the help menu of a sub-script
mp train --help

You only need to specify the training data in the format described below. Training, evaluation and prediction will be handled automatically if using the above scripts.

Preparing the data

In order to train a model to solve a specific task, a set of manually annotated images must be stored in a folder under the following structure:

./data_folder/
|- train/
|--- images/
|------ image1.nii.gz
|------ image5.nii.gz
|--- labels/
|------ image1.nii.gz
|------ image5.nii.gz
|- val/
|--- images/
|--- labels/
|- test/
|--- images/
|--- labels/
|- aug/ <-- OPTIONAL
|--- images/
|--- labels/

The names of these folders may be customized in the parameter file (see below), but default to those shown above. The image and corresponding label map files must be identically named.

The aug folder may store additional images that can be included during training with a lower weight assigned in optimization.
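As a quick sanity check of this layout, the following sketch (paths are illustrative) verifies that every training image has an identically named label file:

from pathlib import Path

data_dir = Path("./data_folder/train")
images = sorted((data_dir / "images").glob("*.nii*"))
labels = {p.name for p in (data_dir / "labels").glob("*.nii*")}
missing = [p.name for p in images if p.name not in labels]
if missing:
    raise SystemExit("Missing label files for: %s" % missing)
print("OK: %d image/label pairs found." % len(images))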

File formatting

All images must be stored in the .nii/.nii.gz format. It is important that the .nii files store correct 4x4 affines for mapping voxel coordinates to the scanner space. Specifically, the framework needs to know the voxel size and axis orientations in order to sample isotropic images in the scanner space.

Images should be arrays of dimension 4 with the first 3 corresponding to the image dimensions and the last the channels dimension (e.g. [256, 256, 256, 3] for a 256x256x256 image with 3 channels). Label maps should be identically shaped in the first 3 dimensions and have a single channel (e.g. [256, 256, 256, 1]). The label at a given voxel should be an integer representing the class at the given position. The background class is normally denoted '0'.
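For illustration, a correctly formatted image/label pair could be written with nibabel as follows (a sketch using random data and a 1 mm isotropic identity affine; real data should keep the scanner affine):

from pathlib import Path
import numpy as np
import nibabel as nib

for sub in ("images", "labels"):
    Path("data_folder/train", sub).mkdir(parents=True, exist_ok=True)

affine = np.eye(4)  # 1 mm isotropic voxels, identity orientation
image = np.random.rand(64, 64, 64, 3).astype(np.float32)   # 3-channel image
labels = np.zeros((64, 64, 64, 1), dtype=np.uint8)          # integer classes, 0 = background
nib.save(nib.Nifti1Image(image, affine), "data_folder/train/images/image1.nii.gz")
nib.save(nib.Nifti1Image(labels, affine), "data_folder/train/labels/image1.nii.gz")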

Initializing a Project

Once the data is stored under the above folder structure, a Multi-Planar project can be initialized as follows:

# Initialize a project at 'my_folder'
# The --data_dir flag is optional
mp init_project --name my_project --data_dir ./data_folder

This will create a folder at path my_project and populate it with a YAML file named train_hparams.yaml, which stores all hyperparameters. Any parameter in this file may be specified manually, but all of them can also be set automatically.
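Since train_hparams.yaml is plain YAML, a parameter can also be changed programmatically. A sketch with PyYAML (the key layout is an assumption; note that PyYAML drops comments and anchors on round-trip, so a text editor or ruamel.yaml may be preferable):

import yaml  # PyYAML

path = "my_project/train_hparams.yaml"
with open(path) as f:
    hparams = yaml.safe_load(f)
hparams["fit"]["batch_size"] = 8  # assumed key layout
with open(path, "w") as f:
    yaml.safe_dump(hparams, f)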

NOTE: By default, init_project prepares a Multi-Planar model. A 3D model is also supported and can be selected by specifying the --model=3D flag (default: --model=MultiPlanar).

Training

The model can now be trained as follows:

mp train --num_GPUs=2   # Any number of GPUs (or 0)

During training various information and images will be logged automatically to the project folder. Typically, after training, the folder will look as follows:

./my_project/
|- images/               # Example segmentations through training
|- logs/                 # Various log files
|- model/                # Stores the best model parameters
|- tensorboard/          # TensorBoard graph and metric visualization
|- train_hparams.yaml    # The hyperparameters file
|- views.npz             # An array of the view vectors used
|- views.png             # Visualization of the views used

Fusion Model Training

When using the MultiPlanar model, a fusion model must be computed after the base model has been trained. This model learns to map the base model's multiple per-view predictions into a single, stronger segmentation volume:

mp train_fusion --num_GPUs=2

Predict and evaluate

The trained model can now be evaluated on the testing data in data_folder/test by invoking:

mp predict --num_GPUs=2 --out_dir predictions

This will create a folder my_project/predictions storing the predicted images along with dice coefficient performance metrics.

The model can also be used to predict on images in the test folder without corresponding label files using the --no_eval flag, or on single files, as follows:

# Predict on all images in 'test' folder without label files
mp predict --no_eval

# Predict on a single image
mp predict -f ./new_image.nii.gz

# Predict on a single image and evaluate against its label file
mp predict -f ./im/new_image.nii.gz -l ./lab/new_image.nii.gz

Performance Summary

A summary of the performance can be produced by invoking the following command from inside the my_project folder or predictions sub-folder:

mp summary

>> [***] SUMMARY REPORT FOR FOLDER [***]
>> ./my_project/predictions/csv/
>> 
>> 
>> Per class:
>> --------------------------------
>>    Mean dice by class  +/- STD    min    max   N
>> 1               0.856    0.060  0.672  0.912  34
>> 2               0.891    0.029  0.827  0.934  34
>> 3               0.888    0.027  0.829  0.930  34
>> 4               0.802    0.164  0.261  0.943  34
>> 5               0.819    0.075  0.552  0.926  34
>> 6               0.863    0.047  0.663  0.917  34
>> 
>> Overall mean: 0.853 +- 0.088
>> --------------------------------
>> 
>> By views:
>> --------------------------------
>> [0.8477811  0.50449719 0.16355361]          0.825
>> [ 0.70659414 -0.35532932  0.6119361 ]       0.819
>> [ 0.11799461 -0.07137918  0.9904455 ]       0.772
>> [ 0.95572575 -0.28795306  0.06059151]       0.827
>> [-0.16704373 -0.96459936  0.20406974]       0.810
>> [-0.72188903  0.68418977  0.10373322]       0.819
>> --------------------------------
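The per-class block above can be reproduced from per-image dice scores with a few lines of pandas; a sketch assuming a hypothetical CSV with 'class' and 'dice' columns (the actual file names and column layout under predictions/csv/ may differ):

import pandas as pd

df = pd.read_csv("my_project/predictions/csv/results.csv")  # hypothetical file
per_class = df.groupby("class")["dice"].agg(["mean", "std", "min", "max", "count"])
print(per_class)
print("Overall mean: %.3f +/- %.3f" % (df["dice"].mean(), df["dice"].std()))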

Cross Validation Experiments

Cross validation experiments may be easily performed. First, invoke the mp cv_split command to split your data_folder into a number of random splits:

mp cv_split --data_dir ./data_folder --CV=5

Here, we prepare for a 5-CV setup. By default, the above command will create a folder at data_folder/views/5-CV/ storing, in this case, 5 folders split0, split1, ..., split4, each structured like the main data folder with sub-folders train, val, test and aug (optionally set with the --aug_sub_dir flag). Inside these sub-folders, images are symlinked to their original locations to save storage.
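The symlinking idea can be sketched as follows (simplified to a single split with train/test only; mp cv_split itself also handles val and aug, so this is illustrative rather than its actual logic):

import os
import random
from pathlib import Path

images = sorted(Path("data_folder/train/images").glob("*.nii*"))
random.shuffle(images)
n_test = len(images) // 5                        # 1/5 held out per split in 5-CV
subsets = {"test": images[:n_test], "train": images[n_test:]}
for name, subset in subsets.items():
    for sub in ("images", "labels"):
        out = Path("data_folder/views/5-CV/split0", name, sub)
        out.mkdir(parents=True, exist_ok=True)
        for img in subset:
            src = Path("data_folder/train", sub, img.name).resolve()
            os.symlink(src, out / img.name)      # link instead of copying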

Running a CV Experiment

A cross-validation experiment can now be performed. On systems with multiple GPUs, each fold can be assigned a share of the total pool of GPUs. In this case, multiple folds will run in parallel, and new ones start automatically when previous folds terminate.

First, we create a new project folder. This time, we do not specify a data folder yet:

mp init_project --name CV_experiment

We also create a file named script, giving the following folder structure:

./CV_experiment
|- train_hparams.yaml
|- script

The train_hparams.yaml file will serve as a template that will be applied to all folds. We can set any parameters we want here, or let the framework decide on proper parameters for each fold automatically. The script file details the mp commands (and optionally various arguments) to execute on each fold. For instance, a script file may look like:

mp train --no_images  # Do not save example segmentations
mp train_fusion
mp predict --out_dir predictions

We can now execute the 5-CV experiment by running:

mp cv_experiment --CV_dir=./data_dir/views/5-CV \
                 --out_dir=./splits \
                 --num_GPUs=2 \
                 --monitor_GPUs_every=600

Above, we assign 2 GPUs to each fold. On a system of 8 GPUs, 4 folds will run in parallel. We set --monitor_GPUs_every=600 to scan the system for newly freed GPU resources every 600 seconds (otherwise, only GPUs that were initially available will be cycled, and newly freed ones will be ignored).
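The monitoring idea amounts to periodically polling nvidia-smi for idle devices; a rough sketch (the memory threshold and loop are illustrative, not the package's actual criteria):

import subprocess
import time

def free_gpus(max_used_mb=100):
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"]).decode()
    return [int(idx) for idx, used in
            (line.split(", ") for line in out.strip().splitlines())
            if int(used) <= max_used_mb]

while not free_gpus():   # wait until at least one GPU is idle
    time.sleep(600)      # re-scan every 600 seconds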

The cv_experiment script will create a new project folder for each split located at --out_dir (CV_experiment/splits in this case). For each fold, each of the commands outlined in the script file will be launched one by one inside the respective project folder of the fold, so that the predictions are stored in CV_experiment/splits/split0/predictions for fold 0 etc.

Afterwards, we may get a CV summary by invoking:

mp summary

... from inside the CV_experiment/splits folder.

multiplanarunet's People

Contributors

christian-igel, perslev


multiplanarunet's Issues

No available GPUs... Sleeping 120 seconds

I installed mpunet via pip and am trying to run it on my data.
However, I get the error:
No available GPUs... Sleeping 120 seconds

os.environ.get("CUDA_VISIBLE_DEVICES") returns None.

Must I add my GPU to CUDA_VISIBLE_DEVICES for mpunet to work?
I have a NVIDIA GeForce RTX 2080 Ti.

Tensorflow sees 1 GPU.

>>> print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
2021-08-06 13:37:59.672714: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2021-08-06 13:37:59.728407: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.755GHz coreCount: 68 deviceMemorySize: 11.00GiB deviceMemoryBandwidth: 573.69GiB/s
2021-08-06 13:37:59.729019: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2021-08-06 13:38:00.296442: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2021-08-06 13:38:00.724247: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2021-08-06 13:38:00.764548: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2021-08-06 13:38:01.082499: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2021-08-06 13:38:01.355622: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2021-08-06 13:38:01.705136: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2021-08-06 13:38:01.706373: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
Num GPUs Available:  1
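For reference, a minimal sketch of explicitly exposing a single GPU before TensorFlow initialises (a common workaround, offered here as an assumption rather than a confirmed mpunet requirement):

import os

# Must run before TensorFlow is imported/initialised
os.environ["CUDA_VISIBLE_DEVICES"] = "0"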

When training on toy_data, TypeError: can only concatenate list (not "dict") to list

Running mp train --just_one --overwrite on a toy project with data generated using mp toy_data --out_dir ./toy_data leads to TensorFlow reporting a TypeError:

Epoch 1/500
WARNING:tensorflow:From /home/akshay/.local/lib/python3.6/site-packages/tensorflow/python/ops/math_grad.py:1250: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating: Use tf.where in 2.0, which has the same broadcast rule as np.where
2019-11-12 14:46:01.034121: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-11-12 14:46:11.320775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235 pciBusID: 2447:00:00.0
2019-11-12 14:46:11.320857: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-11-12 14:46:11.320885: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-11-12 14:46:11.320909: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-11-12 14:46:11.320932: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-11-12 14:46:11.320954: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-11-12 14:46:11.320976: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-11-12 14:46:11.321000: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-11-12 14:46:11.321705: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-11-12 14:46:11.323352: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235 pciBusID: 2447:00:00.0
2019-11-12 14:46:11.323398: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-11-12 14:46:11.323424: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-11-12 14:46:11.323446: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-11-12 14:46:11.323468: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-11-12 14:46:11.323490: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-11-12 14:46:11.323512: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-11-12 14:46:11.323534: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-11-12 14:46:11.324196: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-11-12 14:46:11.324233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-12 14:46:11.324247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-11-12 14:46:11.324255: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-11-12 14:46:11.324970: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10805 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 2447:00:00.0, compute capability: 3.7)
2019-11-12 14:46:11.325972: I tensorflow/core/profiler/lib/profiler_session.cc:174] Profiler session started.
2019-11-12 14:46:11.341159: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcupti.so.10.0
1/157 [..............................] - ETA: 49:50 - loss: 0.4975
2019-11-12 14:46:13.153506: I tensorflow/core/platform/default/device_tracer.cc:641] Collecting 4204 kernel records, 123 memcpy records.
156/157 [============================>.] - ETA: 1s - loss: 0.2752
can only concatenate list (not "dict") to list
Traceback (most recent call last):
  File "/usr/local/bin/mp", line 11, in <module>
    load_entry_point('MultiPlanarUNet==0.2.2', 'console_scripts', 'mp')()
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/mp.py", line 55, in entry_func
    mod.entry_func(parsed.args)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/train.py", line 398, in entry_func
    raise e
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/train.py", line 394, in entry_func
    run(project_dir=project_dir, gpu_mon=gpu_mon, logger=logger, args=args)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/train.py", line 358, in run
    hparams=hparams, no_im=args.no_images, **hparams["fit"])
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/train/trainer.py", line 122, in fit
    raise e
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/train/trainer.py", line 107, in fit
    val_ignore_class_zero)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/train/trainer.py", line 215, in _fit_loop
    self.model.fit_generator(**fit_kwargs)
  File "/home/akshay/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1433, in fit_generator
    steps_name='steps_per_epoch')
  File "/home/akshay/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_generator.py", line 331, in model_iteration
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/home/akshay/.local/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py", line 311, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/callbacks/callbacks.py", line 416, in on_epoch_end
    TPs, relevant, selected, metrics = self.predict()
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/callbacks/callbacks.py", line 318, in predict
    tensors = [self.model.total_loss] + metrics_tensors + self.model.outputs
TypeError: can only concatenate list (not "dict") to list

I continued with mp train_fusion, under the assumption that the model was trained, but ran into an ImportError:
Traceback (most recent call last):
  File "/usr/local/bin/mp", line 11, in <module>
    load_entry_point('MultiPlanarUNet==0.2.2', 'console_scripts', 'mp')()
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/mp.py", line 52, in entry_func
    mod = importlib.import_module("MultiPlanarUNet.bin." + script)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/train_fusion.py", line 26, in <module>
    from MultiPlanarUNet.train import YAMLHParams
ImportError: cannot import name 'YAMLHParams'

Ubuntu 18.04.3 LTS (GNU/Linux 5.0.0-1023-azure x86_64)
Cuda V10.0.130
CuDNN V7
tensorflow-gpu 1.14.0 (binary)
Nvidia Tesla K80 (12Gb)

Why do we need 'predict_single' in predict.py?

Hi Mathias,

I think it's really cool work, both the ideas and the code. I just have one question about a small part of the code in predict.py.

In the run_predictions_and_eval function, before iterating over the images in image_pair_dict, why do we need to run predict_single on image_pair_loader.images[0] at line 425? That seems to be a redundant step, because you iterate over all images in the for loop right after it?

Thanks in advance!

Callbacks: Cannot feed value of shape (16, 128, 128, 1) for Tensor 'conv2d/truediv:0', which has shape '(?, 128, 128, 3)'

The project is very beneficial for me! However, I got an error while training at the code (outs = sess.run(tensors, feed_dict=ins)) in the file callbacks.py. The details follow:

1/157 [..............................] - ETA: 29:39 - loss: 0.17492019-10-28 20:54:58.429846: I tensorflow/core/platform/default/device_tracer.cc:641] Collecting 1361 kernel records, 15 memcpy records.
156/157 [============================>.] - ETA: 0s - loss: 0.0912<class 'list'>
<class 'dict'>
self.model.total_loss: Tensor("loss_1/mul:0", shape=(), dtype=float32)
metrics_tensors: {}
self.model.outputs: [<tf.Tensor 'conv2d/truediv:0' shape=(?, 128, 128, 3) dtype=float32>]

W1028 20:55:51.747634 140053831063296 deprecation_wrapper.py:119] From /home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/callbacks/callbacks.py:334: The name tf.keras.backend.get_session is deprecated. Please use tf.compat.v1.keras.backend.get_session instead.

tensors: [<tf.Tensor 'loss_1/mul:0' shape=() dtype=float32>, {}, <tf.Tensor 'conv2d/truediv:0' shape=(?, 128, 128, 3) dtype=float32>]
Cannot feed value of shape (16, 128, 128, 1) for Tensor 'conv2d/truediv:0', which has shape '(?, 128, 128, 3)'
Traceback (most recent call last):
File "/home/taoh/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/ptvsd_launcher.py", line 43, in
main(ptvsdArgs)
File "/home/taoh/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/lib/python/ptvsd/main.py", line 432, in main
run()
File "/home/taoh/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/lib/python/ptvsd/main.py", line 316, in run_file
runpy.run_path(target, run_name='__main__')
File "/home/taoh/anaconda3/envs/3Dircadb_Use_Unet/lib/python3.6/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "/home/taoh/anaconda3/envs/3Dircadb_Use_Unet/lib/python3.6/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/home/taoh/anaconda3/envs/3Dircadb_Use_Unet/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/bin/train.py", line 344, in
entry_func()
File "/home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/bin/train.py", line 340, in entry_func
raise e
File "/home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/bin/train.py", line 336, in entry_func
run(project_dir=project_dir, gpu_mon=gpu_mon, logger=logger, args=args)
File "/home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/bin/train.py", line 290, in run
hparams=hparams, no_im=args.no_images, **hparams["fit"])
File "/home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/train/trainer.py", line 122, in fit
raise e
File "/home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/train/trainer.py", line 107, in fit
val_ignore_class_zero)
File "/home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/train/trainer.py", line 215, in _fit_loop
self.model.fit_generator(**fit_kwargs)
File "/home/taoh/anaconda3/envs/3Dircadb_Use_Unet/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1433, in fit_generator
steps_name='steps_per_epoch')
File "/home/taoh/anaconda3/envs/3Dircadb_Use_Unet/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_generator.py", line 331, in model_iteration
callbacks.on_epoch_end(epoch, epoch_logs)
File "/home/taoh/anaconda3/envs/3Dircadb_Use_Unet/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py", line 311, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/callbacks/callbacks.py", line 428, in on_epoch_end
TPs, relevant, selected, metrics = self.predict()
File "/home/taoh/Project/DeepLearning/Examples/AwesomeExLearning/MultiPlanarUNet-master/MultiPlanarUNet/callbacks/callbacks.py", line 353, in predict
outs = sess.run(tensors, feed_dict=ins)
File "/home/taoh/anaconda3/envs/3Dircadb_Use_Unet/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/home/taoh/anaconda3/envs/3Dircadb_Use_Unet/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1149, in _run
str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (16, 128, 128, 1) for Tensor 'conv2d/truediv:0', which has shape '(?, 128, 128, 3)'

xla_gpu not compatible with newest TensorFlow version

the error is:
ValueError: To call multi_gpu_model with gpus=2, we expect the following devices to be available: ['/cpu:0', '/gpu:0', '/gpu:1']. However this machine only has: ['/cpu:0', '/xla_gpu:0', '/xla_gpu:1', '/xla_cpu:0']. Try reducing gpus.

Others on GitHub say it is a bug in the newest TF, and the following versions will solve the above issue:
TensorFlow 2.0
CUDA 10.0
cuDNN 7.6.4 (described as dedicated for CUDA 10.0)

However, when I downgrade the TF version to 2.0, the issue still exists.
Could you make your model run on these versions?
Could you tell me which versions I should use, or update the requirements?

About the val directory

I was wondering what the val directory is and what to fill it with.
I am getting an error that says there is no image in val.

Can I use the same data as in the training files?

Input volume is sampled in 2D

Sir, great study, and I want to say thank you, but I have one question:

Could you show exactly where you slice the input volume (which code or function in the repository) into 2D images?

What is real_space_span?

Hi,

The code works very well, and I could train my custom dataset with very good performance. I also find the paper intuitively easy to read. But I just don't quite understand what the value of real_space_span in train_hparams.yaml indicates. E.g., how is it calculated from the image data? Or is it just set to a fixed number?

Thanks!

[urgent] Inputs to operation AddN_7 of type AddN must have the same size and shape.

I used v2.5.0, chose the loss function "SparseGeneralizedDiceLoss" and the metric "sparse_categorical_accuracy", and tried both Adam and SGD.

please help, thanks!

The error is:

Epoch 1/25
2020-04-03 09:36:51.274232: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-03 09:36:53.101431: W tensorflow/stream_executor/gpu/redzone_allocator.cc:312] Not found: ./bin/ptxas not found
Relying on driver to perform ptx compilation. This message will be only logged once.
2020-04-03 09:36:53.352288: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-03 09:36:57.993085: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Inputs to operation AddN_7 of type AddN must have the same size and shape. Input 0: [1,1] != input 2: [0,1]
[[{{node AddN_7}}]]
[[SGD/SGD/update_2_1/AssignAddVariableOp/_1607]]
2020-04-03 09:36:57.994718: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Inputs to operation AddN_7 of type AddN must have the same size and shape. Input 0: [1,1] != input 2: [0,1]
[[{{node AddN_7}}]]
[[GroupCrossDeviceControlEdges_0/SGD/SGD/update_1_1/Const/_1471]]
2020-04-03 09:36:57.997156: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Inputs to operation AddN_7 of type AddN must have the same size and shape. Input 0: [1,1] != input 2: [0,1]
[[{{node AddN_7}}]]
[[GroupCrossDeviceControlEdges_1/SGD/SGD/update_0/Const/_1407]]
2020-04-03 09:36:57.997288: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Inputs to operation AddN_7 of type AddN must have the same size and shape. Input 0: [1,1] != input 2: [0,1]
[[{{node AddN_7}}]]
[[GroupCrossDeviceControlEdges_0/SGD/SGD/update_0/Const/_1391]]
2020-04-03 09:36:58.002164: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Inputs to operation AddN_7 of type AddN must have the same size and shape. Input 0: [1,1] != input 2: [0,1]
[[{{node AddN_7}}]]
[[GroupCrossDeviceControlEdges_4/SGD/SGD/update_0/Const/_1443]]
2020-04-03 09:36:58.002986: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Inputs to operation AddN_7 of type AddN must have the same size and shape. Input 0: [1,1] != input 2: [0,1]
[[{{node AddN_7}}]]
[[SGD/SGD/group_deps/_1643]]
2020-04-03 09:36:58.003135: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Inputs to operation AddN_7 of type AddN must have the same size and shape. Input 0: [1,1] != input 2: [0,1]
[[{{node AddN_7}}]]
[[GroupCrossDeviceControlEdges_4/SGD/SGD/update_0/Const/_1571]]
2020-04-03 09:36:58.003238: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Inputs to operation AddN_7 of type AddN must have the same size and shape. Input 0: [1,1] != input 2: [0,1]
[[{{node AddN_7}}]]
[[GroupCrossDeviceControlEdges_6/SGD/SGD/update_0/Const/_1535]]
2020-04-03 09:36:58.003819: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Inputs to operation AddN_7 of type AddN must have the same size and shape. Input 0: [1,1] != input 2: [0,1]
[[{{node AddN_7}}]]
1/1250 [..............................] - ETA: 14:00:49
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap call_for_each_replica or experimental_run or experimental_run_v2 inside a tf.function to get the best performance.
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap call_for_each_replica or experimental_run or experimental_run_v2 inside a tf.function to get the best performance.
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap call_for_each_replica or experimental_run or experimental_run_v2 inside a tf.function to get the best performance.
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap call_for_each_replica or experimental_run or experimental_run_v2 inside a tf.function to get the best performance.
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap call_for_each_replica or experimental_run or experimental_run_v2 inside a tf.function to get the best performance.

weights can not be broadcast to values. values.rank=3. weights.rank=1. values.shape=(2, 448, 448). weights.shape=(2,).

I got an error while training.

Epoch 1/500
2019-11-20 14:47:08.423134: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-20 14:47:17.904216: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Not found: ./bin/ptxas not found
Relying on driver to perform ptx compilation. This message will be only logged once.
2019-11-20 14:47:24.653919: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
weights can not be broadcast to values. values.rank=3. weights.rank=1. values.shape=(2, 448, 448). weights.shape=(2,).
Traceback (most recent call last):
File "/home/zhouj0d/.conda/envs/mp/bin/mp", line 11, in
load_entry_point('MultiPlanarUNet==0.2.3', 'console_scripts', 'mp')()
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/bin/mp.py", line 55, in entry_func
mod.entry_func(parsed.args)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/bin/train.py", line 398, in entry_func
raise e
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/bin/train.py", line 394, in entry_func
run(project_dir=project_dir, gpu_mon=gpu_mon, logger=logger, args=args)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/bin/train.py", line 358, in run
hparams=hparams, no_im=args.no_images, **hparams["fit"])
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/train/trainer.py", line 111, in fit
raise e
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/train/trainer.py", line 96, in fit
val_ignore_class_zero)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/train/trainer.py", line 204, in _fit_loop
self.model.fit_generator(**fit_kwargs)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 1297, in fit_generator
steps_name='steps_per_epoch')
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_generator.py", line 265, in model_iteration
batch_outs = batch_function(*batch_data)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 973, in train_on_batch
class_weight=class_weight, reset_metrics=reset_metrics)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 264, in train_on_batch
output_loss_metrics=model._output_loss_metrics)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 311, in train_on_batch
output_loss_metrics=output_loss_metrics))
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 252, in _process_single_batch
training=training))
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 170, in _model_loss
reduction=losses_utils.ReductionV2.NONE)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/losses_utils.py", line 107, in compute_weighted_loss
losses, sample_weight)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/ops/losses/util.py", line 148, in scale_losses_by_sample_weight
sample_weight = weights_broadcast_ops.broadcast_weights(sample_weight, losses)
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/ops/weights_broadcast_ops.py", line 167, in broadcast_weights
with ops.control_dependencies((assert_broadcastable(weights, values),)):
File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/ops/weights_broadcast_ops.py", line 103, in assert_broadcastable
weights_rank_static, values.shape, weights.shape))
ValueError: weights can not be broadcast to values. values.rank=3. weights.rank=1. values.shape=(2, 448, 448). weights.shape=(2,).

Error when trying to train

Hello,

I keep getting this error when trying to train in Colab. I followed the steps in the README file. I am using the following commands:

!mp init_project --name my_project --data_dir Task04_Hippocampus
!mp train --project_dir=my_project --num_GPUs=2   # Any number of GPUs (or 0)

Output

--------------------------------------------------------------------------------
>>> Logged by: 'entry_func' in 'train.py'
Fitting model in path:
/content/my_project
--------------------------------------------------------------------------------
>>> Logged by: '__init__' in 'hparams.py'
YAML path:    /content/my_project/train_hparams.yaml
--------------------------------------------------------------------------------
>>> Logged by: 'log_version' in 'version_controller.py'
mpunet version: 0.2.11 (master, a46779c)
--------------------------------------------------------------------------------
>>> Logged by: 'set_value' in 'hparams.py'
Setting value '0.2.11' (type <class 'str'>) in subdir 'None' with name '__VERSION__'
Setting value 'master' (type <class 'str'>) in subdir 'None' with name '__BRANCH__'
Setting value 'a46779c' (type <class 'str'>) in subdir 'None' with name '__COMMIT__'
--------------------------------------------------------------------------------
>>> Logged by: 'save_current' in 'hparams.py'
Saving current YAML configuration to file:
 /content/my_project/train_hparams.yaml
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py", line 568, in _build_master
    ws.require(__requires__)
  File "/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py", line 886, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py", line 777, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (numpy 1.19.5 (/usr/local/lib/python3.7/dist-packages), Requirement.parse('numpy<1.19.0,>=1.16.0'), {'tensorflow-gpu'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/mp", line 33, in <module>
    sys.exit(load_entry_point('mpunet', 'console_scripts', 'mp')())
  File "/content/MultiPlanarUNet/mpunet/bin/mp.py", line 55, in entry_func
    mod.entry_func(parsed.args)
  File "/content/MultiPlanarUNet/mpunet/bin/train.py", line 416, in entry_func
    raise e
  File "/content/MultiPlanarUNet/mpunet/bin/train.py", line 412, in entry_func
    run(project_dir=project_dir, gpu_mon=gpu_mon, logger=logger, args=args)
  File "/content/MultiPlanarUNet/mpunet/bin/train.py", line 344, in run
    args=args)
  File "/content/MultiPlanarUNet/mpunet/bin/train.py", line 250, in get_data_sequences
    from mpunet.preprocessing import get_preprocessing_func
  File "/content/MultiPlanarUNet/mpunet/preprocessing/__init__.py", line 1, in <module>
    from .scaling import apply_scaling, get_scaler
  File "/content/MultiPlanarUNet/mpunet/preprocessing/scaling.py", line 1, in <module>
    import sklearn.preprocessing as preprocessing
  File "/usr/local/lib/python3.7/dist-packages/sklearn/__init__.py", line 82, in <module>
    from .base import clone
  File "/usr/local/lib/python3.7/dist-packages/sklearn/base.py", line 17, in <module>
    from .utils import _IS_32BIT
  File "/usr/local/lib/python3.7/dist-packages/sklearn/utils/__init__.py", line 23, in <module>
    from .class_weight import compute_class_weight, compute_sample_weight
  File "/usr/local/lib/python3.7/dist-packages/sklearn/utils/class_weight.py", line 7, in <module>
    from .validation import _deprecate_positional_args
  File "/usr/local/lib/python3.7/dist-packages/sklearn/utils/validation.py", line 26, in <module>
    from .fixes import _object_dtype_isnan, parse_version
  File "/usr/local/lib/python3.7/dist-packages/sklearn/utils/fixes.py", line 28, in <module>
    from pkg_resources import parse_version  # type: ignore
  File "/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py", line 3242, in <module>
    @_call_aside
  File "/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py", line 3226, in _call_aside
    f(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py", line 3255, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py", line 570, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py", line 583, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py", line 777, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (numpy 1.19.5 (/usr/local/lib/python3.7/dist-packages), Requirement.parse('numpy<1.19.0,>=1.16.0'), {'tensorflow-gpu'})

my system restarts automatically whenever I start training

Hello, I'm using Task06_Lung from MSD 2018 for training, and my GPUs are two NVIDIA RTX 3090s.
I don't know why, but my system restarts automatically whenever I start training.
Do you know the reason for this?
Is it because of insufficient GPU memory?
Or is it due to the size of the images being too large?

Getting ValueError in new environment

Hey Perslev

I'm setting up in a new environment with RTX Titans and seem to run into a ValueError

#mp train --num_GPUs=1

ValueError: weights can not be broadcast to values. values.rank=3. weights.rank=1. values.shape=(16, 512, 512). weights.shape=(16,).

Train_hparams.yaml
build: &BUILD

Hyperparameters passed to the Model.build and init methods

model_class_name: "UNet"
n_classes: 2
n_channels: 1
dim: 512
complexity_factor: 2
out_activation: "softmax"
l1_reg: False
l2_reg: False
biased_output_layer: True
depth: 4

fit: &FIT

Hyperparameters passed to the Trainer object

Views
views: 6
noise_sd: 0.1
real_space_span: 512.0
intrp_style: 'iso_live'

Use class weights?
class_weights: False

On-the-fly augmentation?
Leave empty or delete entirely if not
augmenters: [
{cls_name: "Elastic2D",
kwargs: {alpha: [0, 450], sigma: [20, 30], apply_prob: 0.333}}
]

Loss function
loss: "sparse_categorical_crossentropy"
metrics: []

Optimization
batch_size: 16
n_epochs: 500
verbose: true
shuffle_batch_order: true
optimizer: "Adam"
optimizer_kwargs: {lr: 5.0e-05, decay: 0.0, beta_1: 0.9, beta_2: 0.999, epsilon: 1.0e-8}

Minimum fraction of image slices with FG labels in each mini-batch
fg_batch_fraction: 0.50
bg_class: 0 # background class label
bg_value: 1pct

Normalization, using sklearn.preprocessing scalers
NOTE: Applied across full image volumes (after interpolation)
Options: MinMaxScaler, StandardScaler, MaxAbsScaler,
RobustScaler, QuantileTransformer, Null
scaler: "RobustScaler"

Callbacks
callbacks: [*RLOP, *TB, *MCP_CLEAN, *ES, *TIMER, *CSV]

VERSION: 0.2.2
BRANCH: master
COMMIT: 66d5075

Thanks in advance
Cheers!

Memory Error and Lack of GPU Usage

I'm attempting to train a Multi-planar U-Net with MPUNet 0.2.2 on an Ubuntu 18.04.3 LTS virtual machine with an Nvidia Tesla K80 GPU. Executing the command "mp train --force_GPU=true" produces a run through the training data:


Logged by: 'prepare_for_iso_live' in 'image_pair.py'
OBS: Using 1 percentile BG value of [32768.0, 32768.0, 32768.0, -5.138273081684019e-06, 0.0]
1012_NSK47177
--- shape: [108 162 100 5]
--- real shape: [108. 162. 100.]
--- pixdim: [1. 1. 1.]


Logged by: 'prepare_for_iso_live' in 'image_pair.py' ...

But it leads to the following error:

Traceback (most recent call last):
  File "/usr/local/bin/mp", line 11, in <module>
    load_entry_point('MultiPlanarUNet==0.2.2', 'console_scripts', 'mp')()
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/mp.py", line 55, in entry_func
    mod.entry_func(parsed.args)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/train.py", line 398, in entry_func
    raise e
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/train.py", line 394, in entry_func
    run(project_dir=project_dir, gpu_mon=gpu_mon, logger=logger, args=args)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/train.py", line 331, in run
    args=args)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/bin/train.py", line 238, in get_data_sequences
    base_path=project_dir)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/preprocessing/data_preparation_funcs.py", line 176, in prepare_for_multi_view_unet
    is_validation=True, **hparams["fit"])
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/image/image_pair_loader.py", line 432, in get_sequencer
    self.prepare_for_iso_live_views(**kwargs)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/image/image_pair_loader.py", line 381, in prepare_for_iso_live_views
    image.prepare_for_iso_live(bg_value, bg_class, scaler)
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/image/image_pair.py", line 271, in prepare_for_iso_live
    bg_value = [np.percentile(self.image[..., i], bg_pct) for i in range(self.n_channels)]
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/image/image_pair.py", line 271, in <listcomp>
    bg_value = [np.percentile(self.image[..., i], bg_pct) for i in range(self.n_channels)]
  File "/usr/local/lib/python3.6/dist-packages/MultiPlanarUNet-0.2.2-py3.6.egg/MultiPlanarUNet/image/image_pair.py", line 147, in image
    dtype=self.im_dtype)
  File "/usr/local/lib/python3.6/dist-packages/nibabel-2.5.1-py3.6.egg/nibabel/dataobj_images.py", line 348, in get_fdata
    data = np.asanyarray(self._dataobj).astype(dtype, copy=False)
  File "/home/akshay/.local/lib/python3.6/site-packages/numpy/core/_asarray.py", line 138, in asanyarray
    return array(a, dtype, copy=False, order=order, subok=True)
  File "/usr/local/lib/python3.6/dist-packages/nibabel-2.5.1-py3.6.egg/nibabel/arrayproxy.py", line 355, in __array__
    raw_data = self.get_unscaled()
  File "/usr/local/lib/python3.6/dist-packages/nibabel-2.5.1-py3.6.egg/nibabel/arrayproxy.py", line 350, in get_unscaled
    mmap=self._mmap)
  File "/usr/local/lib/python3.6/dist-packages/nibabel-2.5.1-py3.6.egg/nibabel/volumeutils.py", line 524, in array_from_file
    n_read = infile.readinto(data_bytes)
  File "/usr/lib/python3.6/gzip.py", line 276, in read
    return self._buffer.read(size)
MemoryError

What could be causing this? Moreover, running "nvidia-smi" shows no active GPU processes. Are there any other ways of checking whether MultiPlanarUNet is interacting with GPUs?

nan weights, nan validation losses

I am getting nan weights (and losses) during training.

OBS: Estimating class counts from 10 images
/opt/conda/lib/python3.7/site-packages/mpunet/utils/utils.py:237: RuntimeWarning: divide by zero encountered in log
  bias = np.log(freq * np.sum(np.exp(freq)))
/opt/conda/lib/python3.7/site-packages/mpunet/utils/utils.py:238: RuntimeWarning: invalid value encountered in true_divide
  bias /= np.linalg.norm(bias)
Setting bias weights on output layer to:
[ 0. -0. -0. -0. -0. -0. -0. -0. -0. -0.  0. -0. nan -0. -0. -0. -0.]

What could be wrong?

I have 17 classes (including background=0), but only 16 are guessed by mp init_project + mp train, so I edited the yaml file to use 17.
The views look reasonable, but the prediction always looks like the grayscale image.
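For context, a minimal sketch of how a zero estimated class frequency produces exactly these warnings and the nan bias ('freq' is an illustrative per-class frequency vector, mirroring the expression from mpunet/utils/utils.py quoted above):

import numpy as np

freq = np.array([0.5, 0.5, 0.0])              # one class never observed
bias = np.log(freq * np.sum(np.exp(freq)))    # log(0) -> -inf (divide-by-zero warning)
bias /= np.linalg.norm(bias)                  # norm is inf -> nan (invalid-value warning)
print(bias)                                   # e.g. [ 0.  0. nan]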

dataset folder structure

With the current instructions for preparing the data, I received an error indicating missing components.

  File "/home/by2026/.local/lib/python3.8/site-packages/mpunet/image/image_pair_loader.py", line 228, in _get_paths_from_list_file
    raise OSError("File '%s' does not exist. Did you specify "
OSError: File '/scratch/by2026/OAI/MultiPlanarUNet/data/train/images/LIST_OF_FILES.txt' does not exist. Did you specify the correct img_subdir?

It happens when there is an issue with images = sorted(glob.glob(str(self.images_path / "*.nii*"))) at line 239, likely due to a filename format problem. The LIST_OF_FILES.txt fallback should be noted in the structure description for this situation. Thanks!
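A quick sketch for checking whether your filenames actually match the pattern the loader globs for (paths are illustrative):

import glob

images = sorted(glob.glob("data/train/images/*.nii*"))
if not images:
    print("No .nii/.nii.gz files found - check file extensions and paths.")
else:
    print("Found %d files, e.g. %s" % (len(images), images[0]))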

AttributeError: 'UNet' object has no attribute 'loss_functions'

Epoch 1/500
417/417 [==============================] - ETA: 0s - loss: 0.1224 - sparse_categorical_accuracy: 0.9805
'UNet' object has no attribute 'loss_functions'
Traceback (most recent call last):
File "/usr/local/bin/mp", line 8, in
sys.exit(entry_func())
File "/usr/local/lib/python3.6/dist-packages/mpunet/bin/mp.py", line 55, in entry_func
mod.entry_func(parsed.args)
File "/usr/local/lib/python3.6/dist-packages/mpunet/bin/train.py", line 390, in entry_func
raise e
File "/usr/local/lib/python3.6/dist-packages/mpunet/bin/train.py", line 386, in entry_func
run(project_dir=project_dir, gpu_mon=gpu_mon, logger=logger, args=args)
File "/usr/local/lib/python3.6/dist-packages/mpunet/bin/train.py", line 349, in run
hparams=hparams, no_im=args.no_images, **hparams["fit"])
File "/usr/local/lib/python3.6/dist-packages/mpunet/train/trainer.py", line 161, in fit
raise e
File "/usr/local/lib/python3.6/dist-packages/mpunet/train/trainer.py", line 147, in fit
**fit_kwargs)
File "/usr/local/lib/python3.6/dist-packages/mpunet/train/trainer.py", line 257, in _fit
verbose=verbose
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 66, in _method_wrapper
return method(self, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 813, in fit
callbacks.on_epoch_end(epoch, epoch_logs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/callbacks.py", line 365, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/usr/local/lib/python3.6/dist-packages/mpunet/callbacks/validation.py", line 263, in on_epoch_end
class_wise_metrics, mean_batch_wise_metrics = self.evalaute()
File "/usr/local/lib/python3.6/dist-packages/mpunet/callbacks/validation.py", line 145, in evalaute
metrics = self.model.loss_functions + self.model.metrics
AttributeError: 'UNet' object has no attribute 'loss_functions'

Don't understand the positional argument 'dim'

I got an error while training (i.e. mp train --num_GPUs=1) about a missing positional argument: 'dim'.

Traceback (most recent call last):
File "/home/poornesh/.conda/envs/tensor14/bin/mp", line 11, in
load_entry_point('MultiPlanarUNet', 'console_scripts', 'mp')()
File "/home/poornesh/my-projects/MultiPlanarUNet/MultiPlanarUNet/bin/mp.py", line 55, in entry_func
mod.entry_func(parsed.args)
File "/home/poornesh/my-projects/MultiPlanarUNet/MultiPlanarUNet/bin/train.py", line 339, in entry_func
raise e
File "/home/poornesh/my-projects/MultiPlanarUNet/MultiPlanarUNet/bin/train.py", line 335, in entry_func
run(project_dir=project_dir, gpu_mon=gpu_mon, logger=logger, args=args)
File "/home/poornesh/my-projects/MultiPlanarUNet/MultiPlanarUNet/bin/train.py", line 264, in run
args=args)
File "/home/poornesh/my-projects/MultiPlanarUNet/MultiPlanarUNet/bin/train.py", line 171, in get_data_sequences
base_path=project_dir)
File "/home/poornesh/my-projects/MultiPlanarUNet/MultiPlanarUNet/preprocessing/data_preparation_funcs.py", line 194, in prepare_for_3d_unet
**hparams["fit"])
File "/home/poornesh/my-projects/MultiPlanarUNet/MultiPlanarUNet/image/image_pair_loader.py", line 470, in get_sequencer
**kwargs)
File "/home/poornesh/my-projects/MultiPlanarUNet/MultiPlanarUNet/sequences/isotrophic_live_view_sequence_3d.py", line 10, in init
super().init(image_pair_loader, **kwargs)
TypeError: init() missing 1 required positional argument: 'dim'
