nvidia / clara-train-examples

Example notebooks demonstrating how to use Clara Train to build Medical Imaging Deep Learning models

License: Apache License 2.0

Jupyter Notebook 15.98% Python 5.19% Shell 2.79% Dockerfile 0.17% HTML 73.36% JavaScript 2.41% Stylus 0.11%

tcia-dac deep-learning python3 pytorch medical-imaging-computing medical-imaging-processing healthcare-imaging clara-train automl

clara-train-examples's Introduction

Clara Train Examples

Overview of Clara Train

Clara Train SDK is a domain-optimized developer application framework that includes APIs for AI-Assisted Annotation, which can make any medical viewer AI-capable. Version 4.1 enables a MONAI-based training framework with pre-trained models, so you can start AI development with techniques such as Transfer Learning, Federated Learning, and AutoML.

Clara Train has upgraded its underlying infrastructure from TensorFlow to MONAI. MONAI is an open-source, PyTorch-based framework that provides domain-optimized foundational capabilities for healthcare.

This repo contains Jupyter Notebooks to help you explore the features and capabilities of Clara Train, including AI-Assisted Annotation, AutoML, and Federated Learning.

How to navigate this repository

PyTorch - Clara Train 4.1

If you're using Clara Train 4.1, you'll want to use the PyTorch folder structure. You'll find the README.md and Welcome.ipynb files in the PyTorch/Notebooks directory that will help you get started.

Tensorflow-Deprecated - Clara Train 3.1

If you're still using Clara Train 3.1, we encourage you to upgrade to Clara Train 4.1. You can find information on converting your current Clara 3.1 MMARs to Clara 4.0 compatible MMARs in our docs.

If you're still interested in exploring Clara Train 3.1 using our old Jupyter Notebooks, you'll now find them under the Tensorflow-Deprecated folder. You'll find all of the instructions in the README.md file.

clara-train-examples's Issues

load NGC models in native tensorflow

Hi,

I need to load the NGC DenseNet121 pre-trained model into native TF.
link to the model: https://ngc.nvidia.com/catalog/models/nvidia:med:clara_xray_classification_chest_amp

Loading the checkpoints from the 'models' directory into a DenseNet121 architecture with model.load_weights('path') throws a "some objects had attributes which were not restored" error.

TF version 2.1.
Please suggest the correct approach.

MMAR_DP/commands/ directory

Can anyone explain where the "MMAR_DP/commands/" content comes from? I am getting this error:

chmod: cannot access '/home/dlee/applications/clara/MMAR_DP/commands/*': No such file or directory

The commands directory has no content.
Thank you!

Broken Link

https://github.com/NVIDIA/clara-train-examples/blob/master/NoteBooks/FL/Provisioning.ipynb
"Running this notebook from within clara docker following setup in readMe.md"

The "readMe.md" is a broken link which 404s.

OHIFNotebook broken link

The link to the details for setting up the OHIF viewer is broken (404).

ModuleNotFoundError: No module named 'medl'

Anytime I try doing something related to training, I get this error:

Error while finding module specification for 'medl.apps.train' (ModuleNotFoundError: No module named 'medl')

I tried searching everywhere, but there is no python module for medl.

Please help!!!

startClaraTrainNoteBooks.sh cannot find the driver

I am trying to run the Clara Train example, but when I execute startClaraTrainNoteBooks.sh, the container cannot find the NVIDIA driver.
I know that the script runs docker-compose.yml, so I tested whether docker-compose can find the NVIDIA driver:

services:
  test:
    image: nvidia/cuda:10.2-base
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            capabilities: [gpu]
            device_ids: ['0']

Output:

USER@test:~$ docker-compose up
WARNING: Found orphan containers (hp_nvsmi_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Starting hp_test_1 ... done
Attaching to hp_test_1
test_1  | Mon Jun  7 09:01:44 2021
test_1  | +-----------------------------------------------------------------------------+
test_1  | | NVIDIA-SMI 460.27.04    Driver Version: 460.27.04    CUDA Version: 11.2     |
test_1  | |-------------------------------+----------------------+----------------------+
test_1  | | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
test_1  | | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
test_1  | |                               |                      |               MIG M. |
test_1  | |===============================+======================+======================|
test_1  | |   0  GeForce RTX 206...  Off  | 00000000:01:00.0 Off |                  N/A |
test_1  | |  0%   34C    P8    17W / 215W |    100MiB /  7979MiB |      0%      Default |
test_1  | |                               |                      |                  N/A |
test_1  | +-------------------------------+----------------------+----------------------+
test_1  |
test_1  | +-----------------------------------------------------------------------------+
test_1  | | Processes:                                                                  |
test_1  | |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
test_1  | |        ID   ID                                                   Usage      |
test_1  | |=============================================================================|
test_1  | +-----------------------------------------------------------------------------+
hp_test_1 exited with code 0

But the container started by startClaraTrainNoteBooks.sh cannot find it:

root@claratrain:/claraDevDay# nvidia-smi 
root@claratrain:/claraDevDay#

The container started by startDocker.sh, on the other hand, can find the driver:

root@c7c2d5597eb8:/claraDevDay# nvidia-smi 
Mon Jun  7 09:11:43 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04    Driver Version: 460.27.04    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 206...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   35C    P8    17W / 215W |    100MiB /  7979MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
root@c7c2d5597eb8:/claraDevDay#

What should I do?

Having trouble running the AIAA-OHIF integration with Clara Train v4.1. Is it possible?

Hi!

I'm following the tutorial in clara-train-examples and managed to get it working with OHIF and Clara Train v4.0.
My objective now is to make it work with a custom TensorFlow algorithm, which should be possible as per this forum topic. According to the Clara Train SDK documentation, I should approach this from v4.1, since from this version on custom models must follow the MONAI Label App template.
My problem arises when using this version of the SDK as the backend for the OHIF-AIAA integration in clara-train-examples, even with the models already available in the NGC registry. I load the model clara_pt_spleen_ct_segmentation, click the "Run Segmentation" button, and get the following events in OHIF's frontend:

Please wait while creating new AIAA Session! -> AIAA Data preparation complete -> Failed to Create AIAA Session... Reason: undefined

I tried to debug the issue using Chrome's Developer Tools and found that OHIF receives a "422 Unprocessable Entity" when sending the following call to the SDK: curl -X 'PUT' 'http://MY_SERVER:5000/session/?expiry=0&save_as=.nii'. I checked whether the same call is also sent when using the v4.0 SDK, and it appears to be exactly the same. In my opinion, the OHIF-AIAA integration cannot process the API response, which changes between versions. Here are the two responses I get when calling that endpoint in the SDK myself:

v4.0: {"session_id": "dacc097c-e266-11ec-a4aa-0242ac160003"}
v4.1: {"session_id":"5558a0de-e266-11ec-93c5-0242ac150003","session_info":{"name":"5558a0de-e266-11ec-93c5-0242ac150003","path":"/root/.cache/monailabel/sessions/5558a0de-e266-11ec-93c5-0242ac150003","image":"/root/.cache/monailabel/sessions/5558a0de-e266-11ec-93c5-0242ac150003/my_dicom.dcm","meta":{},"create_ts":1654168968,"last_access_ts":1654168968,"expiry":3600}
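To illustrate the difference between the two response shapes, here is a minimal Python sketch of a client-side parser that reads only the shared session_id field and ignores anything else. The function name extract_session_id is hypothetical, the v4.1 sample is abbreviated to a single session_info field for the example, and nothing here is taken from the actual OHIF-AIAA extension code.

```python
import json


def extract_session_id(response_text: str) -> str:
    """Return the session_id field, ignoring any extra fields in the payload."""
    payload = json.loads(response_text)
    return payload["session_id"]


# Response shape reported for v4.0.
v40 = '{"session_id": "dacc097c-e266-11ec-a4aa-0242ac160003"}'

# Abbreviated stand-in for the v4.1 shape: same key plus a session_info object.
v41 = (
    '{"session_id": "5558a0de-e266-11ec-93c5-0242ac150003", '
    '"session_info": {"expiry": 3600}}'
)

for label, body in (("v4.0", v40), ("v4.1", v41)):
    print(label, extract_session_id(body))
```

A parser written this way accepts both shapes, so the extra session_info object alone would not have to break a client. Note also that the 422 status is returned by the server for the request itself, so the request parameters may be worth checking too.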

Could you please confirm whether the problem does indeed come from a version mismatch between the OHIF-AIAA extension and Clara Train SDK v4.1?
If so, do you have any plans to update the extension so it works with the newest SDK versions?

Thanks for your time,

Javier

Classification task issue

I am facing issues when executing a classification task. Is there a sample Jupyter notebook that provides steps for solving a classification problem with Clara Train (for example, the changes to make to config files and script files), like those provided for segmentation problems in this repo?

AttributeError: 'NoneType' object has no attribute 'load_model'

Hello,

I am trying to implement a task with my own model and dataset, and I implemented it following the example provided in the quickstart (https://nvidia.github.io/NVFlare/examples/hello_pt.html).

Here is the error:
Traceback (most recent call last):
File "<nvflare-0.1.4>/nvflare/private/fed/server/sai.py", line 411, in start_server_training
File "<nvflare-0.1.4>/nvflare/private/fed/server/fed_server.py", line 793, in start
File "<nvflare-0.1.4>/nvflare/private/fed/server/server_model_manager.py", line 113, in initialize
AttributeError: 'NoneType' object has no attribute 'load_model'

Does anyone know how to solve it?
Thanks!

ValueError: The extension "jupyterlab-nvdashboard" does not yet support the current version of JupyterLab.

I just ran the startClaraTrainNoteBooks.sh script and got an error:

ValueError: The extension "jupyterlab-nvdashboard" does not yet support the current version of JupyterLab.


Conflicting Dependencies:
JupyterLab              Extension        Package
>=3.0.6 <3.1.0          >=2.0.0 <3.0.0   @jupyterlab/application
>=3.0.5 <3.1.0          >=2.0.0 <3.0.0   @jupyterlab/apputils
>=5.0.3 <5.1.0          >=4.0.0 <5.0.0   @jupyterlab/coreutils
>=17.0.1 <18.0.0        >=16.4.2 <17.0.0 react
>=17.0.1 <18.0.0        >=16.9.0 <17.0.0 react-dom
See the log file for details:  /tmp/jupyterlab-debug-53dt6_x5.log

Note that I have uncommented the GPU monitoring tools in the docker-compose file.
Do you have any suggestions?
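The "Conflicting Dependencies" table above can be read as pairs of version ranges that do not intersect. The following Python sketch checks that for ranges written in the ">=A <B" form shown in the table. The helper names parse_range and ranges_overlap are hypothetical, and this is not how JupyterLab itself resolves extension versions; it only makes the comparison explicit.

```python
def parse_range(spec: str):
    """Parse a '>=A <B' range into (lower, upper) tuples of integers."""
    lower, upper = spec.split()
    if not (lower.startswith(">=") and upper.startswith("<")):
        raise ValueError(f"unsupported range format: {spec!r}")

    def to_tuple(version: str):
        return tuple(int(part) for part in version.split("."))

    return to_tuple(lower[2:]), to_tuple(upper[1:])


def ranges_overlap(a: str, b: str) -> bool:
    """Return True if the half-open ranges [lower, upper) share any version."""
    lo_a, hi_a = parse_range(a)
    lo_b, hi_b = parse_range(b)
    return max(lo_a, lo_b) < min(hi_a, hi_b)


# Rows copied from the table: range provided by JupyterLab vs. range the extension needs.
rows = {
    "@jupyterlab/application": (">=3.0.6 <3.1.0", ">=2.0.0 <3.0.0"),
    "react": (">=17.0.1 <18.0.0", ">=16.4.2 <17.0.0"),
}
for package, (provided, required) in rows.items():
    print(package, "compatible:", ranges_overlap(provided, required))
```

Every row in the table fails this check, which matches the message: the installed extension release targets JupyterLab 2.x packages while the image ships JupyterLab 3.x.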

How to run the Clara FL server in a realistic situation?

I have tried to use my Azure FQDN as CN.

Commands I ran:

cd /path/to/server/startup/
./docker.sh
(into container)
cd ./startup/
./start.sh

Error message:

E0712 06:29:11.405616288 388 server_secure_chttp2.cc:81] {"created":"@1626071351.405585088","description":"Nos added out of total 1 resolved","file":"src/core/ext/transport/chttp2/server/chttp2_server.cc","file_line":561,"referenced_errors":[{"created":"@1626005572788","description":"Unable to configure socket","fd":8,"file":"src/core/lib/iomgr/tcp_server_utils_posix_common.cc","file_line":214,"referenced_er{"created":"@1626071351.405567988","description":"Cannot assign requested address","errno":99,"file":"src/core/lib/iomgr/tcp_server_utils_posix_common.le_line":188,"os_error":"Cannot assign requested address","syscall":"bind"}]}]}
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "<nvflare-0.1.4>/nvflare/private/fed/app/server/server_train.py", line 212, in
File "<nvflare-0.1.4>/nvflare/private/fed/app/server/server_train.py", line 101, in main
File "<nvflare-0.1.4>/nvflare/private/fed/app/trainers/server_trainer.py", line 68, in deploy
File "<nvflare-0.1.4>/nvflare/private/fed/server/fed_server.py", line 185, in deploy
File "/opt/conda/lib/python3.8/site-packages/grpc/_server.py", line 965, in add_secure_port
return _common.validate_port_binding_result(
File "/opt/conda/lib/python3.8/site-packages/grpc/_common.py", line 166, in validate_port_binding_result
raise RuntimeError(_ERROR_MESSAGE_PORT_BINDING_FAILED % address)
RuntimeError: Failed to bind to address [Azure FQDN]:8002; set GRPC_VERBOSITY=debug environment variable to see detailed error message.
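The inner error in the log is "Cannot assign requested address" (errno 99) from the bind call, which the operating system returns when the address is not assigned to any local network interface. One possible explanation, which is an assumption and not confirmed in this thread, is that the FQDN resolves to a public IP that is not configured on an interface inside the container. The Python sketch below is a small diagnostic for that case; can_bind is a hypothetical helper and is not part of NVFlare or gRPC.

```python
import socket


def can_bind(host: str, port: int = 0) -> bool:
    """Return True if a TCP socket can be bound to (host, port) on this machine.

    Port 0 lets the operating system pick any free port, so only the
    address part is being tested.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))
        return True
    except OSError:
        # Covers name resolution failures and "Cannot assign requested address".
        return False


# Loopback and the wildcard address are normally bindable.
for candidate in ("127.0.0.1", "0.0.0.0"):
    print(candidate, can_bind(candidate))
```

Running can_bind with the server's FQDN inside the container shows whether the name maps to a locally bindable address there. If it returns False, the bind address and the name that clients connect to would need to be handled separately.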

Wrong path config file for BYOC.ipynb Pytorch

I found an error when running BYOC.ipynb in section 2.1 BYO Transformation: Adding random noise to image pixels

/claraDevDay/GettingStarted//custom/trn_BYOC_transform.json: No such file or directory

I realized that trn_BYOC_transform.json is placed in the config folder, not the custom folder.

NVDashboard build not working

I was interested in the NVDashboard functionality, so I commented out

image: nvcr.io/nvidia/clara-train-sdk:v4.0

in the docker-compose file and then uncommented

build:
   context: ./dockerWGPUDashboardPlugin/    # Project root
   dockerfile: ./Dockerfile                 # Relative to context
image: clara-train-nvdashboard:v4.0

However, when I run startClaraTrainNoteBooks.sh, I get a build error from Docker:

An error occurred.
ValueError: The extension "jupyterlab-nvdashboard" does not yet support the current version of JupyterLab.
Conflicting Dependencies:
JupyterLab              Extension        Package
>=3.1.8 <3.2.0          >=2.0.0 <3.0.0   @jupyterlab/application
>=3.1.8 <3.2.0          >=2.0.0 <3.0.0   @jupyterlab/apputils
>=5.1.8 <5.2.0          >=4.0.0 <5.0.0   @jupyterlab/coreutils
>=17.0.1 <18.0.0        >=16.4.2 <17.0.0 react
>=17.0.1 <18.0.0        >=16.9.0 <17.0.0 react-dom
The command '/bin/sh -c jupyter labextension install jupyterlab-nvdashboard' returned a non-zero code: 1
ERROR: Service 'claratrain' failed to build : Build failed

How can I fix this or is nvdashboard ancient tech?

AIAA server path error in 3D Slicer

My environment:
Server: Ubuntu 18.04 (AIAA)
Client: Windows 10 (3D Slicer)

In my case, the NVIDIA AIAA server address is <server_url:port>, not <server_url:port>/v1/models or <server_url:port>/v1.

Where is the AIAA binary needed to run "AIAA -h" in AIAA.ipynb?

I was following https://github.com/NVIDIA/clara-train-examples/blob/master/PyTorch/NoteBooks/AIAA/AIAA.ipynb
Where is the AIAA binary that must be installed to be able to run "AIAA -h"?

models folder not created

Hi there,

I just tried to get the GettingStarted notebook running, but I kept getting the following error when running the train_W_Config.sh file:

mkdir: cannot create directory ‘/home/lisa/projects/clara-train-examples/PyTorch/NoteBooks/GettingStarted/commands/../models/config_train_Unet’: No such file or director

I believe this occurs because the models folder does not exist yet when the script tries to create the config_train_Unet folder inside it.
It worked after I manually added the models folder in the GettingStarted directory.
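The behaviour described above can be reproduced with a short Python sketch: creating a nested directory fails when its parent is missing, and succeeds when missing parents are created as well. The helper ensure_output_dir and the temporary base directory are hypothetical stand-ins for the GettingStarted folder; the actual train_W_Config.sh script is a shell script and is not shown here.

```python
import tempfile
from pathlib import Path


def ensure_output_dir(base, name: str) -> Path:
    """Create <base>/models/<name>, including the models folder if it is missing."""
    out_dir = Path(base) / "models" / name
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


base = tempfile.mkdtemp()  # stand-in for the GettingStarted directory

# Without parents=True the call fails because <base>/models does not exist yet.
try:
    (Path(base) / "models" / "config_train_Unet").mkdir()
    failed_without_parents = False
except FileNotFoundError:
    failed_without_parents = True

created = ensure_output_dir(base, "config_train_Unet")
print(failed_without_parents, created.is_dir())
```

In a shell script, the equivalent of parents=True is the -p flag of mkdir.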

Best,
Lisa

Unminified code for FLprovUI.html

Is there a repository with the unminified code for FLprovUI.html in clara-train-examples/NoteBooks/FL? We need to include the code from FLprovUI.html in a project that automates the provisioning process, so that we can add our custom code on top of it. Any help with this is appreciated.