
clip-driven-universal-model's Introduction

News

CLIP-Driven Universal Model

Paper

This repository provides the official implementation of Universal Model.

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection
${\color{red} {\textbf{Rank First in Medical Segmentation Decathlon (MSD) Competition}}}$ (see leaderboard)
Jie Liu1, Yixiao Zhang2, Jie-Neng Chen2, Junfei Xiao2, Yongyi Lu2,
Yixuan Yuan1, Alan Yuille2, Yucheng Tang3, Zongwei Zhou2
1 City University of Hong Kong, 2 Johns Hopkins University, 3 NVIDIA
ICCV, 2023
paper | code | slides | poster | talk | blog

Large Language-Image Model for Multi-Organ Segmentation and Cancer Detection from Computed Tomography
Jie Liu1, Yixiao Zhang2, Jie-Neng Chen2, Junfei Xiao2, Yongyi Lu2,
Yixuan Yuan1, Alan Yuille2, Yucheng Tang3, Zongwei Zhou2
1 City University of Hong Kong, 2 Johns Hopkins University, 3 NVIDIA
RSNA, 2023
abstract | code | slides

Model

Architecture | Params | Download
U-Net | 19.08M | link
Swin UNETR | 62.19M | link

Dataset

The post_label can be downloaded via link.

Direct Inference on Your Own CT Scans

  1. Put all your CT scans (files ending in .nii.gz) in one directory. For example, /home/data/ct/.
  2. Run the following commands.
conda create -n universalmodel python=3.7
conda activate universalmodel
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113 
## please modify according to the CUDA version in your server
pip install 'monai[all]'
pip install -r requirements.txt
cd pretrained_weights/
wget https://www.dropbox.com/s/jdsodw2vemsy8sz/swinunetr.pth
python pred_pseudo.py --data_root_path PATH_TO_IMG_DIR --result_save_path PATH_TO_result_DIR 
## For example: python pred_pseudo.py --data_root_path /home/data/ct/ --result_save_path /home/data/result

0. Preliminary

python3 -m venv universal
source /data/zzhou82/environments/universal/bin/activate

git clone https://github.com/ljwztc/CLIP-Driven-Universal-Model.git
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install 'monai[all]'
pip install -r requirements.txt
cd pretrained_weights/
wget https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/swin_unetr.base_5000ep_f48_lr2e-4_pretrained.pt
wget https://www.dropbox.com/s/lh5kuyjxwjsxjpl/Genesis_Chest_CT.pt
cd ../

Dataset Pre-Process

  1. Download the datasets via the dataset links and arrange them according to dataset/dataset_list/PAOT.txt.
  2. Modify ORGAN_DATASET_DIR and NUM_WORKER in label_transfer.py (see the excerpt below).
  3. python -W ignore label_transfer.py
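
For step 2, the edit is just two module-level variables near the top of label_transfer.py; the values below are placeholders for your own environment, not the repository defaults.

```python
# label_transfer.py (excerpt; values are placeholders for your setup)
ORGAN_DATASET_DIR = "/home/data/PublicAbdominalData/"  # root folder arranged as in PAOT.txt
NUM_WORKER = 8                                         # dataloader worker processes
```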

Current Template

Index | Organ | Index | Organ
1 | Spleen | 17 | Left Lung
2 | Right Kidney | 18 | Colon
3 | Left Kidney | 19 | Intestine
4 | Gall Bladder | 20 | Rectum
5 | Esophagus | 21 | Bladder
6 | Liver | 22 | Prostate
7 | Stomach | 23 | Left Head of Femur
8 | Aorta | 24 | Right Head of Femur
9 | Postcava | 25 | Celiac Trunk
10 | Portal Vein and Splenic Vein | 26 | Kidney Tumor
11 | Pancreas | 27 | Liver Tumor
12 | Right Adrenal Gland | 28 | Pancreas Tumor
13 | Left Adrenal Gland | 29 | Hepatic Vessel Tumor
14 | Duodenum | 30 | Lung Tumor
15 | Hepatic Vessel | 31 | Colon Tumor
16 | Right Lung | 32 | Kidney Cyst

How to extend to a new dataset with new organs?

  1. Assign the next available index to the new organ (e.g., 33 for vermiform appendix).
  2. Check whether any organs in the dataset are not split into left and right (e.g., kidney, lung). RL_Splitd in label_transfer.py is used to process this case.
  3. Set up a new transfer list for the dataset in TEMPLATE (line 58 in label_transfer.py). For example, if the new dataset labels Intestine as 1 and vermiform appendix as 2, set the transfer list to [19, 33]; see the sketch after this list.
  4. Run label_transfer.py to generate the new post-processed labels.
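
To make step 3 concrete, here is a minimal, purely illustrative sketch of what a transfer list does conceptually: it maps each dataset-specific label value to the universal template index. The function and array names are hypothetical; the real mapping logic lives in label_transfer.py.

```python
import numpy as np

# Hypothetical example: a new dataset labels Intestine as 1 and vermiform
# appendix as 2; the universal template uses indices 19 and 33 for them.
transfer_list = [19, 33]  # position i -> universal index for dataset label i+1

def to_universal(label_volume: np.ndarray, transfer_list) -> np.ndarray:
    """Remap dataset-specific label values to universal template indices."""
    universal = np.zeros_like(label_volume)
    for dataset_value, universal_index in enumerate(transfer_list, start=1):
        universal[label_volume == dataset_value] = universal_index
    return universal

# Toy 1x1x4 volume with labels 0 (background), 1 (Intestine), 2 (appendix).
toy = np.array([[[0, 1, 2, 1]]])
print(to_universal(toy, transfer_list))   # [[[ 0 19 33 19]]]
```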

For more details, please take a look at the common questions.

1. Training

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -W ignore -m torch.distributed.launch --nproc_per_node=8 --master_port=1234 train.py --dist True --data_root_path /mnt/zzhou82/PublicAbdominalData/ --num_workers 12 --num_samples 4 --cache_dataset --cache_rate 0.6 --uniform_sample

2. Validation

CUDA_VISIBLE_DEVICES=0 python -W ignore validation.py --data_root_path /mnt/zzhou82/PublicAbdominalData/ --start_epoch 10 --end_epoch 40 --epoch_interval 10 --cache_dataset --cache_rate 0.6

3. Evaluation

CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume ./out/epoch_61.pth --data_root_path /mnt/zzhou82/PublicAbdominalData/ --store_result --cache_dataset --cache_rate 0.6

Todo

  • Code release
  • Dataset link
  • Support different backbones (SwinUNETR, Unet, DiNTS, Unet++)
  • Model release
  • Pseudo label release
  • Tutorials for Inference

Acknowledgement

A lot of code is modified from . This work was supported by the Lustgarten Foundation for Pancreatic Cancer Research and partially by the Patrick J. McGovern Foundation Award. We appreciate the effort of the MONAI Team to provide open-source code for the community.

Citation

If you find this repository useful, please consider citing this paper:

@article{liu2023clip,
  title={CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection},
  author={Liu, Jie and Zhang, Yixiao and Chen, Jie-Neng and Xiao, Junfei and Lu, Yongyi and Landman, Bennett A and Yuan, Yixuan and Yuille, Alan and Tang, Yucheng and Zhou, Zongwei},
  journal={arXiv preprint arXiv:2301.00785},
  year={2023}
}

clip-driven-universal-model's People

Contributors

bingogome, ljwztc, mrgiovanni, tangy5, victorbutoi, zac2049


clip-driven-universal-model's Issues

The installation problems.

Hi, thanks so much for sharing such awesome work!
Unfortunately, I encountered some issues during the installation. Here are my problems:

  1. After performing the step “source /xxx/universal/bin/activate”, I checked the Python version; it is 3.11.5. Is this correct?
  2. Then I tried to run the step "pip install torch==1.11.0+cu113...", but it reports "ERROR: Could not find a version that satisfies the requirement...". So I went to the website, downloaded the torch-1.11.0+cu113-cp310-cp310-linux_x86_64.whl package, and installed it. However, that package targets Python 3.10. Will this cause a conflict?
  3. I can't install hyp5=3.6.0, only version 3.1.0. Will this version run successfully?
  4. After I finish the installation and run "python -W ignore label_transfer.py", it shows:
    (screenshot of the error omitted)
    I think maybe the PyTorch version is not correct?

Waiting for your reply! Thank you so much!

GPU memory fluctuations

Hi, thank you for this great work!
I met a problem while using SwinUNETR as the backbone:
the GPU memory usage suddenly increases and exceeds the VRAM of my GPU; during inference there are also significant fluctuations in GPU memory.
Could you give me a clue so I can solve this problem?

Inference Issue with unet.pth

I am facing this issue and haven't been able to figure out the reason. Please advise.

File "./clip/test.py", line 197, in main
model.load_state_dict(store_dict)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Universal_model:
Unexpected key(s) in state_dict: "backbone.down_tr64.ops.0.conv1.weight", "backbone.down_tr64.ops.0.conv1.bias", "backbone.down_tr64.ops.0.bn1.weight", "backbone.down_tr64.ops.0.bn1.bias", "backbone.down_tr64.ops.0.bn1.running_mean", "backbone.down_tr64.ops.0.bn1.running_var", "backbone.down_tr64.ops.0.bn1.num_batches_tracked", "backbone.down_tr64.ops.1.conv1.weight", "backbone.down_tr64.ops.1.conv1.bias", "backbone.down_tr64.ops.1.bn1.weight", "backbone.down_tr64.ops.1.bn1.bias", "backbone.down_tr64.ops.1.bn1.running_mean", "backbone.down_tr64.ops.1.bn1.running_var", "backbone.down_tr64.ops.1.bn1.num_batches_tracked", "backbone.down_tr128.ops.0.conv1.weight", "backbone.down_tr128.ops.0.conv1.bias", "backbone.down_tr128.ops.0.bn1.weight", "backbone.down_tr128.ops.0.bn1.bias", "backbone.down_tr128.ops.0.bn1.running_mean", "backbone.down_tr128.ops.0.bn1.running_var", "backbone.down_tr128.ops.0.bn1.num_batches_tracked", "backbone.down_tr128.ops.1.conv1.weight", "backbone.down_tr128.ops.1.conv1.bias", "backbone.down_tr128.ops.1.bn1.weight", "backbone.down_tr128.ops.1.bn1.bias", "backbone.down_tr128.ops.1.bn1.running_mean", "backbone.down_tr128.ops.1.bn1.running_var", "backbone.down_tr128.ops.1.bn1.num_batches_tracked", "backbone.down_tr256.ops.0.conv1.weight", "backbone.down_tr256.ops.0.conv1.bias", "backbone.down_tr256.ops.0.bn1.weight", "backbone.down_tr256.ops.0.bn1.bias", "backbone.down_tr256.ops.0.bn1.running_mean", "backbone.down_tr256.ops.0.bn1.running_var", "backbone.down_tr256.ops.0.bn1.num_batches_tracked", "backbone.down_tr256.ops.1.conv1.weight", "backbone.down_tr256.ops.1.conv1.bias", "backbone.down_tr256.ops.1.bn1.weight", "backbone.down_tr256.ops.1.bn1.bias", "backbone.down_tr256.ops.1.bn1.running_mean", "backbone.down_tr256.ops.1.bn1.running_var", "backbone.down_tr256.ops.1.bn1.num_batches_tracked", "backbone.down_tr512.ops.0.conv1.weight", "backbone.down_tr512.ops.0.conv1.bias", "backbone.down_tr512.ops.0.bn1.weight", "backbone.down_tr512.ops.0.bn1.bias", "backbone.down_tr512.ops.0.bn1.running_mean", "backbone.down_tr512.ops.0.bn1.running_var", "backbone.down_tr512.ops.0.bn1.num_batches_tracked", "backbone.down_tr512.ops.1.conv1.weight", "backbone.down_tr512.ops.1.conv1.bias", "backbone.down_tr512.ops.1.bn1.weight", "backbone.down_tr512.ops.1.bn1.bias", "backbone.down_tr512.ops.1.bn1.running_mean", "backbone.down_tr512.ops.1.bn1.running_var", "backbone.down_tr512.ops.1.bn1.num_batches_tracked", "backbone.up_tr256.up_conv.weight", "backbone.up_tr256.up_conv.bias", "backbone.up_tr256.ops.0.conv1.weight", "backbone.up_tr256.ops.0.conv1.bias", "backbone.up_tr256.ops.0.bn1.weight", "backbone.up_tr256.ops.0.bn1.bias", "backbone.up_tr256.ops.0.bn1.running_mean", "backbone.up_tr256.ops.0.bn1.running_var", "backbone.up_tr256.ops.0.bn1.num_batches_tracked", "backbone.up_tr256.ops.1.conv1.weight", "backbone.up_tr256.ops.1.conv1.bias", "backbone.up_tr256.ops.1.bn1.weight", "backbone.up_tr256.ops.1.bn1.bias", "backbone.up_tr256.ops.1.bn1.running_mean", "backbone.up_tr256.ops.1.bn1.running_var", "backbone.up_tr256.ops.1.bn1.num_batches_tracked", "backbone.up_tr128.up_conv.weight", "backbone.up_tr128.up_conv.bias", "backbone.up_tr128.ops.0.conv1.weight", "backbone.up_tr128.ops.0.conv1.bias", "backbone.up_tr128.ops.0.bn1.weight", "backbone.up_tr128.ops.0.bn1.bias", "backbone.up_tr128.ops.0.bn1.running_mean", "backbone.up_tr128.ops.0.bn1.running_var", "backbone.up_tr128.ops.0.bn1.num_batches_tracked", "backbone.up_tr128.ops.1.conv1.weight", "backbone.up_tr128.ops.1.conv1.bias", 
"backbone.up_tr128.ops.1.bn1.weight", "backbone.up_tr128.ops.1.bn1.bias", "backbone.up_tr128.ops.1.bn1.running_mean", "backbone.up_tr128.ops.1.bn1.running_var", "backbone.up_tr128.ops.1.bn1.num_batches_tracked", "backbone.up_tr64.up_conv.weight", "backbone.up_tr64.up_conv.bias", "backbone.up_tr64.ops.0.conv1.weight", "backbone.up_tr64.ops.0.conv1.bias", "backbone.up_tr64.ops.0.bn1.weight", "backbone.up_tr64.ops.0.bn1.bias", "backbone.up_tr64.ops.0.bn1.running_mean", "backbone.up_tr64.ops.0.bn1.running_var", "backbone.up_tr64.ops.0.bn1.num_batches_tracked", "backbone.up_tr64.ops.1.conv1.weight", "backbone.up_tr64.ops.1.conv1.bias", "backbone.up_tr64.ops.1.bn1.weight", "backbone.up_tr64.ops.1.bn1.bias", "backbone.up_tr64.ops.1.bn1.running_mean", "backbone.up_tr64.ops.1.bn1.running_var", "backbone.up_tr64.ops.1.bn1.num_batches_tracked".
size mismatch for precls_conv.0.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for precls_conv.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for precls_conv.2.weight: copying a param with shape torch.Size([8, 64, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([8, 48, 1, 1, 1]).
size mismatch for GAP.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for GAP.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for GAP.3.weight: copying a param with shape torch.Size([256, 512, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 768, 1, 1, 1]).
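
Errors like the one above usually mean the checkpoint was trained with a different backbone or feature width than the model being constructed. A quick generic diagnostic (not part of the repository's code; the path and top-level key names are assumptions) is to diff the checkpoint keys against the current model's keys:

```python
import torch

def diff_state_dict(model, checkpoint_path: str):
    """Print which keys differ between a checkpoint and an already-built model.

    `model` is an instantiated network (e.g. the repo's Universal_model);
    `checkpoint_path` is a placeholder for your checkpoint file.
    """
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    # Checkpoints in the wild store weights under different top-level keys.
    state_dict = checkpoint.get("net", checkpoint.get("state_dict", checkpoint))
    model_keys, ckpt_keys = set(model.state_dict()), set(state_dict)
    print("unexpected keys (checkpoint only):", sorted(ckpt_keys - model_keys)[:10])
    print("missing keys (model only):", sorted(model_keys - ckpt_keys)[:10])
```

If the unexpected keys all start with a backbone prefix that the current model does not have (as in the dump above), the checkpoint and the `--backbone` setting likely do not match.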

Question about model weights.

Thanks for sharing your well-trained weights.
I ran into two problems when trying them:

  1. The keys of the weights and model.state_dict() do not match well. This is a small problem which I fixed by force-matching the keys.
  2. I tested swinunetr.pth on the BTCV dataset but achieved poor performance (using test.py).

Task01| Spleen: 0.0322, Right Kidney: 0.0960, Left Kidney: 0.0265, Gall Bladder: 0.0925, Esophagus: 0.0000, Liver: 0.4375, Stomach: 0.0439, Aorta: 0.0153, Postcava: 0.0270, Portal Vein and Splenic Vein: 0.0281, Pancreas: 0.0403, Right Adrenal Gland: 0.0000, Left Adrenal Gland: 0.0015, Duodenum: nan,
Task01_2| Spleen: nan, Left Kidney: nan, Gall Bladder: nan, Esophagus: nan, Liver: nan, Stomach: nan, Pancreas: nan, Duodenum: nan,
Task02| Spleen: nan, Left Kidney: nan, Gall Bladder: nan, Esophagus: nan, Liver: nan, Stomach: nan, Pancreas: nan, Duodenum: nan,
Task03| Liver: nan,
Task04| Liver: nan, Liver Tumor: nan,
Task05| Right Kidney: nan, Left Kidney: nan, Kidney Tumor: nan, Kidney Cyst: nan,
Task06| Spleen: nan, Right Kidney: nan, Left Kidney: nan, Gall Bladder: nan, Liver: nan, Stomach: nan, Pancreas: nan, Right Lung: nan, Left Lung: nan,
Task07| Liver: nan, Spleen: nan, Left Kidney: nan, Right Kidney: nan, Stomach: nan, Gall Bladder: nan, Esophagus: nan, Pancreas: nan, Duodenum: nan, Colon: nan, Intestine: nan, Right Adrenal Gland: nan, Left Adrenal Gland: nan, Rectum: nan, Bladder: nan, Left Head of Femur: nan, Right Head of Femur: nan,
Task08| Liver: nan, Right Kidney: nan, Left Kidney: nan, Spleen: nan, Pancreas: nan,
Task09| Spleen: nan, Right Kidney: nan, Left Kidney: nan, Gall Bladder: nan, Esophagus: nan, Liver: nan, Stomach: nan, Aorta: nan, Postcava: nan, Pancreas: nan, Right Adrenal Gland: nan, Left Adrenal Gland: nan, Duodenum: nan, Bladder: nan, Prostate: nan,
Task12| Liver: nan, Bladder: nan, Right Lung: nan, Left Lung: nan, Right Kidney: nan, Left Kidney: nan,
Task13| Liver: nan, Right Kidney: nan, Left Kidney: nan, Spleen: nan, Pancreas: nan, Aorta: nan, Postcava: nan, Stomach: nan, Gall Bladder: nan, Esophagus: nan, Right Adrenal Gland: nan, Left Adrenal Gland: nan, Celiac Truck: nan,
Task14| Pancreas: nan, Pancreas Tumor: nan,
Task10_03| Liver: nan, Liver Tumor: nan,
Task10_06| Lung Tumor: nan,
Task10_07| Pancreas: nan, Pancreas Tumor: nan,
Task10_08| Hepatic Vessel: nan, Hepatic Vessel Tumor: nan,
Task10_09| Spleen: nan,
Task10_10| Colon Tumor: nan,
Task15| Spleen: nan, Right Kidney: nan, Left Kidney: nan, Gall Bladder: nan, Esophagus: nan, Liver: nan, Stomach: nan, Aorta: nan, Postcava: nan, Portal Vein and Splenic Vein: nan, Pancreas: nan, Right Adrenal Gland: nan, Left Adrenal Gland: nan, Duodenum: nan, Hepatic Vessel: nan, Right Lung: nan, Left Lung: nan,
Average | Spleen: 0.0322, Right Kidney: 0.0960, Left Kidney: 0.0265, Gall Bladder: 0.0925, Esophagus: 0.0000, Liver: 0.4375, Stomach: 0.0439, Aorta: 0.0153, Postcava: 0.0270, Portal Vein and Splenic Vein: 0.0281, Pancreas: 0.0403, Right Adrenal Gland: 0.0000, Left Adrenal Gland: 0.0015, Duodenum: nan, Hepatic Vessel: nan, Right Lung: nan, Left Lung: nan, Colon: nan, Intestine: nan, Rectum: nan, Bladder: nan, Prostate: nan, Left Head of Femur: nan, Right Head of Femur: nan, Celiac Truck: nan, Kidney Tumor: nan, Liver Tumor: nan, Pancreas Tumor: nan, Hepatic Vessel Tumor: nan, Lung Tumor: nan, Colon Tumor: nan, Kidney Cyst: nan,
average: nan,

How can I solve it?

Confusion about embedding size

Hi, thank you for your excellent work.

I have a question about the size of the word embedding in this work. Using the CLIP method given in the link, my embedding output is n×512, but your pre-trained embedding size is n×256. Is there something wrong with my input?

Thanks for any help.

BTCV

Sorry to bother you again, about reproducing the BTCV results in the paper. What I haven't figured out is that only the training set and the validation set are used in the five-fold cross-validation, so what is the test set for? Is the test set part of the training set when the entire model is pre-trained? Looking forward to your response.

Confusion about dataset 02 and 03

Hi,
Thanks for your great work! I'm a postgraduate student trying to follow it, and I met some issues during my reproduction.
In dataset 02, I found that the ground truth regions are too small, so the model's dice, recall and precision are all zero in the test stage.
In dataset 03, there were two issues: a mismatch between the annotations and the actual liver positions, and some CT images being completely black.
So I wonder if I missed something when preprocessing the dataset, which I obtained from the link given on GitHub.
Best,
Kevin Chen

Training about convergence problem

Hello, may I ask how many epochs you trained for, and what the dice loss value was at final convergence?

When I trained using three datasets (liver, kidney, and their tumors: 04 LiTS, 05 KiTS, and 10_03 Liver) for 200 epochs, the dice loss at convergence was around 0.7, which I think is very high. However, on the test set the dice for liver and kidney was around 0.96 and for tumor around 0.65. I don't understand why the dice loss was so high at convergence. By conventional reasoning, a converged dice loss of 0.7 corresponds to a dice of (1-0.7=) 0.3, so the test results should also be very low, yet they are normal. Could you please explain? Thank you very much.

Cosine similarity in Figure 1

Nice work!

I am trying to reproduce the cosine similarity you showed in Figure 1. However, the cosine similarity I computed with the CLIP text encoder differs from yours. The code for extracting the text embeddings and computing the cosine similarity is listed below.

```python
import clip
import torch
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics.pairwise import cosine_similarity

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

text = ['Liver Tumor', 'Hepatic Vessel', 'Right Kidney', 'Left Kidney', 'Kidney Tumor', 'Liver']
input_text1 = ['A photo of a ' + i for i in text]
input_text2 = ['There is {} in this computerized tomography'.format(i) for i in text]
input_text3 = ['A computerized tomography of a ' + i for i in text]

with torch.no_grad():
    input_text1 = clip.tokenize(input_text1).to(device)
    features1 = model.encode_text(input_text1)
    np.save('universal_model_clip_v1.npy', features1.cpu().detach().numpy())

    input_text2 = clip.tokenize(input_text2).to(device)
    features2 = model.encode_text(input_text2)
    np.save('universal_model_clip_v2.npy', features2.cpu().detach().numpy())

    input_text3 = clip.tokenize(input_text3).to(device)
    features3 = model.encode_text(input_text3)
    np.save('universal_model_clip_v3.npy', features3.cpu().detach().numpy())

cls_weight = np.load('universal_model_clip_v3.npy')
sim3 = cosine_similarity(cls_weight)
plt.imshow(sim3, cmap='viridis', interpolation='nearest')
plt.colorbar()
plt.show()
```
The similarity is shown in the following figure.

(similarity heatmap omitted)

For example, the similarity score between "Liver Tumor" and "Right Kidney" is close to the score between "Liver Tumor" and "Hepatic Vessel". However, as shown in Figure 1, their values are very different. What method did you use to compute the cosine similarity score? Thanks very much.

Question about 3D dice score

Hello, I would like to ask you some algorithm questions regarding the 3D dice score.

Firstly, concerning the case where the ground truth is entirely black: I noticed that the dice_score function in your utils.py appears to add 1 to the denominator but not to the numerator. With this approach, when both the ground truth and the prediction are entirely black, will the dice score be calculated as 0?
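
To make the question concrete, here is a minimal sketch of the two smoothing conventions being asked about (this is an illustration, not the repository's exact utils.py code): with the smoothing term only in the denominator, an empty prediction against an empty ground truth scores 0/(0+1) = 0, whereas smoothing both terms scores 1.

```python
import torch

def dice_denominator_smoothed(pred: torch.Tensor, gt: torch.Tensor, eps: float = 1.0) -> float:
    """Dice with the smoothing term only in the denominator."""
    intersection = (pred * gt).sum()
    return (2.0 * intersection / (pred.sum() + gt.sum() + eps)).item()

def dice_both_smoothed(pred: torch.Tensor, gt: torch.Tensor, eps: float = 1.0) -> float:
    """Dice with the smoothing term in numerator and denominator."""
    intersection = (pred * gt).sum()
    return ((2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)).item()

empty = torch.zeros(1, 64, 64, 64)
print(dice_denominator_smoothed(empty, empty))  # 0.0 -> empty ground truth scores zero
print(dice_both_smoothed(empty, empty))         # 1.0 -> empty ground truth scores one
```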

Secondly, regarding how the 3D dice score is calculated: in the GitHub repository of the Medical SAM Adapter paper (https://github.com/WuJunde/Medical-SAM-Adapter), I observed that they calculate the dice score for each slice of the 3D volume separately and then average the scores, whereas your dice_score function computes the dice score over the entire volume at once. I believe both methods are reasonable, but the MONAI dice score appears to follow the former approach. So I wanted to ask which algorithm was used during the MSD challenge (the experimental results in your paper), as I need to determine the dice algorithm to assess whether my model achieves state-of-the-art performance.

I appreciate your answers to the above two questions.

About PAOT

Appreciate first

How can I test a single scan and get a visualized result?
I also have trouble running test.py: it seems to fail to find a file ending with .h5.
When I run the Evaluation command (CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume ./out/epoch_61.pth --data_root_path /mnt/zzhou82/PublicAbdominalData/ --store_result --cache_dataset --cache_rate 0.6), it reports FileNotFoundError: [Errno 2] No such file or directory: './out/epoch_61.pth'.

KeyError: 'net' when trying to run test

I was trying to run the test script with the command:

CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume ./out/epoch_61.pth --data_root_path /mnt/zzhou82/PublicAbdominalData/ --store_result --cache_dataset --cache_rate 0.6

and got:

(universal) ~/CLIP-Driven-Universal-Model$ CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume pretrained_weights/swin_unetr.base_5000ep_f48_lr2e-4_pretrained.pt --data_root_path /gpu_home/bori/CLIP-Driven-Universal-Model/data --store_result --cache_dataset --cache_rate 0.6 --dataset_list "pancreas_val"
Traceback (most recent call last):
  File "/gpu_home/bori/CLIP-Driven-Universal-Model/test.py", line 209, in <module>
    main()
  File "/gpu_home/bori/CLIP-Driven-Universal-Model/test.py", line 185, in main
    load_dict = checkpoint['net']
KeyError: 'net'

If I print out the keys:

ipdb> checkpoint.keys()
dict_keys(['epoch', 'best_acc', 'state_dict'])

I tried substituting "net" with the "state_dict" key, but it fails.
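
A hedged workaround sketch (the key names come from the error message and the printed dict_keys above, not from anything verified against the repository): fall back to "state_dict" when "net" is absent and load with strict=False so missing or unexpected keys are reported instead of raising. Note that a SwinUNETR self-supervised pretraining checkpoint is not a full Universal Model checkpoint, so even a successful load may only cover part of the network. `args.resume` and `model` are assumed from the surrounding test.py context.

```python
import torch

checkpoint = torch.load(args.resume, map_location="cpu")
# Some checkpoints store weights under 'net', others under 'state_dict'.
if "net" in checkpoint:
    load_dict = checkpoint["net"]
elif "state_dict" in checkpoint:
    load_dict = checkpoint["state_dict"]
else:
    load_dict = checkpoint  # assume the file is already a bare state_dict

# strict=False tolerates missing/unexpected keys, but the backbone
# architecture still has to match for the weights to be meaningful.
missing, unexpected = model.load_state_dict(load_dict, strict=False)
print("missing:", len(missing), "unexpected:", len(unexpected))
```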

Some info about my setup:

 OS: Ubuntu 22.04 jammy
 Kernel: x86_64 Linux 6.1.12-060112-generic
 Uptime: 39d 3h 50m
 Packages: 2087
 Shell: bash 5.1.16
 Disk: 2.7T / 10T (29%)
 CPU: 13th Gen Intel Core i7-13700KF @ 24x 5.3GHz [53.0°C]
 GPU: NVIDIA RTX A6000
 RAM: 5237MiB / 96377MiB

Python 3.10.13

Dice loss does not decrease during training, and Dice is all 0 during validation

Hello! I am training on BTCV for 1000 epochs. The dice loss oscillates continuously without decreasing, while the CE loss decreases. When I use my checkpoint to validate, the result is as follows:
Spleen: dice 0.0000, recall 0.0000, precision nan
Right Kidney: dice 0.0000, recall 0.0000, precision nan
Left Kidney: dice 0.0000, recall 0.0000, precision nan
Esophagus: dice 0.0000, recall 0.0000, precision nan
Liver: dice 0.0000, recall 0.0000, precision nan
Stomach: dice 0.0000, recall 0.0000, precision nan
Aorta: dice 0.0000, recall 0.0000, precision nan
Postcava: dice 0.0000, recall 0.0000, precision nan
Portal Vein and Splenic Vein: dice 0.0000, recall 0.0000, precision nan
Pancreas: dice 0.0000, recall 0.0000, precision nan
Right Adrenal Gland: dice 0.0000, recall 0.0000, precision nan
Left Adrenal Gland: dice 0.0000, recall 0.0000, precision nan
case01_Multi-Atlas_Labeling/label/label0035| Spleen: 0.0000, Right Kidney: 0.0000, Left Kidney: 0.0000, Esophagus: 0.0000, Liver: 0.0000, Stomach: 0.0000, Aorta: 0.0000, Postcava: 0.0000, Portal Vein and Splenic Vein: 0.0000, Pancreas: 0.0000, Right Adrenal Gland: 0.0000, Left Adrenal Gland: 0.0000,
Have you ever encountered a similar problem?
I hope to receive your reply, thank you!
Here is the training code:
```python
def train(args, train_loader, model, optimizer, loss_seg_DICE, loss_seg_CE):
    model.train()
    loss_bce_ave = 0
    loss_dice_ave = 0
    epoch_iterator = tqdm(
        train_loader, desc="Training (X / X Steps) (loss=X.X)", dynamic_ncols=True
    )
    for step, batch in enumerate(epoch_iterator):
        x, y, name = batch["image"].to(args.device), batch["post_label"].float().to(args.device), batch['name']
        torch.cuda.empty_cache()
        with torch.cuda.amp.autocast():
            logit_map = model(x)
            torch.cuda.empty_cache()

        term_seg_Dice = loss_seg_DICE.forward(logit_map, y, name, TEMPLATE)
        term_seg_BCE = loss_seg_CE.forward(logit_map, y, name, TEMPLATE)
        loss = term_seg_BCE + term_seg_Dice
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        epoch_iterator.set_description(
            "Epoch=%d: Training (%d / %d Steps) (dice_loss=%2.5f, bce_loss=%2.5f)" % (
                args.epoch, step, len(train_loader), term_seg_Dice.item(), term_seg_BCE.item())
        )
        loss_bce_ave += term_seg_BCE.item()
        loss_dice_ave += term_seg_Dice.item()
        torch.cuda.empty_cache()
    print('Epoch=%d: ave_dice_loss=%2.5f, ave_bce_loss=%2.5f' % (args.epoch, loss_dice_ave/len(epoch_iterator), loss_bce_ave/len(epoch_iterator)))

    return loss_dice_ave/len(epoch_iterator), loss_bce_ave/len(epoch_iterator)
```

Question about the hyper-parameters for MSD

For MSD datasets such as Task10, it seems the crop size would be [96, 96, 96]. However, I also noticed that the annotation at https://github.com/ljwztc/CLIP-Driven-Universal-Model/blob/c8e829eee7769fbc3120b9fe7687bb73402dfc87/dataset/dataloader.py#L260C79-L260C81 is 192, 192, 64. Which is the correct size to feed the network? Also, could you provide some logs containing the training hyper-parameters, or more details about them, especially for MSD Task10, 06, and 07? Thank you.

training on parts of the datasets

Hello, I want to train on only some of the datasets in PAOT.txt, such as 01_Multi-Atlas_Labeling, 02_TCIA_Pancreas-CT, and 03_CHAOS. May I ask how to modify label_transfer.py? What should TRANSFER_LIST be? TRANSFER_LIST = ['01', '02', '03']? And do I need to modify num_classes? If so, what should I change it to?
I tried to train on the BTCV dataset, but my dice loss and BCE loss did not decrease. label_transfer.py is set like this:
(screenshot of the settings omitted)

I hope to receive your guidance, thank you!

Organ list

Hello, I would like to ask some questions about training on a new dataset. Can I reassign organ index 1 to another organ? For example, index 1 is originally the spleen, but I would like to define it as the left atrium.

05_KiTS which ground truth labels?

Hi, thanks for the great work!

Could you please tell us which labels you used for the KiTS dataset? In the ground truth, there are three different aggregations:

aggregated_AND_seg.nii.gz
aggregated_MAJ_seg.nii.gz
aggregated_OR_seg.nii.gz

01 BTCV dataset labels for testing subjects

Hi, from the dataset list in PAOT.txt I noticed that for the BTCV dataset, you provided directories for the labels that correspond to the testing subjects (img0061 to img0080). Can I check whether these labels are generated using the pred_pseudo.py file, or if there is another way to access these labels? Thank you!

Question about Table 4

Hi, @ljwztc @MrGiovanni

Thanks for providing such nice work. May I kindly ask two questions:

  1. How do you get the testing images/labels of LiTS and KiTS?
  2. How do you evaluate the performance reported in Table 4?

Best.

How do you use the CLIP model?

Thank you very much for your work. I have some doubts. Where is the text input? The input of the model's forward is an image, and I did not find where different text features are supplied for images of different body parts.

CUDA out of memory inference half way

When testing the second image, I reported an error
33%|████████████████████████ | 1/3 [01:33<03:06, 93.34s/it]
Traceback (most recent call last):
File "mytest.py", line 223, in
main()
File "mytest.py", line 219, in main
validation(model, test_loader, val_transforms, args)
File "mytest.py", line 55, in validation
pred = sliding_window_inference(image, (args.roi_x, args.roi_y, args.roi_z), 1, model, overlap=0.5, mode='gaussian')
File "/python3.8/site-packages/monai/inferers/utils.py", line 215, in sliding_window_inference
output_image_list.append(torch.zeros(output_shape, dtype=compute_dtype, device=device))
RuntimeError: CUDA out of memory. Tried to allocate 2.70 GiB (GPU 0; 11.91 GiB total capacity; 8.87 GiB already allocated; 2.31 GiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I tried with torch.no_grad() and so on, but it is not solved.
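
One common mitigation, sketched below under the assumption that the surrounding snippet uses MONAI's sliding_window_inference: keep the windows on the GPU but aggregate the stitched full-size output on the CPU via the device/sw_device arguments, and lower the overlap so fewer windows are produced. The variable names (image, args, model) follow the traceback above.

```python
import torch
from monai.inferers import sliding_window_inference

with torch.no_grad():
    pred = sliding_window_inference(
        image,                                  # input volume from the test loader
        (args.roi_x, args.roi_y, args.roi_z),   # patch size
        1,                                      # sw_batch_size
        model,
        overlap=0.25,                           # lower overlap -> fewer windows
        mode="gaussian",
        sw_device=torch.device("cuda"),         # run each window on the GPU
        device=torch.device("cpu"),             # stitch the full-size output on the CPU
    )
```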

MSD Leaderboard

Hi,

really great work. I just have a short question: how did you obtain the MRI dataset results on the MSD leaderboard?

Best
Constantin

Question about MSD dataset 10_07

Hi, I would like to ask about utils.py and label_transfer.py.
Your template has 32 classes, but I only need two of them: Pancreas and Pancreas Tumor.
Besides changing the indices, which of the following do I also need to modify?
NUM_CLASS
TEMPLATE
POST_TUMOR_DICT
MERGE_MAPPING_v1
MERGE_MAPPING_v2
TUMOR_ORGAN
the indices inside the functions, e.g.
organ
post_pred_mask
dataset_index
organ_list

Error when validation and test

When running validation or test, the following error is shown:
RuntimeError: Error(s) in loading state_dict for Universal_model:
Unexpected key(s) in state_dict: "swinViT.patch_embed.proj.weight", "swinViT.patch_embed.proj.bias", "swinViT.layers1.0.blocks.0.norm1.weight", "swinViT.layers1.0.blocks.0.norm1.bias", "swinViT.layers1.0.blocks.0.attn.relative_position_bias_table", "swinViT.layers1.0.blocks.0.attn.relative_position_index", "swinViT.layers1.0.blocks.0.attn.qkv.weight", ... (the full list continues through every block of swinViT.layers1 to swinViT.layers4, the encoder1/2/3/4/10 and decoder1 to decoder5 convolution weights, and ends with "0.weight", "0.bias", "2.weight", "2.bias", "3.weight", "3.bias", "weight", "bias").

Dataset

Hello, I find your work very interesting, and I encountered some problems while reproducing it. I noticed that both the paper and the txt files in the dataset_list directory mention a total of 14 datasets, but there seem to be links to only 12 datasets on this README page. Also, the links for labels 08 and 12 are duplicated. Apart from one of your private datasets, should there be links for two more datasets? I would greatly appreciate your help with this.

Clip embedding network.

Is the CLIP text encoder frozen and pretrained as in the original CLIP paper?
If so, considering CLIP is trained on general images, can we expect it to embed medical targets/organs correctly?

Train and Test question

Hello,
When training the model I used the BTCV dataset with
max_epoch=2000
store_num=50
warmup_epoch=100
but when I ran the test at the end, the results were very poor:
Spleen: dice 0.0069, recall 0.9436, precision 0.0035.
Right Kidney: dice 0.0000, recall 0.0000, precision 0.0000.
Left Kidney: dice 0.0000, recall 0.0000, precision 0.0000.
Esophagus: dice 0.0000, recall 0.0000, precision 0.0000.
Liver: dice 0.0533, recall 0.9984, precision 0.0274.
Stomach: dice 0.0000, recall 0.0000, precision 0.0000.
Arota: dice 0.0000, recall 0.0000, precision nan.
Postcava: dice 0.0022, recall 0.9912, precision 0.0011.
Portal Vein and Splenic Vein: dice 0.0006, recall 1.0000, precision 0.0003.
Pancreas: dice 0.0000, recall 0.0000, precision 0.0000.
Right Adrenal Gland: dice 0.0000, recall 0.0000, precision nan.
Left Adrenal Gland: dice 0.0000, recall 0.0000, precision 0.0000.
case01_Multi-Atlas_Labeling/label/label0023| Spleen: 0.0069, Right Kidney: 0.0000, Left Kidney: 0.0000, Esophagus: 0.0000, Liver: 0.0533, Stomach: 0.0000, Arota: 0.0000, Postcava: 0.0022, Portal Vein and Splenic Vein: 0.0006, Pancreas: 0.0000, Right Adrenal Gland: 0.0000, Left Adrenal Gland: 0.0000,
I would like to ask why this happens.

Confuse about word embedding

I am trying to customize this code for my own datasets, but I found that the word_embedding is fixed by loading txt_encoding.pth in the training script.
Is that right? For my own customization, should I modify this .pth file? Is there a good way to train or produce it? I ran into a similar question in the test script.

Thanks for any help.
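
If you need embeddings for your own organ list, a minimal sketch using the openai/CLIP package is shown below. ViT-B/32 outputs 512-dimensional text features; how the repository projects these to the dimensions stored in txt_encoding.pth is not shown here, so treat the prompt template, the class names, and the output file name as assumptions for illustration only.

```python
import clip
import torch

organs = ["Spleen", "Right Kidney", "Left Kidney", "My New Organ"]  # your own class names
device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

prompts = [f"A computerized tomography of a {name}" for name in organs]
with torch.no_grad():
    tokens = clip.tokenize(prompts).to(device)
    text_features = model.encode_text(tokens)      # shape: [len(organs), 512]

torch.save(text_features.cpu(), "my_txt_encoding.pth")  # hypothetical output file name
```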

Inference with pre-trained model?

I am trying to run inference on an external dataset using the pre-trained weights. However, both test.py and pred_pseudo.py refer to the resume checkpoint ./out/Nvidia/old_fold0/aepoch_500.pth.

Is this on purpose or are you planning to upload the checkpoint?

Training time

Hi,

Thank you for your work.

Could you please provide more details regarding the training time?

Best regards,
Abdelrahman

Inference with trained model

Hi, this is really good work.
I just wonder how to use a trained model to test my own dataset, since I don't want to retrain the whole model.
Besides, can the model accept a text input such as the shape and location of the target object, or only a prompt saying that this is a xxx?

Table 3. Benchmark on BTCV validation dataset

Hi, for the benchmark in Table 3 of your paper, I would like to know how many epochs you used to train the different models. Since the performance of SwinUNETR is similar to the one in the official repository, I suppose you used 5000 epochs for each model. Thanks in advance for the answer.

Question about 02 TCIA Pancreas Dataset

Hi, I am trying to replicate your code and am downloading the datasets. However, when I arranged the datasets following dataset/dataset_list/PAOT.txt, the 02 Pancreas-CT (TCIA) data came as DICOM images of each slice. Can I check whether you have any preprocessing code to convert the DICOM format into .nii.gz? Thank you!

Question about AbdomenCT-1K

Thanks for your great work.
I am collecting the AbdomenCT-1K dataset for "08_AbdomenCT-1K" and "13_AbdomenCT-12organ".
I can find 722 cases in link and 50 cases in link, but your paper refers to 1k cases, and PAOT.txt lists Case_00001_0000 to Case_01062_0000 for 08_AbdomenCT-1K.
So how do I get the 1k cases for 08_AbdomenCT-1K?

Thanks.

Not able to pre process the dataset 04 LiTS

python -W ignore label_transfer.py
train len 131
Traceback (most recent call last):
  File "label_transfer.py", line 290, in <module>
    for index, batch in enumerate(train_loader):
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1224, in _next_data
    return self._process_data(data)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 89, in apply_transform
    return _apply_transform(transform, data, unpack_items)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 53, in _apply_transform
    return transform(parameters)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/io/dictionary.py", line 131, in __call__
    data = self.loader(d[key], reader)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/io/array.py", line 213, in __call__
    img = reader.read(filename)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/data/image_reader.py", line 421, in read
    img = nib.load(name, **kwargs_)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/nibabel/loadsave.py", line 115, in load
    raise ImageFileError(msg)
nibabel.filebasedimages.ImageFileError: File /home/amit_g/scr/datasets/clip/04_LiTS/label/liver_0.nii.gz is not a gzip file

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 89, in apply_transform
    return _apply_transform(transform, data, unpack_items)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 53, in _apply_transform
    return transform(parameters)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/compose.py", line 173, in __call__
    input_ = apply_transform(_transform, input_, self.map_items, self.unpack_items, self.log_stats)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 113, in apply_transform
    raise RuntimeError(f"applying transform {transform}") from e
RuntimeError: applying transform <monai.transforms.io.dictionary.LoadImaged object at 0x7fdab8d054f0>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/data/dataset.py", line 97, in __getitem__
    return self._transform(index)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/data/dataset.py", line 83, in _transform
    return apply_transform(self.transform, data_i) if self.transform is not None else data_i
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 113, in apply_transform
    raise RuntimeError(f"applying transform {transform}") from e
RuntimeError: applying transform <monai.transforms.compose.Compose object at 0x7fdab746bfd0>

question about implementation details

(screenshot of the paper's implementation details omitted)
Hi,
In your paper you said that you use a batch size of 6 with a patch size of 96 × 96 × 96 per NVIDIA RTX A5000 device, which has 24 GB of memory. But when I use a batch size of 2 with a patch size of 96 × 96 × 96 on an RTX A5000, I encounter CUDA out of memory. I tried three backbones (U-Net, SwinUNETR, UNet++) and hit the same issue.

preprocessing for SwinUnetr backbone

Impressive work!

May I ask what image size you crop to for the Swin UNETR backbone?
In the code, the default backbone is U-Net and the crop size is 192, 192, 64, which seems intended for the 3D U-Net; does it also fit Swin UNETR?
I tried this size for Swin UNETR following the NVIDIA GitHub repo, and it shows some weird results and does not converge.

Would you mind providing the training code or preprocessing code for the Swin UNETR backbone?

BTCV benchmark dataset partition

Hi, first of all, I appreciate your work.
In Table 3 of your paper there is a benchmark on the BTCV validation set. Is it possible to share the train/validation split you used?
Or did you use 5-fold cross-validation on the 30 available CT scans and report the results of the best fold, or the average per-organ validation accuracy across the five folds?
Thanks in advance for the answer.

Universal Pretrained Weight

Hello, I want to check whether the pretrained weight at https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/swin_unetr.base_5000ep_f48_lr2e-4_pretrained.pt is the Universal Model pretrained weight.

For the results of BTCV in the reproduction paper

Regarding reproducing the BTCV results in the paper, I have some questions. For the 5-fold cross-validation you mentioned, is it done by running train.py and val.py five times, according to the split in BTCV.json? Looking forward to your reply.

Code for tumor detection

It seems the code in test.py is for segmentation. Can you release the code for tumor detection?
