
clip-driven-universal-model's Introduction

News

CLIP-Driven Universal Model

Paper

This repository provides the official implementation of Universal Model.

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection
${\color{red} {\textbf{Rank First in Medical Segmentation Decathlon (MSD) Competition}}}$ (see leaderboard)
Jie Liu1, Yixiao Zhang2, Jie-Neng Chen2, Junfei Xiao2, Yongyi Lu2,
Yixuan Yuan1, Alan Yuille2, Yucheng Tang3, Zongwei Zhou2
1 City University of Hong Kong, 2 Johns Hopkins University, 3 NVIDIA
ICCV, 2023
paper | code | slides | poster | talk | blog

Large Language-Image Model for Multi-Organ Segmentation and Cancer Detection from Computed Tomography
Jie Liu1, Yixiao Zhang2, Jie-Neng Chen2, Junfei Xiao2, Yongyi Lu2,
Yixuan Yuan1, Alan Yuille2, Yucheng Tang3, Zongwei Zhou2
1 City University of Hong Kong, 2 Johns Hopkins University, 3 NVIDIA
RSNA, 2023
abstract | code | slides

Model

Architecture | Params | Download
U-Net | 19.08M | link
Swin UNETR | 62.19M | link

Dataset

The post_label can be downloaded via link.

Direct Inference on Your Own CT Scans

  1. Put all your CT scans (files ending in .nii.gz) in one directory. For example, /home/data/ct/.
  2. Run the following commands.
conda create -n universalmodel python=3.7
conda activate universalmodel
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113 
## please modify according to the CUDA version in your server
pip install 'monai[all]'
pip install -r requirements.txt
cd pretrained_weights/
wget https://www.dropbox.com/s/jdsodw2vemsy8sz/swinunetr.pth
python pred_pseudo.py --data_root_path PATH_TO_IMG_DIR --result_save_path PATH_TO_result_DIR 
## For example: python pred_pseudo.py --data_root_path /home/data/ct/ --result_save_path /home/data/result

0. Preliminary

python3 -m venv universal
source /data/zzhou82/environments/universal/bin/activate

git clone https://github.com/ljwztc/CLIP-Driven-Universal-Model.git
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install 'monai[all]'
pip install -r requirements.txt
cd pretrained_weights/
wget https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/swin_unetr.base_5000ep_f48_lr2e-4_pretrained.pt
wget https://www.dropbox.com/s/lh5kuyjxwjsxjpl/Genesis_Chest_CT.pt
cd ../

Dataset Pre-Process

  1. Download the datasets via the dataset links and arrange them according to dataset/dataset_list/PAOT.txt.
  2. Modify ORGAN_DATASET_DIR and NUM_WORKER in label_transfer.py (see the excerpt below).
  3. python -W ignore label_transfer.py
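
For step 2, the edit is just two module-level variables near the top of label_transfer.py; the values below are placeholders for your own environment, not the repository defaults.

```python
# label_transfer.py (excerpt; values are placeholders for your setup)
ORGAN_DATASET_DIR = "/home/data/PublicAbdominalData/"  # root folder arranged as in PAOT.txt
NUM_WORKER = 8                                         # dataloader worker processes
```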

Current Template

Index | Organ | Index | Organ
1 | Spleen | 17 | Left Lung
2 | Right Kidney | 18 | Colon
3 | Left Kidney | 19 | Intestine
4 | Gall Bladder | 20 | Rectum
5 | Esophagus | 21 | Bladder
6 | Liver | 22 | Prostate
7 | Stomach | 23 | Left Head of Femur
8 | Aorta | 24 | Right Head of Femur
9 | Postcava | 25 | Celiac Trunk
10 | Portal Vein and Splenic Vein | 26 | Kidney Tumor
11 | Pancreas | 27 | Liver Tumor
12 | Right Adrenal Gland | 28 | Pancreas Tumor
13 | Left Adrenal Gland | 29 | Hepatic Vessel Tumor
14 | Duodenum | 30 | Lung Tumor
15 | Hepatic Vessel | 31 | Colon Tumor
16 | Right Lung | 32 | Kidney Cyst

How to extend to a new dataset with new organs?

  1. Assign the next available index to the new organ (e.g., 33 for vermiform appendix).
  2. Check whether any organs in the dataset are not split into left and right (e.g., kidney, lung). RL_Splitd in label_transfer.py is used to process this case.
  3. Set up a new transfer list for the dataset in TEMPLATE (line 58 in label_transfer.py). For example, if the new dataset labels Intestine as 1 and vermiform appendix as 2, set the transfer list to [19, 33]; see the sketch after this list.
  4. Run label_transfer.py to generate the new post-processed labels.
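
To make step 3 concrete, here is a minimal, purely illustrative sketch of what a transfer list does conceptually: it maps each dataset-specific label value to the universal template index. The function and array names are hypothetical; the real mapping logic lives in label_transfer.py.

```python
import numpy as np

# Hypothetical example: a new dataset labels Intestine as 1 and vermiform
# appendix as 2; the universal template uses indices 19 and 33 for them.
transfer_list = [19, 33]  # position i -> universal index for dataset label i+1

def to_universal(label_volume: np.ndarray, transfer_list) -> np.ndarray:
    """Remap dataset-specific label values to universal template indices."""
    universal = np.zeros_like(label_volume)
    for dataset_value, universal_index in enumerate(transfer_list, start=1):
        universal[label_volume == dataset_value] = universal_index
    return universal

# Toy 1x1x4 volume with labels 0 (background), 1 (Intestine), 2 (appendix).
toy = np.array([[[0, 1, 2, 1]]])
print(to_universal(toy, transfer_list))   # [[[ 0 19 33 19]]]
```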

For more details, please take a look at the common questions.

1. Training

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -W ignore -m torch.distributed.launch --nproc_per_node=8 --master_port=1234 train.py --dist True --data_root_path /mnt/zzhou82/PublicAbdominalData/ --num_workers 12 --num_samples 4 --cache_dataset --cache_rate 0.6 --uniform_sample

2. Validation

CUDA_VISIBLE_DEVICES=0 python -W ignore validation.py --data_root_path /mnt/zzhou82/PublicAbdominalData/ --start_epoch 10 --end_epoch 40 --epoch_interval 10 --cache_dataset --cache_rate 0.6

3. Evaluation

CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume ./out/epoch_61.pth --data_root_path /mnt/zzhou82/PublicAbdominalData/ --store_result --cache_dataset --cache_rate 0.6

Todo

  • Code release
  • Dataset link
  • Support different backbones (SwinUNETR, Unet, DiNTS, Unet++)
  • Model release
  • Pseudo label release
  • Tutorials for Inference

Acknowledgement

A lot of code is modified from . This work was supported by the Lustgarten Foundation for Pancreatic Cancer Research and partially by the Patrick J. McGovern Foundation Award. We appreciate the effort of the MONAI Team to provide open-source code for the community.

Citation

If you find this repository useful, please consider citing this paper:

@article{liu2023clip,
  title={CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection},
  author={Liu, Jie and Zhang, Yixiao and Chen, Jie-Neng and Xiao, Junfei and Lu, Yongyi and Landman, Bennett A and Yuan, Yixuan and Yuille, Alan and Tang, Yucheng and Zhou, Zongwei},
  journal={arXiv preprint arXiv:2301.00785},
  year={2023}
}

clip-driven-universal-model's People

Contributors

bingogome, ljwztc, mrgiovanni, tangy5, victorbutoi, zac2049


clip-driven-universal-model's Issues

The installation problems.

Hi, thanks so much for sharing such awesome work!
Unfortunately, I encountered some issues during the installation. Here are my problems:

  1. After performing the step “source /xxx/universal/bin/activate”, I checked the Python version; it is 3.11.5. Is this correct?
  2. Then I tried to run the step "pip install torch==1.11.0+cu113...", but it reports "ERROR: Could not find a version that satisfies the requirement...". So I went to the website, downloaded the torch-1.11.0+cu113-cp310-cp310-linux_x86_64.whl package, and installed it. However, that package targets Python 3.10. Will this cause a conflict?
  3. I can't install hyp5=3.6.0, only version 3.1.0. Will this version run successfully?
  4. After I finish the installation and run "python -W ignore label_transfer.py", it shows:
    (screenshot of the error omitted)
    I think maybe the PyTorch version is not correct?

Waiting for your reply! Thank you so much!

GPU memory fluctuations

Hi, thank you for this great work!
I met a problem while using SwinUNETR as the backbone:
the GPU memory usage suddenly increases and exceeds the VRAM of my GPU; during inference there are also significant fluctuations in GPU memory.
Could you give me a clue so I can solve this problem?

Inference Issue with unet.pth

I am facing this issue and haven't been able to figure out the reason. Please advise.

File "./clip/test.py", line 197, in main
model.load_state_dict(store_dict)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Universal_model:
Unexpected key(s) in state_dict: "backbone.down_tr64.ops.0.conv1.weight", "backbone.down_tr64.ops.0.conv1.bias", "backbone.down_tr64.ops.0.bn1.weight", "backbone.down_tr64.ops.0.bn1.bias", "backbone.down_tr64.ops.0.bn1.running_mean", "backbone.down_tr64.ops.0.bn1.running_var", "backbone.down_tr64.ops.0.bn1.num_batches_tracked", "backbone.down_tr64.ops.1.conv1.weight", "backbone.down_tr64.ops.1.conv1.bias", "backbone.down_tr64.ops.1.bn1.weight", "backbone.down_tr64.ops.1.bn1.bias", "backbone.down_tr64.ops.1.bn1.running_mean", "backbone.down_tr64.ops.1.bn1.running_var", "backbone.down_tr64.ops.1.bn1.num_batches_tracked", "backbone.down_tr128.ops.0.conv1.weight", "backbone.down_tr128.ops.0.conv1.bias", "backbone.down_tr128.ops.0.bn1.weight", "backbone.down_tr128.ops.0.bn1.bias", "backbone.down_tr128.ops.0.bn1.running_mean", "backbone.down_tr128.ops.0.bn1.running_var", "backbone.down_tr128.ops.0.bn1.num_batches_tracked", "backbone.down_tr128.ops.1.conv1.weight", "backbone.down_tr128.ops.1.conv1.bias", "backbone.down_tr128.ops.1.bn1.weight", "backbone.down_tr128.ops.1.bn1.bias", "backbone.down_tr128.ops.1.bn1.running_mean", "backbone.down_tr128.ops.1.bn1.running_var", "backbone.down_tr128.ops.1.bn1.num_batches_tracked", "backbone.down_tr256.ops.0.conv1.weight", "backbone.down_tr256.ops.0.conv1.bias", "backbone.down_tr256.ops.0.bn1.weight", "backbone.down_tr256.ops.0.bn1.bias", "backbone.down_tr256.ops.0.bn1.running_mean", "backbone.down_tr256.ops.0.bn1.running_var", "backbone.down_tr256.ops.0.bn1.num_batches_tracked", "backbone.down_tr256.ops.1.conv1.weight", "backbone.down_tr256.ops.1.conv1.bias", "backbone.down_tr256.ops.1.bn1.weight", "backbone.down_tr256.ops.1.bn1.bias", "backbone.down_tr256.ops.1.bn1.running_mean", "backbone.down_tr256.ops.1.bn1.running_var", "backbone.down_tr256.ops.1.bn1.num_batches_tracked", "backbone.down_tr512.ops.0.conv1.weight", "backbone.down_tr512.ops.0.conv1.bias", "backbone.down_tr512.ops.0.bn1.weight", "backbone.down_tr512.ops.0.bn1.bias", "backbone.down_tr512.ops.0.bn1.running_mean", "backbone.down_tr512.ops.0.bn1.running_var", "backbone.down_tr512.ops.0.bn1.num_batches_tracked", "backbone.down_tr512.ops.1.conv1.weight", "backbone.down_tr512.ops.1.conv1.bias", "backbone.down_tr512.ops.1.bn1.weight", "backbone.down_tr512.ops.1.bn1.bias", "backbone.down_tr512.ops.1.bn1.running_mean", "backbone.down_tr512.ops.1.bn1.running_var", "backbone.down_tr512.ops.1.bn1.num_batches_tracked", "backbone.up_tr256.up_conv.weight", "backbone.up_tr256.up_conv.bias", "backbone.up_tr256.ops.0.conv1.weight", "backbone.up_tr256.ops.0.conv1.bias", "backbone.up_tr256.ops.0.bn1.weight", "backbone.up_tr256.ops.0.bn1.bias", "backbone.up_tr256.ops.0.bn1.running_mean", "backbone.up_tr256.ops.0.bn1.running_var", "backbone.up_tr256.ops.0.bn1.num_batches_tracked", "backbone.up_tr256.ops.1.conv1.weight", "backbone.up_tr256.ops.1.conv1.bias", "backbone.up_tr256.ops.1.bn1.weight", "backbone.up_tr256.ops.1.bn1.bias", "backbone.up_tr256.ops.1.bn1.running_mean", "backbone.up_tr256.ops.1.bn1.running_var", "backbone.up_tr256.ops.1.bn1.num_batches_tracked", "backbone.up_tr128.up_conv.weight", "backbone.up_tr128.up_conv.bias", "backbone.up_tr128.ops.0.conv1.weight", "backbone.up_tr128.ops.0.conv1.bias", "backbone.up_tr128.ops.0.bn1.weight", "backbone.up_tr128.ops.0.bn1.bias", "backbone.up_tr128.ops.0.bn1.running_mean", "backbone.up_tr128.ops.0.bn1.running_var", "backbone.up_tr128.ops.0.bn1.num_batches_tracked", "backbone.up_tr128.ops.1.conv1.weight", "backbone.up_tr128.ops.1.conv1.bias", 
"backbone.up_tr128.ops.1.bn1.weight", "backbone.up_tr128.ops.1.bn1.bias", "backbone.up_tr128.ops.1.bn1.running_mean", "backbone.up_tr128.ops.1.bn1.running_var", "backbone.up_tr128.ops.1.bn1.num_batches_tracked", "backbone.up_tr64.up_conv.weight", "backbone.up_tr64.up_conv.bias", "backbone.up_tr64.ops.0.conv1.weight", "backbone.up_tr64.ops.0.conv1.bias", "backbone.up_tr64.ops.0.bn1.weight", "backbone.up_tr64.ops.0.bn1.bias", "backbone.up_tr64.ops.0.bn1.running_mean", "backbone.up_tr64.ops.0.bn1.running_var", "backbone.up_tr64.ops.0.bn1.num_batches_tracked", "backbone.up_tr64.ops.1.conv1.weight", "backbone.up_tr64.ops.1.conv1.bias", "backbone.up_tr64.ops.1.bn1.weight", "backbone.up_tr64.ops.1.bn1.bias", "backbone.up_tr64.ops.1.bn1.running_mean", "backbone.up_tr64.ops.1.bn1.running_var", "backbone.up_tr64.ops.1.bn1.num_batches_tracked".
size mismatch for precls_conv.0.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for precls_conv.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for precls_conv.2.weight: copying a param with shape torch.Size([8, 64, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([8, 48, 1, 1, 1]).
size mismatch for GAP.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for GAP.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for GAP.3.weight: copying a param with shape torch.Size([256, 512, 1, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 768, 1, 1, 1]).
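
Errors like the one above usually mean the checkpoint was trained with a different backbone or feature width than the model being constructed. A quick generic diagnostic (not part of the repository's code; the path and top-level key names are assumptions) is to diff the checkpoint keys against the current model's keys:

```python
import torch

def diff_state_dict(model, checkpoint_path: str):
    """Print which keys differ between a checkpoint and an already-built model.

    `model` is an instantiated network (e.g. the repo's Universal_model);
    `checkpoint_path` is a placeholder for your checkpoint file.
    """
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    # Checkpoints in the wild store weights under different top-level keys.
    state_dict = checkpoint.get("net", checkpoint.get("state_dict", checkpoint))
    model_keys, ckpt_keys = set(model.state_dict()), set(state_dict)
    print("unexpected keys (checkpoint only):", sorted(ckpt_keys - model_keys)[:10])
    print("missing keys (model only):", sorted(model_keys - ckpt_keys)[:10])
```

If the unexpected keys all start with a backbone prefix that the current model does not have (as in the dump above), the checkpoint and the `--backbone` setting likely do not match.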

Question about model weights.

Thanks for sharing your well-trained weights.
I ran into two problems when trying them:

  1. The keys of the weights and model.state_dict() do not match well. This is a small problem which I fixed by force-matching the keys.
  2. I tested swinunetr.pth on the BTCV dataset but achieved poor performance (using test.py).

Task01| Spleen: 0.0322, Right Kidney: 0.0960, Left Kidney: 0.0265, Gall Bladder: 0.0925, Esophagus: 0.0000, Liver: 0.4375, Stomach: 0.0439, Aorta: 0.0153, Postcava: 0.0270, Portal Vein and Splenic Vein: 0.0281, Pancreas: 0.0403, Right Adrenal Gland: 0.0000, Left Adrenal Gland: 0.0015, Duodenum: nan,
Task01_2| Spleen: nan, Left Kidney: nan, Gall Bladder: nan, Esophagus: nan, Liver: nan, Stomach: nan, Pancreas: nan, Duodenum: nan,
Task02| Spleen: nan, Left Kidney: nan, Gall Bladder: nan, Esophagus: nan, Liver: nan, Stomach: nan, Pancreas: nan, Duodenum: nan,
Task03| Liver: nan,
Task04| Liver: nan, Liver Tumor: nan,
Task05| Right Kidney: nan, Left Kidney: nan, Kidney Tumor: nan, Kidney Cyst: nan,
Task06| Spleen: nan, Right Kidney: nan, Left Kidney: nan, Gall Bladder: nan, Liver: nan, Stomach: nan, Pancreas: nan, Right Lung: nan, Left Lung: nan,
Task07| Liver: nan, Spleen: nan, Left Kidney: nan, Right Kidney: nan, Stomach: nan, Gall Bladder: nan, Esophagus: nan, Pancreas: nan, Duodenum: nan, Colon: nan, Intestine: nan, Right Adrenal Gland: nan, Left Adrenal Gland: nan, Rectum: nan, Bladder: nan, Left Head of Femur: nan, Right Head of Femur: nan,
Task08| Liver: nan, Right Kidney: nan, Left Kidney: nan, Spleen: nan, Pancreas: nan,
Task09| Spleen: nan, Right Kidney: nan, Left Kidney: nan, Gall Bladder: nan, Esophagus: nan, Liver: nan, Stomach: nan, Aorta: nan, Postcava: nan, Pancreas: nan, Right Adrenal Gland: nan, Left Adrenal Gland: nan, Duodenum: nan, Bladder: nan, Prostate: nan,
Task12| Liver: nan, Bladder: nan, Right Lung: nan, Left Lung: nan, Right Kidney: nan, Left Kidney: nan,
Task13| Liver: nan, Right Kidney: nan, Left Kidney: nan, Spleen: nan, Pancreas: nan, Aorta: nan, Postcava: nan, Stomach: nan, Gall Bladder: nan, Esophagus: nan, Right Adrenal Gland: nan, Left Adrenal Gland: nan, Celiac Truck: nan,
Task14| Pancreas: nan, Pancreas Tumor: nan,
Task10_03| Liver: nan, Liver Tumor: nan,
Task10_06| Lung Tumor: nan,
Task10_07| Pancreas: nan, Pancreas Tumor: nan,
Task10_08| Hepatic Vessel: nan, Hepatic Vessel Tumor: nan,
Task10_09| Spleen: nan,
Task10_10| Colon Tumor: nan,
Task15| Spleen: nan, Right Kidney: nan, Left Kidney: nan, Gall Bladder: nan, Esophagus: nan, Liver: nan, Stomach: nan, Aorta: nan, Postcava: nan, Portal Vein and Splenic Vein: nan, Pancreas: nan, Right Adrenal Gland: nan, Left Adrenal Gland: nan, Duodenum: nan, Hepatic Vessel: nan, Right Lung: nan, Left Lung: nan,
Average | Spleen: 0.0322, Right Kidney: 0.0960, Left Kidney: 0.0265, Gall Bladder: 0.0925, Esophagus: 0.0000, Liver: 0.4375, Stomach: 0.0439, Aorta: 0.0153, Postcava: 0.0270, Portal Vein and Splenic Vein: 0.0281, Pancreas: 0.0403, Right Adrenal Gland: 0.0000, Left Adrenal Gland: 0.0015, Duodenum: nan, Hepatic Vessel: nan, Right Lung: nan, Left Lung: nan, Colon: nan, Intestine: nan, Rectum: nan, Bladder: nan, Prostate: nan, Left Head of Femur: nan, Right Head of Femur: nan, Celiac Truck: nan, Kidney Tumor: nan, Liver Tumor: nan, Pancreas Tumor: nan, Hepatic Vessel Tumor: nan, Lung Tumor: nan, Colon Tumor: nan, Kidney Cyst: nan,
average: nan,

How can I solve it?

Confusion about embedding size

Hi, thank you for your excellent work.

I have a question about the size of the word embedding in this work. Using the CLIP method given in the link, my embedding output is n×512, but your pre-trained embedding size is n×256. Is there something wrong with my input?

Thanks for any help.

BTCV

Sorry to bother you again, about reproducing the BTCV results in the paper. What I haven't figured out is that only the training set and the validation set are used in the five-fold cross-validation, so what is the test set for? Is the test set part of the training set when the entire model is pre-trained? Looking forward to your response.

Confusion about dataset 02 and 03

Hi,
Thanks for your great work! I'm a postgraduate student trying to follow it, and I met some issues during my reproduction.
In dataset 02, I found that the ground truth regions are too small, so the model's dice, recall and precision are all zero in the test stage.
In dataset 03, there were two issues: a mismatch between the annotations and the actual liver positions, and some CT images being completely black.
So I wonder if I missed something when preprocessing the dataset, which I obtained from the link given on GitHub.
Best,
Kevin Chen

Training about convergence problem

Hello, may I ask how many epochs you trained for, and what the dice loss value was at final convergence?

When I trained using three datasets (liver, kidney, and their tumors: 04 LiTS, 05 KiTS, and 10_03 Liver) for 200 epochs, the dice loss at convergence was around 0.7, which I think is very high. However, on the test set the dice for liver and kidney was around 0.96 and for tumor around 0.65. I don't understand why the dice loss was so high at convergence. By conventional reasoning, a converged dice loss of 0.7 corresponds to a dice of (1-0.7=) 0.3, so the test results should also be very low, yet they are normal. Could you please explain? Thank you very much.

Cosine similarity in Figure 1

Nice work!

I am trying to reproduce the cosine similarity you showed in Figure 1. However, the cosine similarity I computed with the CLIP text encoder differs from yours. The code for extracting the text embeddings and computing the cosine similarity is listed below.

```python
import clip
import torch
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics.pairwise import cosine_similarity

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

text = ['Liver Tumor', 'Hepatic Vessel', 'Right Kidney', 'Left Kidney', 'Kidney Tumor', 'Liver']
input_text1 = ['A photo of a ' + i for i in text]
input_text2 = ['There is {} in this computerized tomography'.format(i) for i in text]
input_text3 = ['A computerized tomography of a ' + i for i in text]

with torch.no_grad():
    input_text1 = clip.tokenize(input_text1).to(device)
    features1 = model.encode_text(input_text1)
    np.save('universal_model_clip_v1.npy', features1.cpu().detach().numpy())

    input_text2 = clip.tokenize(input_text2).to(device)
    features2 = model.encode_text(input_text2)
    np.save('universal_model_clip_v2.npy', features2.cpu().detach().numpy())

    input_text3 = clip.tokenize(input_text3).to(device)
    features3 = model.encode_text(input_text3)
    np.save('universal_model_clip_v3.npy', features3.cpu().detach().numpy())

cls_weight = np.load('universal_model_clip_v3.npy')
sim3 = cosine_similarity(cls_weight)
plt.imshow(sim3, cmap='viridis', interpolation='nearest')
plt.colorbar()
plt.show()
```
The similarity is shown in the following figure.

(similarity heatmap omitted)

For example, the similarity score between "Liver Tumor" and "Right Kidney" is close to the score between "Liver Tumor" and "Hepatic Vessel". However, as shown in Figure 1, their values are very different. What method did you use to compute the cosine similarity score? Thanks very much.

Question about 3D dice score

Hello, I would like to ask you some algorithm questions regarding the 3D dice score.

Firstly, concerning the case where the ground truth is entirely black: I noticed that the dice_score function in your utils.py appears to add 1 to the denominator but not to the numerator. With this approach, when both the ground truth and the prediction are entirely black, will the dice score be calculated as 0?
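
To make the question concrete, here is a minimal sketch of the two smoothing conventions being asked about (this is an illustration, not the repository's exact utils.py code): with the smoothing term only in the denominator, an empty prediction against an empty ground truth scores 0/(0+1) = 0, whereas smoothing both terms scores 1.

```python
import torch

def dice_denominator_smoothed(pred: torch.Tensor, gt: torch.Tensor, eps: float = 1.0) -> float:
    """Dice with the smoothing term only in the denominator."""
    intersection = (pred * gt).sum()
    return (2.0 * intersection / (pred.sum() + gt.sum() + eps)).item()

def dice_both_smoothed(pred: torch.Tensor, gt: torch.Tensor, eps: float = 1.0) -> float:
    """Dice with the smoothing term in numerator and denominator."""
    intersection = (pred * gt).sum()
    return ((2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)).item()

empty = torch.zeros(1, 64, 64, 64)
print(dice_denominator_smoothed(empty, empty))  # 0.0 -> empty ground truth scores zero
print(dice_both_smoothed(empty, empty))         # 1.0 -> empty ground truth scores one
```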

Secondly, regarding how the 3D dice score is calculated: in the GitHub repository of the Medical SAM Adapter paper (https://github.com/WuJunde/Medical-SAM-Adapter), I observed that they calculate the dice score for each slice of the 3D volume separately and then average the scores, whereas your dice_score function computes the dice score over the entire volume at once. I believe both methods are reasonable, but the MONAI dice score appears to follow the former approach. So I wanted to ask which algorithm was used during the MSD challenge (the experimental results in your paper), as I need to determine the dice algorithm to assess whether my model achieves state-of-the-art performance.

I appreciate your answers to the above two questions.

About PAOT

Appreciate first

How can I test a single scan and get a visualized result?
I also have trouble running test.py: it seems to fail to find a file ending with .h5.
When I run the Evaluation command (CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume ./out/epoch_61.pth --data_root_path /mnt/zzhou82/PublicAbdominalData/ --store_result --cache_dataset --cache_rate 0.6), it reports FileNotFoundError: [Errno 2] No such file or directory: './out/epoch_61.pth'.

KeyError: 'net' when trying to run test

I was trying to run the test script with the command:

CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume ./out/epoch_61.pth --data_root_path /mnt/zzhou82/PublicAbdominalData/ --store_result --cache_dataset --cache_rate 0.6

and got:

(universal) ~/CLIP-Driven-Universal-Model$ CUDA_VISIBLE_DEVICES=0 python -W ignore test.py --resume pretrained_weights/swin_unetr.base_5000ep_f48_lr2e-4_pretrained.pt --data_root_path /gpu_home/bori/CLIP-Driven-Universal-Model/data --store_result --cache_dataset --cache_rate 0.6 --dataset_list "pancreas_val"
Traceback (most recent call last):
  File "/gpu_home/bori/CLIP-Driven-Universal-Model/test.py", line 209, in <module>
    main()
  File "/gpu_home/bori/CLIP-Driven-Universal-Model/test.py", line 185, in main
    load_dict = checkpoint['net']
KeyError: 'net'

If I print out the keys:

ipdb> checkpoint.keys()
dict_keys(['epoch', 'best_acc', 'state_dict'])

I tried substituting "net" with the "state_dict" key, but it fails.
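
A hedged workaround sketch (the key names come from the error message and the printed dict_keys above, not from anything verified against the repository): fall back to "state_dict" when "net" is absent and load with strict=False so missing or unexpected keys are reported instead of raising. Note that a SwinUNETR self-supervised pretraining checkpoint is not a full Universal Model checkpoint, so even a successful load may only cover part of the network. `args.resume` and `model` are assumed from the surrounding test.py context.

```python
import torch

checkpoint = torch.load(args.resume, map_location="cpu")
# Some checkpoints store weights under 'net', others under 'state_dict'.
if "net" in checkpoint:
    load_dict = checkpoint["net"]
elif "state_dict" in checkpoint:
    load_dict = checkpoint["state_dict"]
else:
    load_dict = checkpoint  # assume the file is already a bare state_dict

# strict=False tolerates missing/unexpected keys, but the backbone
# architecture still has to match for the weights to be meaningful.
missing, unexpected = model.load_state_dict(load_dict, strict=False)
print("missing:", len(missing), "unexpected:", len(unexpected))
```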

Some info about my setup:

 OS: Ubuntu 22.04 jammy
 Kernel: x86_64 Linux 6.1.12-060112-generic
 Uptime: 39d 3h 50m
 Packages: 2087
 Shell: bash 5.1.16
 Disk: 2.7T / 10T (29%)
 CPU: 13th Gen Intel Core i7-13700KF @ 24x 5.3GHz [53.0°C]
 GPU: NVIDIA RTX A6000
 RAM: 5237MiB / 96377MiB

Python 3.10.13

Dice loss does not decrease during training, and Dice is all 0 during validation

Hello! I am training on BTCV for 1000 epochs. The dice loss oscillates continuously without decreasing, while the CE loss decreases. When I use my checkpoint to validate, the result is as follows:
Spleen: dice 0.0000, recall 0.0000, precision nan
Right Kidney: dice 0.0000, recall 0.0000, precision nan
Left Kidney: dice 0.0000, recall 0.0000, precision nan
Esophagus: dice 0.0000, recall 0.0000, precision nan
Liver: dice 0.0000, recall 0.0000, precision nan
Stomach: dice 0.0000, recall 0.0000, precision nan
Aorta: dice 0.0000, recall 0.0000, precision nan
Postcava: dice 0.0000, recall 0.0000, precision nan
Portal Vein and Splenic Vein: dice 0.0000, recall 0.0000, precision nan
Pancreas: dice 0.0000, recall 0.0000, precision nan
Right Adrenal Gland: dice 0.0000, recall 0.0000, precision nan
Left Adrenal Gland: dice 0.0000, recall 0.0000, precision nan
case01_Multi-Atlas_Labeling/label/label0035| Spleen: 0.0000, Right Kidney: 0.0000, Left Kidney: 0.0000, Esophagus: 0.0000, Liver: 0.0000, Stomach: 0.0000, Aorta: 0.0000, Postcava: 0.0000, Portal Vein and Splenic Vein: 0.0000, Pancreas: 0.0000, Right Adrenal Gland: 0.0000, Left Adrenal Gland: 0.0000,
Have you ever encountered a similar problem?
I hope to receive your reply, thank you!
Here is the training code:
```python
def train(args, train_loader, model, optimizer, loss_seg_DICE, loss_seg_CE):
    model.train()
    loss_bce_ave = 0
    loss_dice_ave = 0
    epoch_iterator = tqdm(
        train_loader, desc="Training (X / X Steps) (loss=X.X)", dynamic_ncols=True
    )
    for step, batch in enumerate(epoch_iterator):
        x, y, name = batch["image"].to(args.device), batch["post_label"].float().to(args.device), batch['name']
        torch.cuda.empty_cache()
        with torch.cuda.amp.autocast():
            logit_map = model(x)
            torch.cuda.empty_cache()

        term_seg_Dice = loss_seg_DICE.forward(logit_map, y, name, TEMPLATE)
        term_seg_BCE = loss_seg_CE.forward(logit_map, y, name, TEMPLATE)
        loss = term_seg_BCE + term_seg_Dice
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        epoch_iterator.set_description(
            "Epoch=%d: Training (%d / %d Steps) (dice_loss=%2.5f, bce_loss=%2.5f)" % (
                args.epoch, step, len(train_loader), term_seg_Dice.item(), term_seg_BCE.item())
        )
        loss_bce_ave += term_seg_BCE.item()
        loss_dice_ave += term_seg_Dice.item()
        torch.cuda.empty_cache()
    print('Epoch=%d: ave_dice_loss=%2.5f, ave_bce_loss=%2.5f' % (args.epoch, loss_dice_ave/len(epoch_iterator), loss_bce_ave/len(epoch_iterator)))

    return loss_dice_ave/len(epoch_iterator), loss_bce_ave/len(epoch_iterator)
```

Question about the hyper-parameters for MSD

For MSD datasets such as Task10, it seems the crop size would be [96, 96, 96]. However, I also noticed that the annotation at https://github.com/ljwztc/CLIP-Driven-Universal-Model/blob/c8e829eee7769fbc3120b9fe7687bb73402dfc87/dataset/dataloader.py#L260C79-L260C81 is 192, 192, 64. Which is the correct size to feed the network? Also, could you provide some logs containing the training hyper-parameters, or more details about them, especially for MSD Task10, 06, and 07? Thank you.

training on parts of the datasets

Hello, I want to train on only some of the datasets in PAOT.txt, such as 01_Multi-Atlas_Labeling, 02_TCIA_Pancreas-CT, and 03_CHAOS. May I ask how to modify label_transfer.py? What should TRANSFER_LIST be? TRANSFER_LIST = ['01', '02', '03']? And do I need to modify num_classes? If so, what should I change it to?
I tried to train on the BTCV dataset, but my dice loss and BCE loss did not decrease. label_transfer.py is set like this:
(screenshot of the settings omitted)

I hope to receive your guidance, thank you!

Organ list

Hello, I would like to ask some questions about training on a new dataset. Can I reassign organ index 1 to another organ? For example, index 1 is originally the spleen, but I would like to define it as the left atrium.

05_KiTS which ground truth labels?

Hi, thanks for the great work!

Could you please tell us which labels you used for the KiTS dataset? In the ground truth, there are three different aggregations:

aggregated_AND_seg.nii.gz
aggregated_MAJ_seg.nii.gz
aggregated_OR_seg.nii.gz

01 BTCV dataset labels for testing subjects

Hi, from the dataset list in PAOT.txt I noticed that for the BTCV dataset, you provided directories for the labels that correspond to the testing subjects (img0061 to img0080). Can I check whether these labels are generated using the pred_pseudo.py file, or if there is another way to access these labels? Thank you!

Question about Table 4

Hi, @ljwztc @MrGiovanni

Thanks for providing such nice work. May I kindly ask two questions:

  1. How do you get the testing images/labels of LiTS and KiTS?
  2. How do you evaluate the performance reported in Table 4?

Best.

How do you use the CLIP model?

Thank you very much for your work. I have some doubts. Where is the text input? The input of the model's forward is an image, and I did not find where different text features are supplied for images of different body parts.

CUDA out of memory inference half way

When testing the second image, I reported an error
33%|████████████████████████ | 1/3 [01:33<03:06, 93.34s/it]
Traceback (most recent call last):
File "mytest.py", line 223, in
main()
File "mytest.py", line 219, in main
validation(model, test_loader, val_transforms, args)
File "mytest.py", line 55, in validation
pred = sliding_window_inference(image, (args.roi_x, args.roi_y, args.roi_z), 1, model, overlap=0.5, mode='gaussian')
File "/python3.8/site-packages/monai/inferers/utils.py", line 215, in sliding_window_inference
output_image_list.append(torch.zeros(output_shape, dtype=compute_dtype, device=device))
RuntimeError: CUDA out of memory. Tried to allocate 2.70 GiB (GPU 0; 11.91 GiB total capacity; 8.87 GiB already allocated; 2.31 GiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I tried with torch.no_grad() and so on, but it is not solved.
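
One common mitigation, sketched below under the assumption that the surrounding snippet uses MONAI's sliding_window_inference: keep the windows on the GPU but aggregate the stitched full-size output on the CPU via the device/sw_device arguments, and lower the overlap so fewer windows are produced. The variable names (image, args, model) follow the traceback above.

```python
import torch
from monai.inferers import sliding_window_inference

with torch.no_grad():
    pred = sliding_window_inference(
        image,                                  # input volume from the test loader
        (args.roi_x, args.roi_y, args.roi_z),   # patch size
        1,                                      # sw_batch_size
        model,
        overlap=0.25,                           # lower overlap -> fewer windows
        mode="gaussian",
        sw_device=torch.device("cuda"),         # run each window on the GPU
        device=torch.device("cpu"),             # stitch the full-size output on the CPU
    )
```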

MSD Leaderboard

Hi,

really great work. I just have a short question: how did you obtain the MRI dataset results on the MSD leaderboard?

Best
Constantin

Question about MSD dataset 10_07

Hi, I would like to ask about utils.py and label_transfer.py.
Your template has 32 classes, but I only need two of them: Pancreas and Pancreas Tumor.
Besides changing the indices, which of the following do I also need to modify?
NUM_CLASS
TEMPLATE
POST_TUMOR_DICT
MERGE_MAPPING_v1
MERGE_MAPPING_v2
TUMOR_ORGAN
the indices inside the functions, e.g.
organ
post_pred_mask
dataset_index
organ_list

Error when validation and test

When running validation or test, the following error is shown:
RuntimeError: Error(s) in loading state_dict for Universal_model:
Unexpected key(s) in state_dict: "swinViT.patch_embed.proj.weight", "swinViT.patch_embed.proj.bias", "swinViT.layers1.0.blocks.0.norm1.weight", "swinViT.layers1.0.blocks.0.norm1.bias", "swinViT.layers1.0.blocks.0.attn.relative_position_bias_table", "swinViT.layers1.0.blocks.0.attn.relative_position_index", "swinViT.layers1.0.blocks.0.attn.qkv.weight", ... (the full list continues through every block of swinViT.layers1 to swinViT.layers4, the encoder1/2/3/4/10 and decoder1 to decoder5 convolution weights, and ends with "0.weight", "0.bias", "2.weight", "2.bias", "3.weight", "3.bias", "weight", "bias").

Dataset

Hello, I find your work very interesting, and I encountered some problems while reproducing it. I noticed that both the paper and the txt files in the dataset_list directory mention a total of 14 datasets, but there seem to be links to only 12 datasets on this README page. Also, the links for labels 08 and 12 are duplicated. Apart from one of your private datasets, should there be links for two more datasets? I would greatly appreciate your help with this.

Clip embedding network.

Is the CLIP text encoder frozen and pretrained as in the original CLIP paper?
If so, considering CLIP is trained on general images, can we expect it to embed medical targets/organs correctly?

Train and Test question

Hello,
When training the model I used the BTCV dataset with
max_epoch=2000
store_num=50
warmup_epoch=100
but when I ran the test at the end, the results were very poor:
Spleen: dice 0.0069, recall 0.9436, precision 0.0035.
Right Kidney: dice 0.0000, recall 0.0000, precision 0.0000.
Left Kidney: dice 0.0000, recall 0.0000, precision 0.0000.
Esophagus: dice 0.0000, recall 0.0000, precision 0.0000.
Liver: dice 0.0533, recall 0.9984, precision 0.0274.
Stomach: dice 0.0000, recall 0.0000, precision 0.0000.
Arota: dice 0.0000, recall 0.0000, precision nan.
Postcava: dice 0.0022, recall 0.9912, precision 0.0011.
Portal Vein and Splenic Vein: dice 0.0006, recall 1.0000, precision 0.0003.
Pancreas: dice 0.0000, recall 0.0000, precision 0.0000.
Right Adrenal Gland: dice 0.0000, recall 0.0000, precision nan.
Left Adrenal Gland: dice 0.0000, recall 0.0000, precision 0.0000.
case01_Multi-Atlas_Labeling/label/label0023| Spleen: 0.0069, Right Kidney: 0.0000, Left Kidney: 0.0000, Esophagus: 0.0000, Liver: 0.0533, Stomach: 0.0000, Arota: 0.0000, Postcava: 0.0022, Portal Vein and Splenic Vein: 0.0006, Pancreas: 0.0000, Right Adrenal Gland: 0.0000, Left Adrenal Gland: 0.0000,
I would like to ask why this happens.

Confuse about word embedding

I am trying to customize this code for my own datasets, but I found that the word_embedding is fixed by loading txt_encoding.pth in the training script.
Is that right? For my own customization, should I modify this .pth file? Is there a good way to train or produce it? I ran into a similar question in the test script.

Thanks for any help.
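
If you need embeddings for your own organ list, a minimal sketch using the openai/CLIP package is shown below. ViT-B/32 outputs 512-dimensional text features; how the repository projects these to the dimensions stored in txt_encoding.pth is not shown here, so treat the prompt template, the class names, and the output file name as assumptions for illustration only.

```python
import clip
import torch

organs = ["Spleen", "Right Kidney", "Left Kidney", "My New Organ"]  # your own class names
device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

prompts = [f"A computerized tomography of a {name}" for name in organs]
with torch.no_grad():
    tokens = clip.tokenize(prompts).to(device)
    text_features = model.encode_text(tokens)      # shape: [len(organs), 512]

torch.save(text_features.cpu(), "my_txt_encoding.pth")  # hypothetical output file name
```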

Inference with pre-trained model?

I am trying to run inference on an external dataset using the pre-trained weights. However, both test.py and pred_pseudo.py refer to the resume checkpoint ./out/Nvidia/old_fold0/aepoch_500.pth.

Is this on purpose or are you planning to upload the checkpoint?

Training time

Hi,

Thank you for your work.

Could you please provide more details regarding the training time?

Best regards,
Abdelrahman

Inference with trained model

Hi, this is really good work.
I just wonder how to use a trained model to test my own dataset, since I don't want to retrain the whole model.
Besides, can the model accept a text input such as the shape and location of the target object, or only a prompt saying that this is a xxx?

Table 3. Benchmark on BTCV validation dataset

Hi, for the benchmark in Table 3 of your paper, I would like to know how many epochs you used to train the different models. Since the performance of SwinUNETR is similar to the one in the official repository, I suppose you used 5000 epochs for each model. Thanks in advance for the answer.

Question about 02 TCIA Pancreas Dataset

Hi, I am trying to replicate your code and am downloading the datasets. However, when I arranged the datasets following dataset/dataset_list/PAOT.txt, the 02 Pancreas-CT (TCIA) data came as DICOM images of each slice. Can I check whether you have any preprocessing code to convert the DICOM format into .nii.gz? Thank you!

Question about AbdomenCT-1K

Thanks for your great work.
I am collecting the AbdomenCT-1K dataset for "08_AbdomenCT-1K" and "13_AbdomenCT-12organ".
I can find 722 cases in link and 50 cases in link, but your paper refers to 1k cases, and PAOT.txt lists Case_00001_0000 to Case_01062_0000 for 08_AbdomenCT-1K.
So how do I get the 1k cases for 08_AbdomenCT-1K?

Thanks.

Not able to pre process the dataset 04 LiTS

python -W ignore label_transfer.py
train len 131
Traceback (most recent call last):
  File "label_transfer.py", line 290, in <module>
    for index, batch in enumerate(train_loader):
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1224, in _next_data
    return self._process_data(data)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 89, in apply_transform
    return _apply_transform(transform, data, unpack_items)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 53, in _apply_transform
    return transform(parameters)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/io/dictionary.py", line 131, in __call__
    data = self.loader(d[key], reader)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/io/array.py", line 213, in __call__
    img = reader.read(filename)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/data/image_reader.py", line 421, in read
    img = nib.load(name, **kwargs_)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/nibabel/loadsave.py", line 115, in load
    raise ImageFileError(msg)
nibabel.filebasedimages.ImageFileError: File /home/amit_g/scr/datasets/clip/04_LiTS/label/liver_0.nii.gz is not a gzip file

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 89, in apply_transform
    return _apply_transform(transform, data, unpack_items)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 53, in _apply_transform
    return transform(parameters)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/compose.py", line 173, in __call__
    input_ = apply_transform(_transform, input_, self.map_items, self.unpack_items, self.log_stats)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 113, in apply_transform
    raise RuntimeError(f"applying transform {transform}") from e
RuntimeError: applying transform <monai.transforms.io.dictionary.LoadImaged object at 0x7fdab8d054f0>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/data/dataset.py", line 97, in __getitem__
    return self._transform(index)
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/data/dataset.py", line 83, in _transform
    return apply_transform(self.transform, data_i) if self.transform is not None else data_i
  File "/home/amit_g/scratch/env/clip/lib/python3.8/site-packages/monai/transforms/transform.py", line 113, in apply_transform
    raise RuntimeError(f"applying transform {transform}") from e
RuntimeError: applying transform <monai.transforms.compose.Compose object at 0x7fdab746bfd0>

question about implementation details

(screenshot of the paper's implementation details omitted)
Hi,
In your paper you said that you use a batch size of 6 with a patch size of 96 × 96 × 96 per NVIDIA RTX A5000 device, which has 24 GB of memory. But when I use a batch size of 2 with a patch size of 96 × 96 × 96 on an RTX A5000, I encounter CUDA out of memory. I tried three backbones (U-Net, SwinUNETR, UNet++) and hit the same issue.

preprocessing for SwinUnetr backbone

Impressive work!

May I ask what image size you crop to for the Swin UNETR backbone?
In the code, the default backbone is U-Net and the crop size is 192, 192, 64, which seems intended for the 3D U-Net; does it also fit Swin UNETR?
I tried this size for Swin UNETR following the NVIDIA GitHub repo, and it shows some weird results and does not converge.

Would you mind providing the training code or preprocessing code for the Swin UNETR backbone?

BTCV benchmark dataset partition

Hi, first of all, I appreciate your work.
In Table 3 of your paper there is a benchmark on the BTCV validation set. Is it possible to share the train/validation split you used?
Or did you use 5-fold cross-validation on the 30 available CT scans and report the results of the best fold, or the average per-organ validation accuracy across the five folds?
Thanks in advance for the answer.

Universal Pretrained Weight

Hello, I want to check whether the pretrained weight at https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/swin_unetr.base_5000ep_f48_lr2e-4_pretrained.pt is the Universal Model pretrained weight.

For the results of BTCV in the reproduction paper

Regarding reproducing the BTCV results in the paper, I have some questions. For the 5-fold cross-validation you mentioned, is it done by running train.py and val.py five times, according to the split in BTCV.json? Looking forward to your reply.

Code for tumor detection

It seems the code in test.py is for segmentation. Can you release the code for tumor detection?
