
medical-sam-adapter's Introduction

Medical SAM Adapter


Medical SAM Adapter, or MSA for short, is a project to fine-tune SAM using Adaption for medical imaging. The method is elaborated in the paper Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation.

A Quick Overview

News

  • [TOP] Join our Discord to ask questions and discuss with others.
  • [TOP] 24-03-02 We have released our pre-trained Adapters in the Medical-Adapter-Zoo. Try them without painful training 😉 Credit: @shinning0821
  • 23-05-10. This project is still updating quickly 🌝. Check the TODO list to see what will be released next.
  • 23-05-11. GitHub Discussions opened. You can now talk, code, and make friends on the playground 👨‍❤️‍👨.
  • 23-12-22. Released the data loader and an example case for the REFUGE dataset. Credit: @jiayuanz3
  • 24-01-04. Released the Efficient Med-SAM-Adapter❗️ A new, faster, and more lightweight version incorporating Meta's EfficientSAM🏇. Full credit goes to @shinning0821.
  • 24-01-07. The image resolution can now be resized via -image_size. Credit: @shinning0821
  • 24-01-11. Added a detailed guide on using the Efficient Med-SAM-Adapter, complete with a comparison of performance and speed. You can find this resource in guidance/efficient_sam.ipynb. Credit: @shinning0821
  • 24-01-14. We've just launched our first official version, v0.1.0-alpha 🥳. This release includes support for MobileSAM, which can be activated by setting -net mobile_sam. Additionally, you now have the flexibility to use ViT, Tiny ViT, and Efficient ViT as encoders. Check the details here. Credit: @shinning0821
  • 24-01-20. Added a guide on using MobileSAM in Med-SAM-Adapter, with a comparison of performance and speed. You can find it in guidance/mobile_sam.ipynb. Credit: @shinning0821
  • 24-01-21. We've added LoRA to our framework🤖. Use it by setting -mod to sam_lora. A guide can be found here. Credit: @shinning0821
  • 24-01-22. We've added a dataloader for the LIDC dataset, a multi-rater (4 raters 👨‍⚕️🧑🏽‍⚕️👩‍⚕️🧑🏽‍⚕️) lesion-segmentation dataset from low-dose lung CTs 🩻. You can download the preprocessed LIDC dataset here. Also updated the environment and the random_click function. Credit: @jiayuanz3
  • 24-03-06. We've supported multi-class segmentation. Use it by setting -multimask_output to the desired number of classes. Also updated the REFUGE example to two classes (optic disc & cup). Credit: @LJQCN101
  • 24-03-06. We've added support for many other datasets and rebuilt the dataset and dataloader code. See guidance/Dataset.md. Credit: @shinning0821

Medical Adapter Zoo 🐘🐊🦍🦒🦨🦜🦥

We've released a bunch of pre-trained Adapters for various organs/lesions in Medical-Adapter-Zoo. Just pick the adapter that matches your disease and easily adjust SAM to suit your specific needs 😉.

If you can't find what you're looking for, please suggest it through any contact method available to us (GitHub issues, the HuggingFace community, or Discord). We'll do our very best to include it.

Requirements

Install the environment:

conda env create -f environment.yml

conda activate sam_adapt

Then download the SAM checkpoint and put it in ./checkpoint/sam/.

You can run:

wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth

mkdir -p ./checkpoint/sam

mv sam_vit_b_01ec64.pth ./checkpoint/sam/

Example Cases

Melanoma Segmentation from Skin Images (2D)

  1. Download ISIC dataset part 1 from https://challenge.isic-archive.com/data/, then put the csv files under "./data/isic" in your data path. Your dataset folder under "your_data_path" should look like:

    ISIC/
      ISBI2016_ISIC_Part1_Test_Data/...
      ISBI2016_ISIC_Part1_Training_Data/...
      ISBI2016_ISIC_Part1_Test_GroundTruth.csv
      ISBI2016_ISIC_Part1_Training_GroundTruth.csv

    You can find the csv files here.

  2. Begin adapting! Run:

    python train.py -net sam -mod sam_adpt -exp_name msa_test_isic -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset isic -data_path ../data

    Change "data_path" to your own data path, and set "exp_name" to anything you like.

You can decrease the image size or the batch size b if you run out of memory.

  3. Evaluation: the code automatically evaluates the model on the test set during training; set "--val_freq" to control how often (in epochs) to evaluate. You can also run val.py for an independent evaluation (see the example below).

  4. Result visualization: set the "--vis" parameter to control how often (in epochs) to visualize results during training or evaluation.

By default, everything will be saved at ./logs/.
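For instance, an independent evaluation run might look like the following. This is a sketch only: it assumes val.py accepts the same flags as train.py, and the paths are illustrative; check your local copy.

python val.py -net sam -mod sam_adpt -exp_name msa_test_isic -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset isic -data_path ../data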

REFUGE: Optic-disc Segmentation from Fundus Images (2D)

The REFUGE dataset contains 1200 fundus images with optic disc/cup segmentations and clinical glaucoma labels.

  1. Download the dataset manually from here, or use the command line:

git lfs install

git clone [email protected]:datasets/realslimman/REFUGE-MultiRater

Unzip and move the dataset to the target folder:

unzip ./REFUGE-MultiRater.zip

mv REFUGE-MultiRater ./data

  2. For training the adapter, run:

    python train.py -net sam -mod sam_adpt -exp_name REFUGE-MSAdapt -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset REFUGE -data_path ./data/REFUGE-MultiRater

    You can set "exp_name" to anything you like.

You can decrease the image size or the batch size b if you run out of memory.
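Per the 24-03-06 news item, the REFUGE example now covers two classes (optic disc & cup). A plausible multi-class invocation, assuming -multimask_output simply takes the class count as described in the News section:

python train.py -net sam -mod sam_adpt -exp_name REFUGE-MSAdapt -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset REFUGE -data_path ./data/REFUGE-MultiRater -multimask_output 2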

Abdominal Multiple Organs Segmentation (3D)

This tutorial demonstrates how MSA can adapt SAM to a 3D multi-organ segmentation task using the BTCV challenge dataset. For the BTCV dataset, 50 abdominal CT scans were randomly selected, under Institutional Review Board (IRB) supervision, from a combination of an ongoing colorectal cancer chemotherapy trial and a retrospective ventral hernia study. The 50 scans were captured during the portal venous contrast phase with variable volume sizes (512 x 512 x 85 to 512 x 512 x 198) and fields of view (approx. 280 x 280 x 280 mm3 to 500 x 500 x 650 mm3). The in-plane resolution varies from 0.54 x 0.54 mm2 to 0.98 x 0.98 mm2, while the slice thickness ranges from 2.5 mm to 5.0 mm.

Target: 13 abdominal organs, including spleen, right kidney, left kidney, gallbladder, esophagus, liver, stomach, aorta, IVC, portal and splenic veins, pancreas, right adrenal gland, and left adrenal gland.
Modality: CT
Size: 30 3D volumes (24 training + 6 testing)
Challenge: BTCV MICCAI Challenge

[Figure: image patches with the organ sub-regions annotated in the CT (top left) and the final labels for the whole dataset (right).]

  1. Prepare the BTCV dataset following the MONAI instructions: download it from https://www.synapse.org/#!Synapse:syn3193805/wiki/217752. After you open the link, navigate to the "Files" tab, then download Abdomen/RawData.zip. After downloading, unzip it. Then put the images from RawData/Training/img in ../data/imagesTr and the labels from RawData/Training/label in ../data/labelsTr. Download the json file for the data splits from this link and place it at ../data/dataset_0.json.
  2. For the Adaptation, run: python train.py -net sam -mod sam_adpt -exp_name msa-3d-sam-btcv -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 8 -dataset decathlon -thd True -chunk 96 -data_path ../data -num_sample 4
    You can modify the following parameters to reduce memory usage: '-b', the batch size; '-chunk', the 3D depth (channels) of each sample; '-num_sample', the number of samples for Monai.RandCropByPosNegLabeld; and '-evl_chunk', the 3D channel split step in evaluation (decrease it if you run out of memory during evaluation). An illustrative lower-memory command is shown below.
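For instance, a lower-memory variant of the command above might look like this (the values are illustrative; tune them to your GPU):

python train.py -net sam -mod sam_adpt -exp_name msa-3d-sam-btcv -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 2 -dataset decathlon -thd True -chunk 48 -data_path ../data -num_sample 2 -evl_chunk 48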

Run on your own dataset

It is simple to run MSA on other datasets. Just write another dataset class following the examples in ./dataset.py. You only need to make sure you return a dict with:

  • 'image': a tensor holding the image, of size [C, H, W] for a 2D image and [C, H, W, D] for 3D data. D is the depth of the 3D volume; C is the number of channels of a scan/frame, which is commonly 1 for CT, MRI, and US data. For, say, a color surgical video, D would be the number of time frames and C would be 3 for an RGB frame.
  • 'label': the target masks, the same size as the images except for the resolution (H and W).
  • 'p_label': the prompt label deciding between positive/negative prompts. To simplify, you can always set it to 1 if you don't need the negative-prompt function.
  • 'pt': the prompt, in the same format as in SAM; e.g., a click prompt should be [x of click, y of click], with one click per scan/frame when using 3D data.
  • 'image_meta_dict': optional. If you want to save/visualize the results, put the name of the image in it under the key 'filename_or_obj'.
  • ...(others as you want)

A minimal example is sketched below. Welcome to open issues if you meet any problem. It would be appreciated if you could contribute your dataset extensions. Unlike natural images, medical images vary a lot depending on the task. Expanding the generalization of a method requires everyone's efforts.
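As a concrete illustration, here is a minimal 2D dataset sketch that returns the dict described above. It is a sketch only: the MyDataset name and the images/ and masks/ folder layout are made up for this example, and a center-of-mask click stands in for the repo's random_click utility.

import os

import torch
from PIL import Image
from torch.utils.data import Dataset


class MyDataset(Dataset):
    """Minimal 2D example returning the dict MSA expects (illustrative only)."""

    def __init__(self, data_path, transform=None, transform_msk=None):
        self.data_path = data_path
        self.transform = transform          # image transform -> tensor [C, H, W]
        self.transform_msk = transform_msk  # mask transform  -> tensor [1, H, W]
        self.names = sorted(os.listdir(os.path.join(data_path, 'images')))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, index):
        name = self.names[index]
        img = Image.open(os.path.join(self.data_path, 'images', name)).convert('RGB')
        msk = Image.open(os.path.join(self.data_path, 'masks', name)).convert('L')
        if self.transform:
            img = self.transform(img)
        if self.transform_msk:
            msk = self.transform_msk(msk)
        # One positive click at the mask centroid, standing in for random_click.
        ys, xs = torch.nonzero(msk.squeeze(0) > 0, as_tuple=True)
        if len(xs) > 0:
            pt = torch.stack([xs.float().mean(), ys.float().mean()])
        else:  # empty mask: fall back to the image center
            pt = torch.tensor([msk.shape[-1] / 2.0, msk.shape[-2] / 2.0])
        return {
            'image': img,                   # [C, H, W]
            'label': msk,                   # [1, H, W]
            'p_label': 1,                   # positive prompt
            'pt': pt,                       # [x, y] click
            'image_meta_dict': {'filename_or_obj': name},
        }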

TODO LIST

  • Jupyter tutorials.
  • Fix bugs in BTCV. Add BTCV example.
  • Release REFUGE2, BraTs dataloaders and examples
  • Changeable Image Resolution
  • Fix bugs in Multi-GPU parallel
  • Sample and Vis in training
  • Release general data pre-processing and post-processing
  • Release evaluation
  • Deploy on HuggingFace
  • configuration
  • Release SSL code
  • Release Medical Adapter Zoo

Cite

@misc{wu2023medical,
     title={Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation}, 
     author={Junde Wu and Wei Ji and Yuanpei Liu and Huazhu Fu and Min Xu and Yanwu Xu and Yueming Jin},
     year={2023},
     eprint={2304.12620},
     archivePrefix={arXiv},
     primaryClass={cs.CV}
}

Buy Me A Coffee 🥤😉

https://ko-fi.com/jundewu

medical-sam-adapter's People

Contributors

dzenanz · jiayuanz3 · jiwei0921 · ljqcn101 · shinning0821 · wujunde


medical-sam-adapter's Issues

TypeError: unsupported operand type(s) for %: 'int' and 'NoneType'

It's giving me this error after training.

Traceback (most recent call last):
File "C:\Users\cb0764\Videos\Downloads\segment-anything-main\MedSAM Adapters\Medical-SAM-Adapter-main\train.py", line 137, in
tol, (eiou, edice) = function.validation_sam(args, nice_test_loader, epoch, net, writer)
File "C:\Users\cb0764\Videos\Downloads\segment-anything-main\MedSAM Adapters\Medical-SAM-Adapter-main\function.py", line 278, in validation_sam
if ind % args.vis == 0:

Experimental results on ISIC dataset

Hi, as mentioned in the paper, the experimental results on the ISIC dataset are listed in the appendix. But there is no appendix section in the preprint paper. Are there any results for the ISIC dataset?

TypeError: 'ThreadDataLoader' object is not subscriptable

[screenshot]
When I got this error, I figured it was caused by this object not supporting subscripting (square brackets), so I changed the following source code:
before:
[screenshot]
after:
[screenshot]

And then I got the following error:
[screenshot]
This error means that the object has a duplicate name, right?
I am looking forward to your reply; your reply is very valuable to me.

cannot import name 'SegDecoderViT' from 'models.SamFeatSeg'

Traceback (most recent call last):
File "train.py", line 27, in
from dataset import *
File "D:\pycharmprogram\Medical-SAM-Adapter-main\dataset.py", line 19, in
from utils import random_click
File "D:\pycharmprogram\Medical-SAM-Adapter-main\utils.py", line 57, in
from models.discriminator import Discriminator
File "D:\anaconda\envs\MedSAM-adapter\lib\site-packages\models_init_.py", line 3, in
from .build_autosam_seg_model import sam_seg_model_registry
File "D:\anaconda\envs\MedSAM-adapter\lib\site-packages\models\build_autosam_seg_model.py", line 5, in
from .SamFeatSeg import SamFeatSeg, SegDecoderViT, SegDecoderCNN
ImportError: cannot import name 'SegDecoderViT' from 'models.SamFeatSeg' (D:\anaconda\envs\MedSAM-adapter\lib\site-packages\models\SamFeatSeg.py)

Training stuck after loading 3d data

[screenshot]

python train.py -net sam -mod sam_adpt -exp_name msa-3d-sam-btcv -sam_ckpt ./sam_vit_b_01ec64.pth -image_size 1024 -b 1 -dataset decathlon -thd True -chunk 1 -data_path /mnt/data/abdomen -num_sample 1

I run the program, but it gets stuck after loading the 3D data. How can I solve this?

self.pos_embed :Dimension mismatch

The shape of the original input image is (256, 256, 600). After the preceding processing, the input x to ImageEncoderViT has shape (600, 3, 256, 256). After self.patch_embed(), the shape of x is (600, 16, 16, 768).

However, the shape of self.pos_embed is (1, 64, 64, 768),

so at line 113, x = x + self.pos_embed cannot be computed.
question

Is this method based on LoRA or adapter tuning?

I am just a little confused when reading the paper.

"Technically, we choose to fine-tune the pre-trained SAM using a parameter-efficient fine-tuning (PEFT) technique called Adaption [18]"

The Adaption here cites LoRA, but it seems that the proposed method is based on adapter tuning. Did I misunderstand anything?

change the input_size

When I change the input image size from 1024 to 256, which configuration options do I need to pay attention to?

Traceback (most recent call last):
train.py", line 116, in
loss = function.train_sam(args, net, optimizer, nice_train_loader, epoch, writer, vis = args.vis)
function.py", line 137, in train_sam
imge = net.image_encoder(imgs)
module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "MSA/models/sam/modeling/image_encoder.py", line 115, in forward
x = x + self.pos_embed
RuntimeError: The size of tensor a (16) must match the size of tensor b (64) at non-singleton dimension 2
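For context, the mismatch follows from the patch arithmetic: SAM ViT-B is pre-trained at 1024 x 1024 with 16 x 16 patches, giving a 64 x 64 grid of positional embeddings, while a 256 input yields only a 16 x 16 token grid (and 512 yields 32 x 32), hence the failure at x = x + self.pos_embed. One common workaround for ViTs, sketched below, is to bicubically interpolate the pre-trained pos_embed to the new grid. This is a generic technique, not code from this repository:

import torch
import torch.nn.functional as F

def resize_pos_embed(pos_embed, new_grid):
    # pos_embed: [1, 64, 64, 768] -> [1, new_grid, new_grid, 768],
    # where new_grid = image_size // patch_size (e.g. 256 // 16 = 16).
    pe = pos_embed.permute(0, 3, 1, 2)  # [1, 768, 64, 64]
    pe = F.interpolate(pe, size=(new_grid, new_grid), mode='bicubic', align_corners=False)
    return pe.permute(0, 2, 3, 1)

# Usage (illustrative):
# net.image_encoder.pos_embed = torch.nn.Parameter(
#     resize_pos_embed(net.image_encoder.pos_embed.data, 16))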

No of Epochs and Results for BTCV and ISIC Datasets

Hi @WuJunde, could you please include information about the number of epochs you trained the model for on both datasets? The number of epochs in the global settings is set to 30000. I ran up to 100 epochs and wasn't able to match the Dice score of 0.883 for the BTCV dataset. Could you please advise?

GPU memory issue

I am using a GPU with 48 GB of memory. Even with

parser.add_argument('-chunk', type=int, default=5, help='crop volume depth')  # originally 96
parser.add_argument('-num_sample', type=int, default=1, help='sample pos and neg')
parser.add_argument('-roi_size', type=int, default=96, help='resolution of roi')
parser.add_argument('-b', type=int, default=1, help='batch size for dataloader')

I still run out of GPU memory, while parameters that are too low lead to mode collapse with loss = 1.

Suggestion - Integrate MobileSAM into the pipeline for lightweight and faster inference

Reference: https://github.com/ChaoningZhang/MobileSAM

Our project performs on par with the original SAM and keeps exactly the same pipeline as the original SAM except for a change to the image encoder; therefore, it is easy to integrate into any project.

MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. The comparison of the whole pipeline is summarized as follows:

[comparison screenshots]

Best Wishes,

Qiao

About eval_seg

Thanks for your great work!

But I can't understand your evaluation code. Why not test samples one by one and compute the metric for every class?
You also only considered one class in the training.

RuntimeError: CUDA out of memory. Tried to allocate 768.00 MiB (GPU 0; 6.00 GiB total capacity; 3.63 GiB already allocated; 328.06 MiB free; 3.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I set the batch_size to 1 and still get this error when I run the model.
Here is the GPU information of my computer:
[screenshot]
Could you tell me how to fix this error?
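Besides lowering -image_size and -b, the error message itself points at one mitigation: capping the allocator's split size via the PYTORCH_CUDA_ALLOC_CONF environment variable. The values below are illustrative only; a 6 GB card may still be too small at the default 1024 resolution:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python train.py -net sam -mod sam_adpt -exp_name msa_test_isic -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 512 -b 1 -dataset isic -data_path ../data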

Dataset path

[screenshot]
Is it like this?

./data/isic/ISBI2016_ISIC_Part1_Test_Data/ISIC_0000169
./data/isic/ISBI2016_ISIC_Part1_Test_GroundTruth/ISIC_0000169
./data/isic/ISBI2016_ISIC_Part1_Training_Data/ISIC_0000169
./data/isic/ISBI2016_ISIC_Part1_Training_GroundTruth/ISIC_0000169

Can't get satisfactory results on kidney tumor segmentation

Hi, thank you for the great work! I trained the MedSAM-adapter on 2D 256x256 CT images from the Kidney Tumor Segmentation (KiTS) dataset for 50 epochs; the best model was achieved at epoch 11. But the DSC score on the testing data is no more than 0.7, whereas it is 0.76 with MedSAM. What should I do for this dataset? Should I reduce the size of the model?

By the way, I have changed the proj layer to fit the size of my dataset as follows:

net.image_encoder.patch_embed.proj = nn.Conv2d(
    3, 768, kernel_size=4, stride=4, padding=0
)

Training Failed

I am trying to train using the ISIC data, and I used the following command:

python3 train.py -net sam -mod sam_adpt -exp_name msa_test_isic -sam_ckpt ./checkpoints/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset isic -data_path data/isic/

I got the following error; could you let me know how to fix it?

Traceback (most recent call last):
File "/media/sandeep/MySSD2/Medical-SAM-Adapter/train.py", line 96, in
isic_train_dataset = ISIC2016(args, args.data_path, transform = transform_train, transform_msk= transform_train_seg, mode = 'Training')
File "/media/sandeep/MySSD2/Medical-SAM-Adapter/dataset.py", line 30, in init
self.label_list = df.iloc[:,2].tolist()
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 925, in getitem
return self._getitem_tuple(key)
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 1506, in _getitem_tuple
self._has_valid_tuple(tup)
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 754, in _has_valid_tuple
self._validate_key(k, i)
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 1409, in _validate_key
self._validate_integer(key, axis)
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 1500, in _validate_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds

pretrain

I am very interested in the SSL method you mention in the paper. Do you have plans to release the pre-training code?
Could you also share the details of the encoder pre-training?

about Iteration Click

Thanks!

Your paper mentions an optimization of the click strategy, mainly distinguishing the first click and performing iterative clicks.

I think that's a cool idea which will definitely help me greatly improve my accuracy, but I can't find the code for it... Your dataset generates clicks randomly, and the generate_click_prompt function in the utils package runs only when the data is first loaded.

Where is the code that iteratively generates click prompts based on the loss? Thanks again!

image size 512 : RuntimeError: The size of tensor a (32) must match the size of tensor b (64) at non-singleton dimension 2

115             x = x + self.pos_embed
116
117             for blk in self.blocks:
118                 x = blk(x)
RuntimeError: The size of tensor a (32) must match the size of tensor b (64) at non-singleton dimension 2

Out of Memory

What I use is a single A100 (80 GB), which is far from enough with chunk 96. May I ask what configuration you trained on?
Also, the validation code seems incomplete.

GPU memory

I'd like to know the GPU memory cost when setting '-chunk' to 96 (the 3D depth) with batch size 1. Thanks!

About instance segmentation

Hi, thanks for sharing.
I wonder: with only one point as a prompt, can MSA segment 13 organs at the same time (for example, on the BTCV dataset)? I mean, maybe MSA can only work on one segmentation target class, so instance segmentation is not possible right now?
Regarding the BTCV dataset, do you handle each organ separately, segmenting only one type of organ at a time rather than all 13 organ types at once?

3d dataset

Has anyone downloaded the 3D dataset and run it successfully?
I cannot find Abdomen/RawData.zip.
[screenshot]

OutOfMemoryError: CUDA out of memory. Tried to allocate 24.00 GiB (GPU 0; 47.99 GiB total capacity; 38.75 GiB already allocated; 5.85 GiB free; 39.53 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Could somebody please explain how to reduce the batch size for the 2D SAM adapter? I could not find an option to train with multiple GPUs, and the code does not seem to have options to lower the batch size or the number of epochs.

About the total score in evaluation

I found that during testing a total score is output. Intuitively, the larger the score the better, but in your code logic tot is the total loss, so the best checkpoint should correspond to a smaller score. Why did you set it up this way?

ValueError: high <= 0

The following error occurs when I use my own dataset for training. Can you please tell me what is causing it?
[screenshot]

Text prompt

Hi, thanks for sharing. I haven't seen any code for the text prompts. Did I miss something, or have you just not released it yet?

Thanks!

Train for AMOS

Is the training process for the AMOS dataset the same as for the ISIC dataset?
