
medical-sam-adapter's Introduction

Medical SAM Adapter


Medical SAM Adapter, or MSA for short, is a project to fine-tune SAM using Adaption for medical imaging. The method is elaborated in the paper Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation.

A Quick Overview

News

  • [TOP] Join our Discord to ask questions and discuss with others.
  • [TOP] 24-03-02 We have released our pre-trained Adapters in the Medical-Adapter-Zoo. Try them without painful training 😉 Credit: @shinning0821
  • 23-05-10. This project is still updating quickly 🌝. Check the TODO list to see what will be released next.
  • 23-05-11. GitHub Discussions opened. You can now talk, code, and make friends on the playground 👨‍❤️‍👨.
  • 23-12-22. Released the data loader and an example case for the REFUGE dataset. Credit: @jiayuanz3
  • 24-01-04. Released the Efficient Med-SAM-Adapter❗️ A new, faster, and more lightweight version incorporating Meta's EfficientSAM🏇. Full credit goes to @shinning0821.
  • 24-01-07. The image resolution can now be resized via -image_size. Credit: @shinning0821
  • 24-01-11. Added a detailed guide on using the Efficient Med-SAM-Adapter, complete with a comparison of performance and speed. You can find this resource in guidance/efficient_sam.ipynb. Credit: @shinning0821
  • 24-01-14. We've just launched our first official version, v0.1.0-alpha 🥳. This release includes support for MobileSAM, which can be activated by setting -net mobile_sam. Additionally, you now have the flexibility to use ViT, Tiny ViT, and Efficient ViT as encoders. Check the details here. Credit: @shinning0821
  • 24-01-20. Added a guide on using MobileSAM in Med-SAM-Adapter, with a comparison of performance and speed. You can find it in guidance/mobile_sam.ipynb. Credit: @shinning0821
  • 24-01-21. We've added LoRA to our framework🤖. Use it by setting -mod to sam_lora. A guide can be found here. Credit: @shinning0821
  • 24-01-22. We've added a dataloader for the LIDC dataset, a multi-rater (4 raters 👨‍⚕️🧑🏽‍⚕️👩‍⚕️🧑🏽‍⚕️) lesion-segmentation dataset from low-dose lung CTs 🩻. You can download the preprocessed LIDC dataset here. Also updated the environment and the random_click function. Credit: @jiayuanz3
  • 24-03-06. We've supported multi-class segmentation. Use it by setting -multimask_output to the desired number of classes. Also updated the REFUGE example to two classes (optic disc & cup). Credit: @LJQCN101
  • 24-03-06. We've added support for many other datasets and rebuilt the dataset and dataloader code. See guidance/Dataset.md. Credit: @shinning0821

Medical Adapter Zoo 🐘🐊🦍🦒🦨🦜🦥

We've released a bunch of pre-trained Adapters for various organs/lesions in Medical-Adapter-Zoo. Just pick the adapter that matches your disease and easily adjust SAM to suit your specific needs 😉.

If you can't find what you're looking for, please suggest it through any contact method available to us (GitHub issues, the HuggingFace community, or Discord). We'll do our very best to include it.

Requirements

Install the environment:

conda env create -f environment.yml

conda activate sam_adapt

Then download the SAM checkpoint and put it in ./checkpoint/sam/.

You can run:

wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth

mkdir -p ./checkpoint/sam

mv sam_vit_b_01ec64.pth ./checkpoint/sam/

Example Cases

Melanoma Segmentation from Skin Images (2D)

  1. Download ISIC dataset part 1 from https://challenge.isic-archive.com/data/, then put the csv files under "./data/isic" in your data path. Your dataset folder under "your_data_path" should look like:

    ISIC/
      ISBI2016_ISIC_Part1_Test_Data/...
      ISBI2016_ISIC_Part1_Training_Data/...
      ISBI2016_ISIC_Part1_Test_GroundTruth.csv
      ISBI2016_ISIC_Part1_Training_GroundTruth.csv

    You can find the csv files here.

  2. Begin adapting! Run:

    python train.py -net sam -mod sam_adpt -exp_name msa_test_isic -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset isic -data_path ../data

    Change "data_path" to your own data path, and set "exp_name" to anything you like.

You can decrease the image size or the batch size b if you run out of memory.

  3. Evaluation: the code automatically evaluates the model on the test set during training; set "--val_freq" to control how often (in epochs) to evaluate. You can also run val.py for an independent evaluation (see the example below).

  4. Result visualization: set the "--vis" parameter to control how often (in epochs) to visualize results during training or evaluation.

By default, everything will be saved at ./logs/.
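For instance, an independent evaluation run might look like the following. This is a sketch only: it assumes val.py accepts the same flags as train.py, and the paths are illustrative; check your local copy.

python val.py -net sam -mod sam_adpt -exp_name msa_test_isic -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset isic -data_path ../data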

REFUGE: Optic-disc Segmentation from Fundus Images (2D)

The REFUGE dataset contains 1200 fundus images with optic disc/cup segmentations and clinical glaucoma labels.

  1. Download the dataset manually from here, or use the command line:

git lfs install

git clone [email protected]:datasets/realslimman/REFUGE-MultiRater

Unzip and move the dataset to the target folder:

unzip ./REFUGE-MultiRater.zip

mv REFUGE-MultiRater ./data

  2. For training the adapter, run:

    python train.py -net sam -mod sam_adpt -exp_name REFUGE-MSAdapt -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset REFUGE -data_path ./data/REFUGE-MultiRater

    You can set "exp_name" to anything you like.

You can decrease the image size or the batch size b if you run out of memory.
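Per the 24-03-06 news item, the REFUGE example now covers two classes (optic disc & cup). A plausible multi-class invocation, assuming -multimask_output simply takes the class count as described in the News section:

python train.py -net sam -mod sam_adpt -exp_name REFUGE-MSAdapt -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset REFUGE -data_path ./data/REFUGE-MultiRater -multimask_output 2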

Abdominal Multiple Organs Segmentation (3D)

This tutorial demonstrates how MSA can adapt SAM to a 3D multi-organ segmentation task using the BTCV challenge dataset. For the BTCV dataset, 50 abdominal CT scans were randomly selected, under Institutional Review Board (IRB) supervision, from a combination of an ongoing colorectal cancer chemotherapy trial and a retrospective ventral hernia study. The 50 scans were captured during the portal venous contrast phase with variable volume sizes (512 x 512 x 85 to 512 x 512 x 198) and fields of view (approx. 280 x 280 x 280 mm3 to 500 x 500 x 650 mm3). The in-plane resolution varies from 0.54 x 0.54 mm2 to 0.98 x 0.98 mm2, while the slice thickness ranges from 2.5 mm to 5.0 mm.

Target: 13 abdominal organs, including spleen, right kidney, left kidney, gallbladder, esophagus, liver, stomach, aorta, IVC, portal and splenic veins, pancreas, right adrenal gland, and left adrenal gland.
Modality: CT
Size: 30 3D volumes (24 training + 6 testing)
Challenge: BTCV MICCAI Challenge

[Figure: image patches with the organ sub-regions annotated in the CT (top left) and the final labels for the whole dataset (right).]

  1. Prepare the BTCV dataset following the MONAI instructions: download it from https://www.synapse.org/#!Synapse:syn3193805/wiki/217752. After you open the link, navigate to the "Files" tab, then download Abdomen/RawData.zip. After downloading, unzip it. Then put the images from RawData/Training/img in ../data/imagesTr and the labels from RawData/Training/label in ../data/labelsTr. Download the json file for the data splits from this link and place it at ../data/dataset_0.json.
  2. For the Adaptation, run: python train.py -net sam -mod sam_adpt -exp_name msa-3d-sam-btcv -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 8 -dataset decathlon -thd True -chunk 96 -data_path ../data -num_sample 4
    You can modify the following parameters to reduce memory usage: '-b', the batch size; '-chunk', the 3D depth (channels) of each sample; '-num_sample', the number of samples for Monai.RandCropByPosNegLabeld; and '-evl_chunk', the 3D channel split step in evaluation (decrease it if you run out of memory during evaluation). An illustrative lower-memory command is shown below.
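For instance, a lower-memory variant of the command above might look like this (the values are illustrative; tune them to your GPU):

python train.py -net sam -mod sam_adpt -exp_name msa-3d-sam-btcv -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 1024 -b 2 -dataset decathlon -thd True -chunk 48 -data_path ../data -num_sample 2 -evl_chunk 48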

Run on your own dataset

It is simple to run MSA on other datasets. Just write another dataset class following the examples in ./dataset.py. You only need to make sure you return a dict with:

  • 'image': a tensor holding the image, of size [C, H, W] for a 2D image and [C, H, W, D] for 3D data. D is the depth of the 3D volume; C is the number of channels of a scan/frame, which is commonly 1 for CT, MRI, and US data. For, say, a color surgical video, D would be the number of time frames and C would be 3 for an RGB frame.
  • 'label': the target masks, the same size as the images except for the resolution (H and W).
  • 'p_label': the prompt label deciding between positive/negative prompts. To simplify, you can always set it to 1 if you don't need the negative-prompt function.
  • 'pt': the prompt, in the same format as in SAM; e.g., a click prompt should be [x of click, y of click], with one click per scan/frame when using 3D data.
  • 'image_meta_dict': optional. If you want to save/visualize the results, put the name of the image in it under the key 'filename_or_obj'.
  • ...(others as you want)

A minimal example is sketched below. Welcome to open issues if you meet any problem. It would be appreciated if you could contribute your dataset extensions. Unlike natural images, medical images vary a lot depending on the task. Expanding the generalization of a method requires everyone's efforts.
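As a concrete illustration, here is a minimal 2D dataset sketch that returns the dict described above. It is a sketch only: the MyDataset name and the images/ and masks/ folder layout are made up for this example, and a center-of-mask click stands in for the repo's random_click utility.

import os

import torch
from PIL import Image
from torch.utils.data import Dataset


class MyDataset(Dataset):
    """Minimal 2D example returning the dict MSA expects (illustrative only)."""

    def __init__(self, data_path, transform=None, transform_msk=None):
        self.data_path = data_path
        self.transform = transform          # image transform -> tensor [C, H, W]
        self.transform_msk = transform_msk  # mask transform  -> tensor [1, H, W]
        self.names = sorted(os.listdir(os.path.join(data_path, 'images')))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, index):
        name = self.names[index]
        img = Image.open(os.path.join(self.data_path, 'images', name)).convert('RGB')
        msk = Image.open(os.path.join(self.data_path, 'masks', name)).convert('L')
        if self.transform:
            img = self.transform(img)
        if self.transform_msk:
            msk = self.transform_msk(msk)
        # One positive click at the mask centroid, standing in for random_click.
        ys, xs = torch.nonzero(msk.squeeze(0) > 0, as_tuple=True)
        if len(xs) > 0:
            pt = torch.stack([xs.float().mean(), ys.float().mean()])
        else:  # empty mask: fall back to the image center
            pt = torch.tensor([msk.shape[-1] / 2.0, msk.shape[-2] / 2.0])
        return {
            'image': img,                   # [C, H, W]
            'label': msk,                   # [1, H, W]
            'p_label': 1,                   # positive prompt
            'pt': pt,                       # [x, y] click
            'image_meta_dict': {'filename_or_obj': name},
        }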

TODO LIST

  • Jupyter tutorials.
  • Fix bugs in BTCV. Add BTCV example.
  • Release REFUGE2, BraTs dataloaders and examples
  • Changeable Image Resolution
  • Fix bugs in Multi-GPU parallel
  • Sample and Vis in training
  • Release general data pre-processing and post-processing
  • Release evaluation
  • Deploy on HuggingFace
  • configuration
  • Release SSL code
  • Release Medical Adapter Zoo

Cite

@misc{wu2023medical,
     title={Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation}, 
     author={Junde Wu and Wei Ji and Yuanpei Liu and Huazhu Fu and Min Xu and Yanwu Xu and Yueming Jin},
     year={2023},
     eprint={2304.12620},
     archivePrefix={arXiv},
     primaryClass={cs.CV}
}

Buy Me A Coffee 🥤😉

https://ko-fi.com/jundewu

medical-sam-adapter's People

Contributors

dzenanz · jiayuanz3 · jiwei0921 · ljqcn101 · shinning0821 · wujunde


medical-sam-adapter's Issues

TypeError: unsupported operand type(s) for %: 'int' and 'NoneType'

It's giving me this error after training.

Traceback (most recent call last):
File "C:\Users\cb0764\Videos\Downloads\segment-anything-main\MedSAM Adapters\Medical-SAM-Adapter-main\train.py", line 137, in
tol, (eiou, edice) = function.validation_sam(args, nice_test_loader, epoch, net, writer)
File "C:\Users\cb0764\Videos\Downloads\segment-anything-main\MedSAM Adapters\Medical-SAM-Adapter-main\function.py", line 278, in validation_sam
if ind % args.vis == 0:

Experimental results on ISIC dataset

Hi, as mentioned in the paper, the experimental results on the ISIC dataset are listed in the appendix. But there is no appendix section in the preprint paper. Are there any results for the ISIC dataset?

TypeError: 'ThreadDataLoader' object is not subscriptable

[screenshot]
When I got this error, I figured it was caused by this object not supporting subscripting (square brackets), so I changed the following source code:
before:
[screenshot]
after:
[screenshot]

And then I got the following error:
[screenshot]
This error means that the object has a duplicate name, right?
I am looking forward to your reply; your reply is very valuable to me.

cannot import name 'SegDecoderViT' from 'models.SamFeatSeg'

Traceback (most recent call last):
File "train.py", line 27, in
from dataset import *
File "D:\pycharmprogram\Medical-SAM-Adapter-main\dataset.py", line 19, in
from utils import random_click
File "D:\pycharmprogram\Medical-SAM-Adapter-main\utils.py", line 57, in
from models.discriminator import Discriminator
File "D:\anaconda\envs\MedSAM-adapter\lib\site-packages\models_init_.py", line 3, in
from .build_autosam_seg_model import sam_seg_model_registry
File "D:\anaconda\envs\MedSAM-adapter\lib\site-packages\models\build_autosam_seg_model.py", line 5, in
from .SamFeatSeg import SamFeatSeg, SegDecoderViT, SegDecoderCNN
ImportError: cannot import name 'SegDecoderViT' from 'models.SamFeatSeg' (D:\anaconda\envs\MedSAM-adapter\lib\site-packages\models\SamFeatSeg.py)

Training stuck after loading 3d data

[screenshot]

python train.py -net sam -mod sam_adpt -exp_name msa-3d-sam-btcv -sam_ckpt ./sam_vit_b_01ec64.pth -image_size 1024 -b 1 -dataset decathlon -thd True -chunk 1 -data_path /mnt/data/abdomen -num_sample 1

I run the program, but it gets stuck after loading the 3D data. How can I solve this?

self.pos_embed :Dimension mismatch

The shape of the original input image is (256, 256, 600). After the preceding processing, the input x to ImageEncoderViT has shape (600, 3, 256, 256). After self.patch_embed(), the shape of x is (600, 16, 16, 768).

However, the shape of self.pos_embed is (1, 64, 64, 768),

so at line 113, x = x + self.pos_embed cannot be computed.
question

Is this method based on LoRA or adapter tuning?

I am just a little confused when reading the paper.

"Technically, we choose to fine-tune the pre-trained SAM using a parameter-efficient fine-tuning (PEFT) technique called Adaption [18]"

The Adaption here cites LoRA, but it seems that the proposed method is based on adapter tuning. Did I misunderstand anything?

change the input_size

When I change the input image size from 1024 to 256, which configuration options do I need to pay attention to?

Traceback (most recent call last):
train.py", line 116, in
loss = function.train_sam(args, net, optimizer, nice_train_loader, epoch, writer, vis = args.vis)
function.py", line 137, in train_sam
imge = net.image_encoder(imgs)
module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "MSA/models/sam/modeling/image_encoder.py", line 115, in forward
x = x + self.pos_embed
RuntimeError: The size of tensor a (16) must match the size of tensor b (64) at non-singleton dimension 2
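For context, the mismatch follows from the patch arithmetic: SAM ViT-B is pre-trained at 1024 x 1024 with 16 x 16 patches, giving a 64 x 64 grid of positional embeddings, while a 256 input yields only a 16 x 16 token grid (and 512 yields 32 x 32), hence the failure at x = x + self.pos_embed. One common workaround for ViTs, sketched below, is to bicubically interpolate the pre-trained pos_embed to the new grid. This is a generic technique, not code from this repository:

import torch
import torch.nn.functional as F

def resize_pos_embed(pos_embed, new_grid):
    # pos_embed: [1, 64, 64, 768] -> [1, new_grid, new_grid, 768],
    # where new_grid = image_size // patch_size (e.g. 256 // 16 = 16).
    pe = pos_embed.permute(0, 3, 1, 2)  # [1, 768, 64, 64]
    pe = F.interpolate(pe, size=(new_grid, new_grid), mode='bicubic', align_corners=False)
    return pe.permute(0, 2, 3, 1)

# Usage (illustrative):
# net.image_encoder.pos_embed = torch.nn.Parameter(
#     resize_pos_embed(net.image_encoder.pos_embed.data, 16))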

No of Epochs and Results for BTCV and ISIC Datasets

Hi @WuJunde, could you please include information about the number of epochs you trained the model for on both datasets? The number of epochs in the global settings is set to 30000. I ran up to 100 epochs and wasn't able to match the Dice score of 0.883 for the BTCV dataset. Could you please advise?

GPU memory issue

I am using a GPU with 48 GB of memory. Even with

parser.add_argument('-chunk', type=int, default=5, help='crop volume depth')  # originally 96
parser.add_argument('-num_sample', type=int, default=1, help='sample pos and neg')
parser.add_argument('-roi_size', type=int, default=96, help='resolution of roi')
parser.add_argument('-b', type=int, default=1, help='batch size for dataloader')

I still run out of GPU memory, while parameters that are too low lead to mode collapse with loss = 1.

Suggestion - Integrate MobileSAM into the pipeline for lightweight and faster inference

Reference: https://github.com/ChaoningZhang/MobileSAM

Our project performs on par with the original SAM and keeps exactly the same pipeline as the original SAM except for a change to the image encoder; therefore, it is easy to integrate into any project.

MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. The comparison of the whole pipeline is summarized as follows:

[comparison screenshots]

Best Wishes,

Qiao

About eval_seg

Thanks for your great work!

But I can't understand your evaluation code. Why not test samples one by one and compute the metric for every class?
You also only considered one class in the training.

RuntimeError: CUDA out of memory. Tried to allocate 768.00 MiB (GPU 0; 6.00 GiB total capacity; 3.63 GiB already allocated; 328.06 MiB free; 3.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I set the batch_size to 1 and still get this error when I run the model.
Here is the GPU information of my computer:
[screenshot]
Could you tell me how to fix this error?
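Besides lowering -image_size and -b, the error message itself points at one mitigation: capping the allocator's split size via the PYTORCH_CUDA_ALLOC_CONF environment variable. The values below are illustrative only; a 6 GB card may still be too small at the default 1024 resolution:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python train.py -net sam -mod sam_adpt -exp_name msa_test_isic -sam_ckpt ./checkpoint/sam/sam_vit_b_01ec64.pth -image_size 512 -b 1 -dataset isic -data_path ../data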

Dataset path

[screenshot]
Is it like this?

./data/isic/ISBI2016_ISIC_Part1_Test_Data/ISIC_0000169
./data/isic/ISBI2016_ISIC_Part1_Test_GroundTruth/ISIC_0000169
./data/isic/ISBI2016_ISIC_Part1_Training_Data/ISIC_0000169
./data/isic/ISBI2016_ISIC_Part1_Training_GroundTruth/ISIC_0000169

Can't get satisfactory results on kidney tumor segmentation

Hi, thank you for the great work! I trained the MedSAM-adapter on 2D 256x256 CT images from the Kidney Tumor Segmentation (KiTS) dataset for 50 epochs; the best model was achieved at epoch 11. But the DSC score on the testing data is no more than 0.7, whereas it is 0.76 with MedSAM. What should I do for this dataset? Should I reduce the size of the model?

By the way, I have changed the proj layer to fit the size of my dataset as follows:

net.image_encoder.patch_embed.proj = nn.Conv2d(
    3, 768, kernel_size=4, stride=4, padding=0
)

Training Failed

I am trying to train using the ISIC data, and I used the following command:

python3 train.py -net sam -mod sam_adpt -exp_name msa_test_isic -sam_ckpt ./checkpoints/sam_vit_b_01ec64.pth -image_size 1024 -b 32 -dataset isic -data_path data/isic/

I got the following error; could you let me know how to fix it?

Traceback (most recent call last):
File "/media/sandeep/MySSD2/Medical-SAM-Adapter/train.py", line 96, in
isic_train_dataset = ISIC2016(args, args.data_path, transform = transform_train, transform_msk= transform_train_seg, mode = 'Training')
File "/media/sandeep/MySSD2/Medical-SAM-Adapter/dataset.py", line 30, in init
self.label_list = df.iloc[:,2].tolist()
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 925, in getitem
return self._getitem_tuple(key)
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 1506, in _getitem_tuple
self._has_valid_tuple(tup)
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 754, in _has_valid_tuple
self._validate_key(k, i)
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 1409, in _validate_key
self._validate_integer(key, axis)
File "/home/sandeep/anaconda3/lib/python3.9/site-packages/pandas/core/indexing.py", line 1500, in _validate_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds

pretrain

I am very interested in the SSL method you mention in the paper. Do you have plans to release the pre-training code?
Could you also share the details of the encoder pre-training?

about Iteration Click

Thanks!

Your paper mentions an optimization of the click strategy, mainly distinguishing the first click and performing iterative clicks.

I think that's a cool idea which will definitely help me greatly improve my accuracy, but I can't find the code for it... Your dataset generates clicks randomly, and the generate_click_prompt function in the utils package runs only when the data is first loaded.

Where is the code that iteratively generates click prompts based on the loss? Thanks again!

image size 512 : RuntimeError: The size of tensor a (32) must match the size of tensor b (64) at non-singleton dimension 2

115             x = x + self.pos_embed
116
117             for blk in self.blocks:
118                 x = blk(x)
RuntimeError: The size of tensor a (32) must match the size of tensor b (64) at non-singleton dimension 2

Out of Memory

What I use is a single A100 (80 GB), which is far from enough with chunk 96. May I ask what configuration you trained on?
Also, the validation code seems incomplete.

GPU memory

I'd like to know the GPU memory cost when setting '-chunk' to 96 (the 3D depth) with batch size 1. Thanks!

About instance segmentation

Hi, thanks for sharing.
I wonder: with only one point as a prompt, can MSA segment 13 organs at the same time (for example, on the BTCV dataset)? I mean, maybe MSA can only work on one segmentation target class, so instance segmentation is not possible right now?
Regarding the BTCV dataset, do you handle each organ separately, segmenting only one type of organ at a time rather than all 13 organ types at once?

3d dataset

Has anyone downloaded the 3D dataset and run it successfully?
I cannot find Abdomen/RawData.zip.
[screenshot]

OutOfMemoryError: CUDA out of memory. Tried to allocate 24.00 GiB (GPU 0; 47.99 GiB total capacity; 38.75 GiB already allocated; 5.85 GiB free; 39.53 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Could somebody please explain how to reduce the batch size for the 2D SAM adapter? I could not find an option to train with multiple GPUs, and the code does not seem to have options to lower the batch size or the number of epochs.

About the total score in evaluation

I found that during testing a total score is output. Intuitively, the larger the score the better, but in your code logic tot is the total loss, so the best checkpoint should correspond to a smaller score. Why did you set it up this way?

ValueError: high <= 0

The following error occurs when I use my own dataset for training. Can you please tell me what is causing it?
[screenshot]

Text prompt

Hi, thanks for sharing. I haven't seen any code for the text prompts. Did I miss something, or have you just not released it yet?

Thanks!

Train for AMOS

Is the training process for the AMOS dataset the same as for the ISIC dataset?
