
mil4wsi's Introduction

Introduction

Welcome to the mil4wsi Framework, an open-source repository of state-of-the-art Multiple Instance Learning (MIL) model implementations for gigapixel whole slide images (WSIs). It is intended for researchers, developers, and enthusiasts who want to explore and apply current MIL techniques.

Installation

conda create -n wsissl python=3.9
conda activate wsissl
conda env update --file environment.yml

Data Preprocessing

This work uses CLAM to filter out background patches. After the .h5 coordinate generation, use:

Available Models

  • MaxPooling
  • MeanPooling
  • ABMIL
  • DSMIL
  • DASMIL
  • BUFFERMIL
  • TRANSMIL
  • HIPT
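
To make the difference between the simplest aggregators in this list concrete, here is a minimal sketch of max, mean, and attention-weighted pooling over the instance scores of one bag. It is a generic illustration of the MIL idea, not the code shipped in this repository: the function names are hypothetical, the instances are reduced to scalar scores, and the attention logits are passed in as plain numbers, whereas a model such as ABMIL learns them with a small network.

```python
import math
from typing import List


def max_pooling(scores: List[float]) -> float:
    """Bag score is the highest instance score (bag must be non-empty)."""
    return max(scores)


def mean_pooling(scores: List[float]) -> float:
    """Bag score is the average instance score (bag must be non-empty)."""
    return sum(scores) / len(scores)


def attention_pooling(scores: List[float], attention_logits: List[float]) -> float:
    """Softmax-weighted average of instance scores.

    Assumption: the logits are given. In an attention-based MIL model they
    would be produced by a learned network, which is omitted here.
    """
    if len(scores) != len(attention_logits):
        raise ValueError("scores and attention_logits must have the same length")
    # Subtract the maximum logit for numerical stability before exponentiating.
    top = max(attention_logits)
    exps = [math.exp(a - top) for a in attention_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * s for w, s in zip(weights, scores))
```

With equal logits the attention variant reduces to mean pooling, and with one dominant logit it approaches the score of that single instance.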

DASMIL

@inproceedings{Bontempo2023_MICCAI,
    author={Bontempo, Gianpaolo and Porrello, Angelo and Bolelli, Federico and Calderara, Simone and Ficarra, Elisa},
    title={{DAS-MIL: Distilling Across Scales for MIL Classification of Histological WSIs}},
    booktitle={Medical Image Computing and Computer Assisted Intervention – MICCAI 2023},
    pages={248--258},
    year=2023,
    month={Oct},
    publisher={Springer},
    doi={https://doi.org/10.1007/978-3-031-43907-0_24},
    isbn={978-3-031-43906-3}
}


@ARTICLE{Bontempo2024_TMI,
  author={Bontempo, Gianpaolo and Bolelli, Federico and Porrello, Angelo and Calderara, Simone and Ficarra, Elisa},
  journal={IEEE Transactions on Medical Imaging}, 
  title={A Graph-Based Multi-Scale Approach With Knowledge Distillation for WSI Classification}, 
  year={2024},
  volume={43},
  number={4},
  pages={1412-1421},
  keywords={Feature extraction;Proposals;Spatial resolution;Knowledge engineering;Graph neural networks;Transformers;Prediction algorithms;Whole slide images (WSIs);multiple instance learning (MIL);(self) knowledge distillation;weakly supervised learning},
  doi={10.1109/TMI.2023.3337549}}

Training

python main.py --datasetpath DATASETPATH --dataset [cam or lung]
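
The bracketed value in the command above is a choice between two dataset names. As a rough sketch of how such a command line could be parsed, the snippet below uses `argparse` with only the two flags shown above. It is an assumption for illustration, not a copy of the repository's `main.py`, and the helper name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the two flags of the training command."""
    parser = argparse.ArgumentParser(description="Sketch of the training CLI")
    # Path to the dataset on disk (DATASETPATH in the command above).
    parser.add_argument("--datasetpath", required=True)
    # Exactly one of the two dataset names listed in the README.
    parser.add_argument("--dataset", required=True, choices=["cam", "lung"])
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--datasetpath", "DATASETPATH", "--dataset", "cam"])
```

Any value other than `cam` or `lung` makes `argparse` exit with a usage error.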

Reproducibility

Pretrained models

DINO
Scale   Camelyon16   LUNG
x5      ~0.65GB      ~0.65GB
x10     ~0.65GB      ~0.65GB
x20     ~0.65GB      ~0.65GB

DASMIL
        Camelyon16   LUNG
model   ~9MB         ~15MB
ACC     0.945        0.92
AUC     0.967        0.966
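
For readers comparing their own runs against the ACC and AUC figures above, the following is a minimal, dependency-free sketch of how these two slide-level metrics are commonly computed from binary labels and predicted probabilities. The 0.5 decision threshold and the function names are assumptions for illustration; the repository's own evaluation code may differ.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], probs: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of slides whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of slides, which is acceptable for a small illustrative example.
    """
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Accuracy depends on the chosen threshold, while AUC depends only on how the slides are ranked, so the two numbers can move independently between runs.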

Pytorch Geometric - Extracted Features

          Camelyon16   LUNG
Dataset   ~4.25GB      ~17.5GB

Eval

Set the checkpoint and dataset paths in utils/experiment.py, then run:

python eval.py --datasetpath DATASETPATH --checkpoint CHECKPOINTPATH --dataset [cam or lung]
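
A mistyped path is an easy way for an evaluation run to fail late, so it can help to check the inputs before launching. The sketch below validates the two paths and assembles the command shown above as an argument list. The helper `build_eval_command` is hypothetical and not part of the repository, and it only assumes that both paths must exist on disk.

```python
from pathlib import Path
from typing import List


def build_eval_command(datasetpath: str, checkpoint: str, dataset: str) -> List[str]:
    """Return the eval.py command as an argument list after basic checks."""
    if dataset not in ("cam", "lung"):
        raise ValueError("dataset must be 'cam' or 'lung'")
    # Assumption: both locations must already exist before evaluation starts.
    for label, path in (("datasetpath", datasetpath), ("checkpoint", checkpoint)):
        if not Path(path).exists():
            raise FileNotFoundError(f"{label} does not exist: {path}")
    return [
        "python", "eval.py",
        "--datasetpath", datasetpath,
        "--checkpoint", checkpoint,
        "--dataset", dataset,
    ]
```

The returned list can be passed to `subprocess.run` from the repository root, which avoids shell quoting problems with paths that contain spaces.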

Contributing

We encourage and welcome contributions from the community to help improve the MIL Models Framework and make it even more valuable for the entire machine-learning community.

mil4wsi's People

Contributors

bontempogianpaolo1, francescamiccolis, prittt, wangbo00129


mil4wsi's Issues

About TCGA-Lung dataset

Thank you for the excellent work you have posted. Could you please explain what models.selectModel is?

PREPROCESSING OF CAMELYON16 DATASET USING CLAM

Hi, I was trying to reproduce the results you have shown on the CAMELYON16 dataset. I used the configuration you provided in 0-extract_patches to run create_patches_fp.py:

seg_level,sthresh,mthresh,close,use_otsu,a_t,a_h,max_n_holes,vis_level,line_thickness,white_thresh,black_thresh,use_padding,contour_fn,keep_ids,exclude_ids
-1,8,7,4,TRUE,25,4,8,-1,100,5,50,TRUE,four_pt,none,none

However, I am getting around 1200 patches at level 1, compared to the 5771 reported in the paper. Similarly, for level 2 I got 400 images, compared to the 1528 mentioned in the paper. Am I doing something wrong? Do I need to change the config file?

Also, I think the CLAM repo has been modified, so I am getting an error when using convert_h5_to_jpg.py with CLAM. Can you look into that and help me?

Question about preprocessing

Hello, I'm trying to preprocess my private dataset. Before that, I tried TCGA-78-8662-01Z-00-DX1, which is already processed in lungGraph_13/processed/train/data_1.
According to the properties of the svs, I reran CLAM using patch sizes of 512, 1024, and 2048.
However, I found that the h5 from patch size 512 contains 21281 patches, which differs from your patch count for level 3.

Could you help me?

ISSUE WITH 1-sort_images for CAMELYON16 dataset

I am trying to sort the images using sort_hierarchy.py, but it shows the following error:

/home/thomas/.conda/envs/wsissl/lib/python3.10/site-packages/submitit/core/core.py:628: UserWarning: Received an empty job array
warnings.warn("Received an empty job array")

  1. Does this code require a slide_properties.csv file to be present in the same directory? It seems to be missing. Or do we have to create it from the dataset metadata?
  2. Is this an issue related to the format of the output patches? I am attaching a screenshot of the output patch format after converting to jpg.

Data preprocessing issues

Hello, I encounter the following error when running convert_h5_to_jpg.py: submitit.core.utils.UncompletedJobError: Job not requeued because: timed-out and not checkpointable. How can I resolve this issue?

Question about reproducing

Thank you for your great work. I'm trying to reproduce the exact results using your pre-generated pt files and failed to reach your accuracy.
This is the script I used.

CUDA_VISIBLE_DEVICES=1 python main.py --datasetpath /home/wangb/projects/20240226_reproduce_das_mil/data/camGraph_23 --dataset cam

Are there any suggestions about how to improve it? Thank you.

data preprocessing/Dino part

First of all, thanks for the informative and unique method in your recent paper. I am wondering if you could answer my question.
Thanks in advance. :)

In the "Extract Dino Features" part of the "Data Preprocessing" section, what are "pretrained_weights1", "pretrained_weights2", and "pretrained_weights3" in the "run_with_submitit.py" script?
Are they the weights pretrained on Camelyon16 and LUNG? If we want to use different pretrained weights, such as ResNet50, how can we use them in this script (run_with_submitit.py)?
Moreover, should we also alter "main_dino.py" for ResNet (as you mentioned in the code for DINO training)?

FROC metric calculation

Hello! Thanks for your great work!

Could you share code for generating an FROC curve? Additionally, I'm interested in understanding whether the FROC calculation involves plotting curves for all bags and then averaging them or directly calculating FROC for individual instances across all bags.

solve virtual environment

Thank you for your work and contribution. Following the installation steps in the README, I ran the command "mamba env update --file environment.yml". While solving the virtual environment dependencies, some packages could not be resolved or did not exist. How should I fix this?
