selfsupsurg's Introduction

Dissecting Self-Supervised Learning Methods for Surgical Computer Vision

Sanat Ramesh, Vinkle Srivastav, Deepak Alapatt, Tong Yu, Aditya Murali, Luca Sestini, Chinedu Innocent Nwoye, Idris Hamoud, Saurav Sharma, Antoine Fleurentin, Georgios Exarchakis, Alexandros Karargyris, Nicolas Padoy, 2022

News

  • [05/06/2023]: Added training and evaluation scripts for surgical triplet recognition; see readme_triplet.

Introduction

The field of surgical computer vision has undergone considerable breakthroughs in recent years with the rising popularity of deep neural network-based methods. However, standard fully-supervised approaches for training such models require vast amounts of annotated data, imposing a prohibitively high cost, especially in the clinical domain. Self-Supervised Learning (SSL) methods, which have begun to gain traction in the general computer vision community, represent a potential solution to these annotation costs, allowing useful representations to be learned from unlabeled data alone. Still, the effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored. In this work, we address this critical need by investigating four state-of-the-art SSL methods (MoCo v2, SimCLR, DINO, SwAV) in the context of surgical computer vision. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding: phase recognition and tool presence detection. We examine their parameterization, then their behavior with respect to training data quantities in semi-supervised settings. Correct transfer of these methods to surgery, as described and conducted in this work, leads to substantial performance gains over generic uses of SSL - up to 7.4% on phase recognition and 20% on tool presence detection - and over state-of-the-art semi-supervised phase recognition approaches by up to 14%. Further results obtained on a highly diverse selection of surgical datasets exhibit strong generalization properties.

Main takeaways from the paper

[1] Benchmarking of four state-of-the-art SSL methods (MoCo v2, SimCLR, SwAV, and DINO) in the surgical domain.

[2] Thorough experimentation (∼200 experiments, 7000 GPU hours) and analysis of different design settings - data augmentations, batch size, training duration, frame rate, and initialization - highlighting the need for, and offering intuitions towards, principled approaches for domain transfer of SSL methods.

[3] In-depth analysis of the adaptation of these methods, originally developed using other datasets and tasks, to the surgical domain, with a comprehensive set of evaluation protocols spanning 10 surgical vision tasks in total, performed on 6 datasets: Cholec80, CholecT50, HeiChole, Endoscapes, CATARACTS, and CaDIS.

[4] Extensive evaluation (∼280 experiments, 2000 GPU hours) of the scalability of these methods to various amounts of labeled and unlabeled data through an exploration of both fully and semi-supervised settings.

In this repo we provide:

  • Self-supervised weights trained on the Cholec80 dataset using four state-of-the-art SSL methods (MoCo v2, SimCLR, SwAV, and DINO).
  • Self-supervised pre-training scripts.
  • Downstream fine-tuning scripts for surgical phase recognition (linear fine-tuning and TCN fine-tuning).
  • Downstream fine-tuning scripts for surgical tool recognition (linear fine-tuning).
  • Downstream fine-tuning scripts for surgical triplet recognition (linear fine-tuning).

Get Started

Datasets and ImageNet checkpoints

Follow these steps to prepare the Cholec80 dataset and set up the ImageNet checkpoints:

# 1. Cholec80 phase and tool labels for different splits
> git clone https://github.com/CAMMA-public/SelfSupSurg
> SelfSupSurg=$(pwd)/SelfSupSurg
> cd $SelfSupSurg/datasets/cholec80
> wget https://s3.unistra.fr/camma_public/github/selfsupsurg/ch80_labels.zip
> unzip -q ch80_labels.zip && rm ch80_labels.zip
# 2. Cholec80 frames:
# a) Download the cholec80 dataset:
#      - Fill out this Google Form: https://docs.google.com/forms/d/1GwZFM3-GhEduBs1d5QzbfFksKmS1OqXZAz8keYi-wKI
#       (the link is also available on the CAMMA website: http://camma.u-strasbg.fr/datasets)
# b) Copy the videos into datasets/cholec80/videos
# Extract frames using the following script (requires OpenCV and NumPy)
> cd $SelfSupSurg
> python utils/extract_frames_ch80.py
# 3. Download the ImageNet fully-supervised and self-supervised weights
> cd $SelfSupSurg/checkpoints/defaults/resnet_50
> wget https://s3.unistra.fr/camma_public/github/selfsupsurg/imagenet_ckpts.zip
> unzip -q imagenet_ckpts.zip && rm imagenet_ckpts.zip
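For reference, here is a minimal sketch of the kind of extraction utils/extract_frames_ch80.py performs (illustrative only: the real script's frame rate, image size, and file naming may differ). It uses os.path.splitext, which removes only the extension, rather than str.strip('.mp4'):

import os
import cv2

# Illustrative sketch, not the repository's actual script.
video_dir = "datasets/cholec80/videos"
out_root = "datasets/cholec80/frames/train"

for fname in sorted(os.listdir(video_dir)):
    if not fname.endswith(".mp4"):
        continue
    video_name = os.path.splitext(fname)[0]      # "video01.mp4" -> "video01"
    save_dir = os.path.join(out_root, video_name)
    os.makedirs(save_dir, exist_ok=True)
    cap = cv2.VideoCapture(os.path.join(video_dir, fname))
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(save_dir, f"{idx:08d}.png"), frame)
        idx += 1
    cap.release()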
  • The directory structure should look as follows:
$SelfSupSurg/
├── datasets/cholec80/
│   ├── frames/
│   │   ├── train/
│   │   │   ├── video01/
│   │   │   ├── video02/
│   │   │   └── ...
│   │   ├── val/
│   │   │   ├── video41/
│   │   │   ├── video42/
│   │   │   └── ...
│   │   └── test/
│   │       ├── video49/
│   │       ├── video50/
│   │       └── ...
│   ├── labels/
│   │   ├── train/
│   │   │   ├── 1fps_12p5_0.pickle
│   │   │   ├── 1fps_12p5_1.pickle
│   │   │   └── ...
│   │   ├── val/
│   │   │   ├── 1fps.pickle
│   │   │   ├── 3fps.pickle
│   │   │   └── ...
│   │   └── test/
│   │       ├── 1fps.pickle
│   │       ├── 3fps.pickle
│   │       └── ...
│   └── classweights/
│       └── train/
│           ├── 1fps_12p5_0.pickle
│           ├── 1fps_12p5_1.pickle
│           └── ...
└── checkpoints/defaults/resnet_50/
    ├── resnet50-19c8e357.pth
    ├── moco_v2_800ep_pretrain.pth.tar
    ├── simclr_rn50_800ep_simclr_8node_resnet_16_07_20.7e8feed1.torch
    ├── swav_in1k_rn50_800ep_swav_8node_resnet_27_07_20.a0a6b676.torch
    └── dino_resnet50_pretrain.pth

Installation

You need Anaconda3 installed for the setup. We developed the code on Ubuntu 20.04 with Python 3.8, PyTorch 1.7.1, and CUDA 10.2, using V100 GPUs.

> cd $SelfSupSurg
> conda create -n selfsupsurg python=3.8 && conda activate selfsupsurg
# install dependencies 
(selfsupsurg)>conda install -y pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.2 -c pytorch 
(selfsupsurg)>pip install opencv-python
(selfsupsurg)>pip install openpyxl==3.0.7
(selfsupsurg)>pip install pandas==1.3.2
(selfsupsurg)>pip install scikit-learn
(selfsupsurg)>pip install easydict
(selfsupsurg)>pip install apex -f https://dl.fbaipublicfiles.com/vissl/packaging/apexwheels/py38_cu102_pyt171/download.html
(selfsupsurg)>cd $SelfSupSurg/ext_libs
(selfsupsurg)>git clone https://github.com/facebookresearch/ClassyVision.git && cd ClassyVision
(selfsupsurg)>git checkout 659d7f788c941a8c0d08dd74e198b66bd8afa7f5 && pip install -e .
(selfsupsurg)>cd ../ && git clone --recursive https://github.com/facebookresearch/vissl.git && cd ./vissl/
(selfsupsurg)>git checkout 65f2c8d0efdd675c68a0dfb110aef87b7bb27a2b
(selfsupsurg)>pip install --progress-bar off -r requirements.txt
(selfsupsurg)>pip install -e .[dev] && cd $SelfSupSurg
(selfsupsurg)>cp -r ./vissl/vissl/* $SelfSupSurg/ext_libs/vissl/vissl/
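As a quick sanity check of the environment (an optional step, not part of the original setup), you can verify that the key libraries import correctly:

(selfsupsurg)>python -c "import torch, apex, classy_vision, vissl; print(torch.__version__)"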

Modify $SelfSupSurg/ext_libs/vissl/configs/config/dataset_catalog.json by appending the following key/value pair to the end of the dictionary (replace <img_path> and <lbl_path> with the appropriate image and label paths):

"surgery_datasets": {
    "train": ["<img_path>", "<lbl_path>"],
    "val": ["<img_path>", "<lbl_path>"],
    "test": ["<img_path>", "<lbl_path>"]
}
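Alternatively, a small hypothetical helper (not part of the repository) can append the entry programmatically; the root path and per-split entries below are illustrative assumptions and should be replaced with your actual image and label paths:

import json

# Hypothetical helper (not from the repo): appends the "surgery_datasets"
# entry to VISSL's dataset catalog. The root path and per-split paths are
# illustrative assumptions based on the directory layout shown above.
catalog_path = "ext_libs/vissl/configs/config/dataset_catalog.json"
root = "/path/to/SelfSupSurg/datasets/cholec80"

with open(catalog_path) as f:
    catalog = json.load(f)

catalog["surgery_datasets"] = {
    split: [f"{root}/frames/{split}", f"{root}/labels/{split}"]
    for split in ("train", "val", "test")
}

with open(catalog_path, "w") as f:
    json.dump(catalog, f, indent=4)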

Pre-training

Run the following commands to pre-train MoCo v2, SimCLR, SwAV, and DINO on the Cholec80 dataset with 4 GPUs.

# MoCo v2
(selfsupsurg)>cfg=hparams/cholec80/pre_training/cholec_to_cholec/series_01/h001.yaml
(selfsupsurg)>python main.py -hp $cfg -m self_supervised
# SimCLR
(selfsupsurg)>cfg=hparams/cholec80/pre_training/cholec_to_cholec/series_01/h002.yaml
(selfsupsurg)>python main.py -hp $cfg -m self_supervised
# SwAV
(selfsupsurg)>cfg=hparams/cholec80/pre_training/cholec_to_cholec/series_01/h003.yaml
(selfsupsurg)>python main.py -hp $cfg -m self_supervised
# DINO 
(selfsupsurg)>cfg=hparams/cholec80/pre_training/cholec_to_cholec/series_01/h004.yaml
(selfsupsurg)>python main.py -hp $cfg -m self_supervised

Model Weights for the pre-training experiments

Model     Model Weights
MoCo v2   download
SimCLR    download
SwAV      download
DINO      download

Downstream finetuning

First perform pre-training using the above scripts, or download the pre-trained weights and copy them into the appropriate directories, as shown below:

# MoCo v2
(selfsupsurg)>mkdir -p runs/cholec80/pre_training/cholec_to_cholec/series_01/run_001/ \
               && cp model_final_checkpoint_moco_v2_surg.torch runs/cholec80/pre_training/cholec_to_cholec/series_01/run_001/
# SimCLR
(selfsupsurg)>mkdir -p runs/cholec80/pre_training/cholec_to_cholec/series_01/run_002/ \
               && cp model_final_checkpoint_simclr_surg.torch runs/cholec80/pre_training/cholec_to_cholec/series_01/run_002/
# SwAV
(selfsupsurg)>mkdir -p runs/cholec80/pre_training/cholec_to_cholec/series_01/run_003/ \
               && cp model_final_checkpoint_swav_surg.torch runs/cholec80/pre_training/cholec_to_cholec/series_01/run_003/
# DINO 
(selfsupsurg)>mkdir -p runs/cholec80/pre_training/cholec_to_cholec/series_01/run_004/ \
               && cp model_final_checkpoint_dino_surg.torch runs/cholec80/pre_training/cholec_to_cholec/series_01/run_004/

1. Surgical phase recognition (Linear Finetuning)

The config files for the surgical phase recognition linear fine-tuning experiments are in cholec80 pre-training init and imagenet init, organized as follows:

config_files
# config files for the proposed pre-training init from cholec80 are organized as follows:
├── cholec_to_cholec/series_01/test/phase
    ├── 100 #(100 % of cholec 80)
    │   └── 0 #(split 0)
    │       ├── h001.yaml # MoCo V2 Surg
    │       ├── h002.yaml # SimCLR Surg
    │       ├── h003.yaml # SwAV Surg
    │       └── h004.yaml # DINO Surg
    ├── 12.5 #(12.5 % of cholec 80 dataset)
    │   ├── 0 #(split 0)
    │   │   ├── h001.yaml # MoCo V2 Surg
    │   │   ├── h002.yaml # SimCLR Surg
    │   │   ├── h003.yaml # SwAV Surg
    │   │   └── h004.yaml # DINO Surg
    │   ├── 1 #(split 1)
    │   │   ├── h001.yaml # MoCo V2 Surg
    │   │   ├── h002.yaml # SimCLR Surg
    │   │   ├── h003.yaml # SwAV Surg
    │   │   └── h004.yaml # DINO Surg
    │   ├── 2 #(split 2)
    │   │   ├── h001.yaml # MoCo V2 Surg
    │   │   ├── h002.yaml # SimCLR Surg
    │   │   ├── h003.yaml # SwAV Surg
    │   │   └── h004.yaml # DINO Surg
    └── 25 #(25 % of cholec 80 dataset)
        ├── 0 #(split 0)
        │   ├── h001.yaml # MoCo V2 Surg
        │   ├── h002.yaml # SimCLR Surg
        │   ├── h003.yaml # SwAV Surg
        │   └── h004.yaml # DINO Surg
        ├── 1 #(split 1)
        │   ├── h001.yaml # MoCo V2 Surg
        │   ├── h002.yaml # SimCLR Surg
        │   ├── h003.yaml # SwAV Surg
        │   └── h004.yaml # DINO Surg
        ├── 2 #(split 2)
        │   ├── h001.yaml # MoCo V2 Surg
        │   ├── h002.yaml # SimCLR Surg
        │   ├── h003.yaml # SwAV Surg
        │   └── h004.yaml # DINO Surg
# config files for the baselines imagenet to cholec80 are organized as follows:
├── imagenet_to_cholec/series_01/test/phase
    ├── 100 #(100 % of cholec 80)
    │   └── 0 #(split 0)
    │       ├── h001.yaml # Fully-supervised imagenet
    │       ├── h002.yaml # MoCo V2 imagenet
    │       ├── h003.yaml # SimCLR imagenet
    │       ├── h004.yaml # SwAV imagenet
    │       └── h005.yaml # DINO imagenet
    ├── 12.5 #(12.5 % of cholec 80 dataset)
    │   ├── 0 #(split 0)
    │   │   ├── h001.yaml # Fully-supervised imagenet
    │   │   ├── h002.yaml # MoCo V2 imagenet
    │   │   ├── h003.yaml # SimCLR  imagenet
    │   │   ├── h004.yaml # SwAV  imagenet
    │   │   └── h005.yaml # DINO imagenet
    │   ├── 1 #(split 1)
    │   │   ├── h001.yaml # Fully-supervised imagenet 
    │   │   ├── h002.yaml # MoCo V2 imagenet
    │   │   ├── h003.yaml # SimCLR imagenet
    │   │   ├── h004.yaml # SwAV imagenet
    │   │   └── h005.yaml # DINO imagenet
    │   ├── 2 #(split 2)
    │   │   ├── h001.yaml # Fully-supervised imagenet 
    │   │   ├── h002.yaml # MoCo V2 imagenet
    │   │   ├── h003.yaml # SimCLR imagenet
    │   │   ├── h004.yaml # SwAV imagenet
    │   │   └── h005.yaml # DINO imagenet
    └── 25 #(25 % of cholec 80 dataset)
        ├── 0 #(split 0)
        │   ├── h001.yaml # Fully-supervised imagenet
        │   ├── h002.yaml # MoCo V2 imagenet
        │   ├── h003.yaml # SimCLR imagenet
        │   ├── h004.yaml # SwAV imagenet
        │   └── h005.yaml # DINO imagenet
        ├── 1 #(split 1)
        │   ├── h001.yaml # Fully-supervised imagenet
        │   ├── h002.yaml # MoCo V2 imagenet
        │   ├── h003.yaml # SimCLR imagenet
        │   ├── h004.yaml # SwAV imagenet
        │   └── h005.yaml # DINO imagenet
        ├── 2 #(split 2)
        │   ├── h001.yaml # Fully-supervised imagenet
        │   ├── h002.yaml # MoCo V2 imagenet
        │   ├── h003.yaml # SimCLR imagenet
        │   ├── h004.yaml # SwAV imagenet
        │   └── h005.yaml # DINO imagenet

Example commands for surgical phase linear fine-tuning; training uses 4 GPUs.

# Example 1, run the following command for linear fine-tuning, initialized with MoCO V2 weights 
# on 25% of cholec80 data (split 0).
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/phase/25/0/h001.yaml
(selfsupsurg)>python main.py -hp $cfg -m supervised

# Example 2, run the following command for linear fine-tuning, initialized with SimCLR weights 
# on 12.5% of cholec80 data (split 1).
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/phase/12.5/1/h002.yaml
(selfsupsurg)>python main.py -hp $cfg -m supervised

# Example 3, run the following command for linear fine-tuning, initialized with 
# imagenet MoCO v2 weights on 12.5% of cholec80 data (split 2).
(selfsupsurg)>cfg=hparams/cholec80/finetuning/imagenet_to_cholec/series_01/test/phase/12.5/2/h002.yaml
(selfsupsurg)>python main.py -hp $cfg -m supervised

2. Surgical phase recognition (TCN Finetuning)

The config files for the surgical phase recognition TCN fine-tuning experiments are in cholec80 pre-training init and imagenet init, organized as follows:

config_files
# config files for the proposed pre-training init from cholec80 are organized as follows:
├── cholec_to_cholec/series_01/test/phase_tcn
    ├── 100 #(100 % of cholec 80)
    │   └── 0 #(split 0)
    │       ├── h001.yaml # MoCo V2 Surg
    │       ├── h002.yaml # SimCLR Surg
    │       ├── h003.yaml # SwAV Surg
    │       └── h004.yaml # DINO Surg
    ├── 12.5 #(12.5 % of cholec 80 dataset)
    │   ├── 0 #(split 0)
    │   │   ├── h001.yaml # MoCo V2 Surg
    │   │   ├── h002.yaml # SimCLR Surg
    │   │   ├── h003.yaml # SwAV Surg
    │   │   └── h004.yaml # DINO Surg
    │   ├── 1 #(split 1)
    │   │   ├── h001.yaml # MoCo V2 Surg
    │   │   ├── h002.yaml # SimCLR Surg
    │   │   ├── h003.yaml # SwAV Surg
    │   │   └── h004.yaml # DINO Surg
    │   ├── 2 #(split 2)
    │   │   ├── h001.yaml # MoCo V2 Surg
    │   │   ├── h002.yaml # SimCLR Surg
    │   │   ├── h003.yaml # SwAV Surg
    │   │   └── h004.yaml # DINO Surg
    └── 25 #(25 % of cholec 80 dataset)
        ├── 0 #(split 0)
        │   ├── h001.yaml # MoCo V2 Surg
        │   ├── h002.yaml # SimCLR Surg
        │   ├── h003.yaml # SwAV Surg
        │   └── h004.yaml # DINO Surg
        ├── 1 #(split 1)
        │   ├── h001.yaml # MoCo V2 Surg
        │   ├── h002.yaml # SimCLR Surg
        │   ├── h003.yaml # SwAV Surg
        │   └── h004.yaml # DINO Surg
        ├── 2 #(split 2)
        │   ├── h001.yaml # MoCo V2 Surg
        │   ├── h002.yaml # SimCLR Surg
        │   ├── h003.yaml # SwAV Surg
        │   └── h004.yaml # DINO Surg
# config files for the baselines imagenet to cholec80 are organized as follows:
├── imagenet_to_cholec/series_01/test/phase_tcn
    ├── 100 #(100 % of cholec 80)
    │   └── 0 #(split 0)
    │       ├── h001.yaml # Fully-supervised imagenet
    │       ├── h002.yaml # MoCo V2 imagenet
    │       ├── h003.yaml # SimCLR imagenet
    │       ├── h004.yaml # SwAV imagenet
    │       └── h005.yaml # DINO imagenet
    ├── 12.5 #(12.5 % of cholec 80 dataset)
    │   ├── 0 #(split 0)
    │   │   ├── h001.yaml # Fully-supervised imagenet
    │   │   ├── h002.yaml # MoCo V2 imagenet
    │   │   ├── h003.yaml # SimCLR  imagenet
    │   │   ├── h004.yaml # SwAV  imagenet
    │   │   └── h005.yaml # DINO imagenet
    │   ├── 1 #(split 1)
    │   │   ├── h001.yaml # Fully-supervised imagenet 
    │   │   ├── h002.yaml # MoCo V2 imagenet
    │   │   ├── h003.yaml # SimCLR imagenet
    │   │   ├── h004.yaml # SwAV imagenet
    │   │   └── h005.yaml # DINO imagenet
    │   ├── 2 #(split 2)
    │   │   ├── h001.yaml # Fully-supervised imagenet 
    │   │   ├── h002.yaml # MoCo V2 imagenet
    │   │   ├── h003.yaml # SimCLR imagenet
    │   │   ├── h004.yaml # SwAV imagenet
    │   │   └── h005.yaml # DINO imagenet
    └── 25 #(25 % of cholec 80 dataset)
        ├── 0 #(split 0)
        │   ├── h001.yaml # Fully-supervised imagenet
        │   ├── h002.yaml # MoCo V2 imagenet
        │   ├── h003.yaml # SimCLR imagenet
        │   ├── h004.yaml # SwAV imagenet
        │   └── h005.yaml # DINO imagenet
        ├── 1 #(split 1)
        │   ├── h001.yaml # Fully-supervised imagenet
        │   ├── h002.yaml # MoCo V2 imagenet
        │   ├── h003.yaml # SimCLR imagenet
        │   ├── h004.yaml # SwAV imagenet
        │   └── h005.yaml # DINO imagenet
        ├── 2 #(split 2)
        │   ├── h001.yaml # Fully-supervised imagenet
        │   ├── h002.yaml # MoCo V2 imagenet
        │   ├── h003.yaml # SimCLR imagenet
        │   ├── h004.yaml # SwAV imagenet
        │   └── h005.yaml # DINO imagenet

Example commands for TCN fine-tuning. We first extract features for the train, val, and test sets, then perform the TCN fine-tuning.

# Example 1, run the following command for TCN fine-tuning, initialized with MoCO V2 weights 
# on 25% of cholec80 data (split 0).
# 1) feature extraction for the train, val and test set
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/phase/25/0/h001.yaml
(selfsupsurg)>python main.py -hp $cfg -m  feature_extraction -s train -f Trunk
(selfsupsurg)>python main.py -hp $cfg -m  feature_extraction -s val -f Trunk
(selfsupsurg)>python main.py -hp $cfg -m  feature_extraction -s test -f Trunk                            
# 2) TCN fine-tuning        
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/phase_tcn/25/0/h001.yaml
(selfsupsurg)>python main_ft_phase_tcn.py -hp $cfg -t test

# Example 2, run the following command for TCN fine-tuning, initialized with SimCLR weights 
# on 12.5% of cholec80 data (split 1).
# 1) feature extraction for the train, val and test set
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/phase/12.5/1/h002.yaml
(selfsupsurg)>python main.py -hp $cfg -m  feature_extraction -s train -f Trunk
(selfsupsurg)>python main.py -hp $cfg -m  feature_extraction -s val -f Trunk
(selfsupsurg)>python main.py -hp $cfg -m  feature_extraction -s test -f Trunk                            
# 2) TCN fine-tuning        
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/phase_tcn/12.5/1/h002.yaml
(selfsupsurg)>python main_ft_phase_tcn.py -hp $cfg -t test

# Example 3, run the following command for TCN fine-tuning, initialized with imagenet MoCO v2 weights 
# on 12.5% of cholec80 data (split 2).
# 1) feature extraction for the train, val and test set
(selfsupsurg)>cfg=hparams/cholec80/finetuning/imagenet_to_cholec/series_01/test/phase/12.5/2/h002.yaml
(selfsupsurg)>python main.py -hp $cfg -m  feature_extraction -s train -f Trunk
(selfsupsurg)>python main.py -hp $cfg -m  feature_extraction -s val -f Trunk
(selfsupsurg)>python main.py -hp $cfg -m  feature_extraction -s test -f Trunk                            
# 2) TCN fine-tuning        
(selfsupsurg)>cfg=hparams/cholec80/finetuning/imagenet_to_cholec/series_01/test/phase_tcn/12.5/2/h002.yaml
(selfsupsurg)>python main_ft_phase_tcn.py -hp $cfg -t test

3. Surgical tool recognition

The config files for the surgical tool recognition experiments are in cholec80 pre-training init and imagenet init, organized as follows:

config_files
# config files for the proposed pre-training init from cholec80 are organized as follows:
├── cholec_to_cholec/series_01/test/tools
    ├── 100 #(100 % of cholec 80)
    │   └── 0 #(split 0)
    │       ├── h001.yaml # MoCo V2 Surg
    │       ├── h002.yaml # SimCLR Surg
    │       ├── h003.yaml # SwAV Surg
    │       └── h004.yaml # DINO Surg
    ├── 12.5 #(12.5 % of cholec 80 dataset)
    │   ├── 0 #(split 0)
    │   │   ├── h001.yaml # MoCo V2 Surg
    │   │   ├── h002.yaml # SimCLR Surg
    │   │   ├── h003.yaml # SwAV Surg
    │   │   └── h004.yaml # DINO Surg
    │   ├── 1 #(split 1)
    │   │   ├── h001.yaml # MoCo V2 Surg
    │   │   ├── h002.yaml # SimCLR Surg
    │   │   ├── h003.yaml # SwAV Surg
    │   │   └── h004.yaml # DINO Surg
    │   ├── 2 #(split 2)
    │   │   ├── h001.yaml # MoCo V2 Surg
    │   │   ├── h002.yaml # SimCLR Surg
    │   │   ├── h003.yaml # SwAV Surg
    │   │   └── h004.yaml # DINO Surg
    └── 25 #(25 % of cholec 80 dataset)
        ├── 0 #(split 0)
        │   ├── h001.yaml # MoCo V2 Surg
        │   ├── h002.yaml # SimCLR Surg
        │   ├── h003.yaml # SwAV Surg
        │   └── h004.yaml # DINO Surg
        ├── 1 #(split 1)
        │   ├── h001.yaml # MoCo V2 Surg
        │   ├── h002.yaml # SimCLR Surg
        │   ├── h003.yaml # SwAV Surg
        │   └── h004.yaml # DINO Surg
        ├── 2 #(split 2)
        │   ├── h001.yaml # MoCo V2 Surg
        │   ├── h002.yaml # SimCLR Surg
        │   ├── h003.yaml # SwAV Surg
        │   └── h004.yaml # DINO Surg
# config files for the baselines imagenet to cholec80 are organized as follows:
├── imagenet_to_cholec/series_01/test/tools
    ├── 100 #(100 % of cholec 80)
    │   └── 0 #(split 0)
    │       ├── h001.yaml # Fully-supervised imagenet
    │       ├── h002.yaml # MoCo V2 imagenet
    │       ├── h003.yaml # SimCLR imagenet
    │       ├── h004.yaml # SwAV imagenet
    │       └── h005.yaml # DINO imagenet
    ├── 12.5 #(12.5 % of cholec 80 dataset)
    │   ├── 0 #(split 0)
    │   │   ├── h001.yaml # Fully-supervised imagenet
    │   │   ├── h002.yaml # MoCo V2 imagenet
    │   │   ├── h003.yaml # SimCLR  imagenet
    │   │   ├── h004.yaml # SwAV  imagenet
    │   │   └── h005.yaml # DINO imagenet
    │   ├── 1 #(split 1)
    │   │   ├── h001.yaml # Fully-supervised imagenet 
    │   │   ├── h002.yaml # MoCo V2 imagenet
    │   │   ├── h003.yaml # SimCLR imagenet
    │   │   ├── h004.yaml # SwAV imagenet
    │   │   └── h005.yaml # DINO imagenet
    │   ├── 2 #(split 2)
    │   │   ├── h001.yaml # Fully-supervised imagenet 
    │   │   ├── h002.yaml # MoCo V2 imagenet
    │   │   ├── h003.yaml # SimCLR imagenet
    │   │   ├── h004.yaml # SwAV imagenet
    │   │   └── h005.yaml # DINO imagenet
    └── 25 #(25 % of cholec 80 dataset)
        ├── 0 #(split 0)
        │   ├── h001.yaml # Fully-supervised imagenet
        │   ├── h002.yaml # MoCo V2 imagenet
        │   ├── h003.yaml # SimCLR imagenet
        │   ├── h004.yaml # SwAV imagenet
        │   └── h005.yaml # DINO imagenet
        ├── 1 #(split 1)
        │   ├── h001.yaml # Fully-supervised imagenet
        │   ├── h002.yaml # MoCo V2 imagenet
        │   ├── h003.yaml # SimCLR imagenet
        │   ├── h004.yaml # SwAV imagenet
        │   └── h005.yaml # DINO imagenet
        ├── 2 #(split 2)
        │   ├── h001.yaml # Fully-supervised imagenet
        │   ├── h002.yaml # MoCo V2 imagenet
        │   ├── h003.yaml # SimCLR imagenet
        │   ├── h004.yaml # SwAV imagenet
        │   └── h005.yaml # DINO imagenet

Example commands for surgical tool recognition linear fine-tuning; training uses 4 GPUs.

# Example 1, run the following command for linear fine-tuning, initialized with MoCO V2 weights 
# on 25% of cholec80 data (split 0).
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/tools/25/0/h001.yaml
(selfsupsurg)>python main.py -hp $cfg -m supervised

# Example 2, run the following command for linear fine-tuning, initialized with SimCLR weights 
# on 12.5% of cholec80 data (split 1).
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/tools/12.5/1/h002.yaml
(selfsupsurg)>python main.py -hp $cfg -m supervised

# Example 3, run the following command for linear fine-tuning, initialized with 
# imagenet MoCO v2 weights on 12.5% of cholec80 data (split 2).
(selfsupsurg)>cfg=hparams/cholec80/finetuning/imagenet_to_cholec/series_01/test/tools/12.5/2/h002.yaml
(selfsupsurg)>python main.py -hp $cfg -m supervised

4. Evaluation

Example command to evaluate all the experiments and collect the results:

# computes evaluation metrics for all the experiments and saves results in the runs/metrics_<phase/tool>.csv
(selfsupsurg)>python utils/generate_test_results.py
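To inspect the collected results, here is a minimal sketch; it makes no assumption about the exact columns utils/generate_test_results.py writes:

import pandas as pd

# Minimal sketch: load the collected metrics and print a quick overview.
# The exact column layout is whatever utils/generate_test_results.py wrote.
df = pd.read_csv("runs/metrics_phase.csv")   # or runs/metrics_tool.csv
print(df.head())
print(df.describe())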

Citation

@article{ramesh2023dissecting,
  title={Dissecting self-supervised learning methods for surgical computer vision},
  author={Ramesh, Sanat and Srivastav, Vinkle and Alapatt, Deepak and Yu, Tong and Murali, Aditya and Sestini, Luca and Nwoye, Chinedu Innocent and Hamoud, Idris and Sharma, Saurav and Fleurentin, Antoine and others},
  journal={Medical Image Analysis},
  pages={102844},
  year={2023},
  publisher={Elsevier}
}

References

The project uses VISSL. We thank the authors of VISSL for releasing the library. If you use VISSL, consider citing it using the following BibTeX entry.

@misc{goyal2021vissl,
  author =       {Priya Goyal and Quentin Duval and Jeremy Reizenstein and Matthew Leavitt and Min Xu and
                  Benjamin Lefaudeux and Mannat Singh and Vinicius Reis and Mathilde Caron and Piotr Bojanowski and
                  Armand Joulin and Ishan Misra},
  title =        {VISSL},
  howpublished = {\url{https://github.com/facebookresearch/vissl}},
  year =         {2021}
}

The project also leverages the following research works. We thank the authors for releasing their code.

License

The code, models, and datasets are available for non-commercial scientific research purposes as defined by CC BY-NC-SA 4.0. By downloading and using this code you agree to the terms in the LICENSE. Third-party code is subject to its respective licenses.

selfsupsurg's People

Contributors

srv902, vinkle


selfsupsurg's Issues

No module named 'downstream_phase_tcn.linear_evaluation'

Greetings. When I try your code, at the following part:

# 2) TCN fine-tuning
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/phase_tcn/12.5/1/h002.yaml
(selfsupsurg)>python main_ft_phase_tcn.py -hp $cfg -t test

there is an error: ModuleNotFoundError: No module named 'downstream_phase_tcn.linear_evaluation'.
Checking trainers.py line 18, from downstream_phase_tcn.linear_evaluation import FCN, it seems that there is no such file (linear_evaluation) in the folder downstream_phase_tcn.
Could you please help me solve this problem?

RuntimeError: No rendezvous handler for tcp://

Hello, I want to run this code on a Windows system; the virtual environment is configured according to the setup you provided. Since I only have one GPU available, I set the GPU count to 1 in the config file and ran "python main.py -hp hparams\cholec80\pre_training\cholec_to_cholec\series_01\h001.yaml -m self_supervised", after which the following error occurs:

......
--- Logging error ---
Traceback (most recent call last):
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\utils\distributed_launcher.py", line 150, in launch_distributed
_distributed_worker(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\utils\distributed_launcher.py", line 192, in _distributed_worker
run_engine(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\engines\engine_registry.py", line 86, in run_engine
engine.run_engine(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\engines\train.py", line 39, in run_engine
train_main(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\engines\train.py", line 127, in train_main
trainer = SelfSupervisionTrainer(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\trainer\trainer_main.py", line 86, in __init__
self.setup_distributed(self.cfg.MACHINE.DEVICE == "gpu")
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\trainer\trainer_main.py", line 118, in setup_distributed
torch.distributed.init_process_group(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\torch\distributed\distributed_c10d.py", line 433, in init_process_group
rendezvous_iterator = rendezvous(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\torch\distributed\rendezvous.py", line 82, in rendezvous
raise RuntimeError("No rendezvous handler for {}://".format(result.scheme))
RuntimeError: No rendezvous handler for tcp://

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\anaconda\envs\selfsupsurg\lib\logging\__init__.py", line 1085, in emit
msg = self.format(record)
File "D:\anaconda\envs\selfsupsurg\lib\logging\__init__.py", line 929, in format
return fmt.format(record)
File "D:\anaconda\envs\selfsupsurg\lib\logging\__init__.py", line 668, in format
record.message = record.getMessage()
File "D:\anaconda\envs\selfsupsurg\lib\logging\__init__.py", line 373, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
File "main.py", line 97, in <module>
hydra_main(overrides=overrides, mode=training_mode)
File "main.py", line 59, in hydra_main
launch_distributed(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\utils\distributed_launcher.py", line 162, in launch_distributed
logging.error("Wrapping up, caught exception: ", e)
Message: 'Wrapping up, caught exception: '
Arguments: (RuntimeError('No rendezvous handler for tcp://'),)
Traceback (most recent call last):
File "main.py", line 97, in <module>
hydra_main(overrides=overrides, mode=training_mode)
File "main.py", line 59, in hydra_main
launch_distributed(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\utils\distributed_launcher.py", line 164, in launch_distributed
raise e
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\utils\distributed_launcher.py", line 150, in launch_distributed
_distributed_worker(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\utils\distributed_launcher.py", line 192, in _distributed_worker
run_engine(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\engines\engine_registry.py", line 86, in run_engine
engine.run_engine(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\engines\train.py", line 39, in run_engine
train_main(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\engines\train.py", line 127, in train_main
trainer = SelfSupervisionTrainer(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\trainer\trainer_main.py", line 86, in __init__
self.setup_distributed(self.cfg.MACHINE.DEVICE == "gpu")
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\vissl\trainer\trainer_main.py", line 118, in setup_distributed
torch.distributed.init_process_group(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\torch\distributed\distributed_c10d.py", line 433, in init_process_group
rendezvous_iterator = rendezvous(
File "D:\anaconda\envs\selfsupsurg\lib\site-packages\torch\distributed\rendezvous.py", line 82, in rendezvous
raise RuntimeError("No rendezvous handler for {}://".format(result.scheme))
RuntimeError: No rendezvous handler for tcp://

Looking forward to your reply so that I can reproduce the algorithm, thank you!
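For context, the tcp:// rendezvous handler is not registered on Windows in PyTorch 1.7, where distributed training is only partially supported (gloo backend with a file-based store). A hypothetical workaround sketch, not from this repository:

import torch.distributed as dist

# Hypothetical workaround (assumption, not from the repo): PyTorch 1.7 on
# Windows registers only the file:// rendezvous for the gloo backend, so a
# shared-file init avoids "No rendezvous handler for tcp://".
dist.init_process_group(
    backend="gloo",
    init_method="file:///C:/tmp/ddp_init",  # any writable path
    rank=0,
    world_size=1,
)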

Endoscapes dataset

Hi, thanks for sharing.
Have you made your dataset publicly available? Or am I just not finding it?

How to reproduce the results in Table 6

Thanks for the paper and code. I followed the README to run the code, but it only generates the results in Table 4. What should I do to reproduce the results in Table 6?

Besides,

save_dir = './frames/train/' + video_name.strip('.mp4') + '/'

will generate incorrect paths; for example, it changes "video34.mp4" into "video3".
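A possible fix (a sketch, not a patch from the repository): str.strip('.mp4') treats ".mp4" as a character set and strips those characters from both ends, which is why "video34" loses its trailing "4"; os.path.splitext drops only the extension:

import os

# "video34.mp4".strip(".mp4") == "video3" because strip removes any of the
# characters '.', 'm', 'p', '4' from both ends. splitext is safe here.
video_name = "video34.mp4"
save_dir = "./frames/train/" + os.path.splitext(video_name)[0] + "/"
print(save_dir)  # ./frames/train/video34/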

Problems encountered during testing

Hello, I would like to first test surgical tool recognition using the trained weights, but I have encountered the following issue:

Code running instructions:
(selfsupsurg)>cfg=hparams/cholec80/finetuning/cholec_to_cholec/series_01/test/tools/25/0/h001.yaml
(selfsupsurg)>python main.py -hp $cfg -m supervised

error:
Traceback (most recent call last):
File "main.py", line 99, in <module>
hydra_main(overrides=overrides, mode=training_mode)
File "main.py", line 61, in hydra_main
launch_distributed(
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/utils/distributed_launcher.py", line 152, in launch_distributed
_distributed_worker(
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/utils/distributed_launcher.py", line 194, in _distributed_worker
run_engine(
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/engines/engine_registry.py", line 86, in run_engine
engine.run_engine(
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/engines/train.py", line 39, in run_engine
train_main(
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/engines/train.py", line 130, in train_main
trainer.train()
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/trainer/trainer_main.py", line 162, in train
self.task.prepare(pin_memory=self.cfg.DATA.PIN_MEMORY)
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/trainer/train_task.py", line 735, in prepare
self.datasets, self.data_and_label_keys = self.build_datasets(
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/trainer/train_task.py", line 308, in build_datasets
datasets[split.lower()] = build_dataset(
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/data/__init__.py", line 72, in build_dataset
return GenericSSLDataset(
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/data/ssl_dataset.py", line 105, in __init__
self._get_data_files(split)
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/data/ssl_dataset.py", line 159, in _get_data_files
self.data_paths, self.label_paths = dataset_catalog.get_data_files(
File "/home/liu/disk2/lxj/SelfSupSurg/SelfSupSurg-main/ext_libs/vissl/vissl/data/dataset_catalog.py", line 262, in get_data_files
assert len(dataset_config[split].DATASET_NAMES) == len(
AssertionError: len(data_sources) != len(dataset_names)

I printed these two parameters with

print(dataset_config[split].DATASET_NAMES)
print(dataset_config[split].DATA_SOURCES)

and the result is:

['imagenet1k_folder']
[]

I was wondering whether I should change the data path in h001.yaml to my own, but I don't know the difference between DATA_PATHS and DATA_SOURCES. Is DATA_SOURCES the path to the videos of the Cholec80 dataset?
Can you provide detailed information on what needs to be changed in the configuration file?

I am looking forward to your reply and would greatly appreciate it. ^_^

Difference in metrics between the two arXiv versions

Hi, I've noticed there is a difference in the F1 score for phase recognition between the two arXiv versions and wanted to understand the differences. It seems that the linear evaluation protocol hasn't changed, so how do you explain the differences in Tables 3 and 4?

For example, in Table 3: DINO, base, 40 videos -> 71.6, whereas in the older version the value was 67.0.

Thanks for your great work!

Few errors in the evaluation code of the experiment listed in Table 6

Hi, thanks for your code and paper. I may have found some errors in your evaluation code when agg='video_relaxed'.

First, the original MATLAB script:

index = find(prec>100);
prec(index)=100;
index = find(rec>100);
rec(index)=100;

These lines clip the values of precision and recall at 100.

Besides, when you compute 'tp_and_fp':

tp_and_fp = np.argwhere(preds == iPhase).flatten()
tp_and_fn = np.argwhere(targets == iPhase).flatten()
union = np.union1d(tp_and_fp, tp_and_fn)
# compute tp
tp = np.sum(updatedDiff[tp_and_fp] == 0)

This code will add some 0s into 'tp_and_fp'. For example, if the predictions of frame 5 to frame 8 equal iPhase, then

tp_and_fp = [0, 5, 0, 6, 0, 7, 0, 8]

And if updatedDiff[0] == 0, the value of tp will increase by 4.
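The interleaved zeros appear when the predictions array is 2-D (e.g. a row vector), because np.argwhere then returns (row, column) index pairs. A small self-contained demonstration with made-up values, including the 1-D fix:

import numpy as np

# With a 2-D row vector, np.argwhere returns (row, col) pairs, so flatten()
# interleaves zeros with the frame indices.
preds = np.array([[1, 1, 1, 1, 1, 2, 2, 2, 2]])   # shape (1, 9)
iPhase = 2
print(np.argwhere(preds == iPhase).flatten())      # [0 5 0 6 0 7 0 8]

# Fix: work on 1-D arrays so argwhere yields plain frame indices.
print(np.argwhere(preds.ravel() == iPhase).flatten())  # [5 6 7 8]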
