3D_image_segmentation_via_deep_CNN

Description:

These models were trained for a Kaggle competition and won a bronze medal.
Given a fragment, the segmentation model detects ink.
The model outputs a binary mask in which 1 marks the presence of ink and 0 marks no ink.
This mask is then converted to RLE (run-length encoding).

The following example shows the letter "E". Its RLE encoding is: "11 7 20 1 23 1 26 1 29 1 32 1 35 1 38 1 44 1"
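As a rough illustration of the mask-to-RLE step, here is a minimal sketch of an encoder that emits 1-indexed "start length" pairs. The function name `rle_encode` and the small 9 x 6 grid are assumptions made for this example. The string above is reproduced when an "E" is drawn on a grid with 9 rows and the pixels are numbered column by column (`order="F"`); the pixel ordering the competition actually requires is not stated here, so check the competition rules before reusing this.

```python
import numpy as np

def rle_encode(mask, order="F"):
    """Encode a binary mask as 1-indexed 'start length' pairs.

    order="F" numbers pixels column by column, order="C" row by row.
    Which ordering the competition expects is an assumption to verify.
    """
    flat = np.asarray(mask, dtype=np.uint8).flatten(order=order)
    # Pad with zeros so runs touching either end are detected.
    padded = np.concatenate([[0], flat, [0]])
    # Positions where the value changes, shifted to 1-indexed pixel numbers.
    changes = np.flatnonzero(padded[1:] != padded[:-1]) + 1
    starts = changes[0::2]
    lengths = changes[1::2] - starts
    return " ".join(f"{s} {l}" for s, l in zip(starts, lengths))

# Hypothetical 9 x 6 grid holding the letter "E".
letter_e = np.zeros((9, 6), dtype=np.uint8)
letter_e[1:8, 1] = 1   # vertical stroke
letter_e[1, 1:5] = 1   # top arm
letter_e[4, 1:4] = 1   # middle arm (one pixel shorter)
letter_e[7, 1:5] = 1   # bottom arm

print(rle_encode(letter_e))
# 11 7 20 1 23 1 26 1 29 1 32 1 35 1 38 1 44 1
```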

Dataset:

The data is taken from the Kaggle competition.

The dataset consists of 3D X-ray scans of detached fragments of ancient papyrus scrolls.
The training data contains 3 fragments.
Each fragment comes with slices from its 3D X-ray surface volume. Each file contains one greyscale slice in the z-direction, and each fragment has 65 slices. Combined, the image stack gives width * height * 65 voxels per fragment.
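To make the voxel count concrete, the sketch below stacks 65 slices into one volume. The slices here are small synthetic arrays, and the helper name `stack_slices` and the `uint16` dtype are assumptions for illustration; in practice each slice would be read from its own image file.

```python
import numpy as np

def stack_slices(slices):
    """Stack 2D greyscale slices of shape (H, W) into a volume of shape (H, W, Z)."""
    return np.stack(slices, axis=-1)

# Synthetic stand-in for one fragment: 65 slices of a tiny 4 x 5 image.
height, width, depth = 4, 5, 65
slices = [np.full((height, width), z, dtype=np.uint16) for z in range(depth)]

volume = stack_slices(slices)
print(volume.shape)  # (4, 5, 65)
print(volume.size)   # 1300 = width * height * 65
```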

The 65 slices of the first fragment

The ink labels were provided as a binary mask in which 1 marks the presence of ink.
In addition, a mask of the fragment was provided, which shows where the fragment contains data.

Contents

Training_notebooks: Contains the notebooks used for training. They are described in more detail in the training readme.
Inference_notebooks: Contains the notebooks used for inference. They are described in more detail in the inference readme.
tiles_extracted: The notebook that reduces the dataset by merging the volume scans with the mask. This removes tiles that are unnecessary for training the model.
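The exact rule used in the tiles_extracted notebook is not described here. One plausible version, sketched below under that caveat, keeps a tile only if the fragment mask has at least one set pixel inside it. The name `tiles_with_data` and the non-overlapping grid are assumptions for this example.

```python
import numpy as np

def tiles_with_data(mask, tile=224):
    """Return (y, x) origins of non-overlapping tiles containing any masked pixel."""
    kept = []
    height, width = mask.shape
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Keep the tile only if the fragment mask is non-empty inside it.
            if mask[y:y + tile, x:x + tile].any():
                kept.append((y, x))
    return kept

# Synthetic 448 x 448 mask with data only in the top-right tile.
mask = np.zeros((448, 448), dtype=np.uint8)
mask[10:100, 250:400] = 1

print(tiles_with_data(mask))  # [(0, 224)]
```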

Training

Trained multiple U-Net models using segmentation-models-pytorch.
Trained models with the following backbones:
1. mit_b2, mit_b3, mit_b4, mit_b5
2. VGG19
3. resnet50, resnet34
4. efficientnet_b3
5. resnest
6. regnety, etc.
A few of them can be found in the repo.
Trained a model using the stratified K-fold technique.
During training, the fragments were split into 224 by 224 tiles with a stride of 224//4.
Trained for 15 epochs, holding out one of the 3 fragments for cross-validation.
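The tiling above (224 by 224 windows, stride 224//4 = 56) can be sketched as a list of top-left tile origins. The function name `tile_origins` is hypothetical, and dropping windows that would run past the image edge is an assumption; the notebooks may pad instead.

```python
def tile_origins(height, width, tile=224, stride=224 // 4):
    """Top-left (y, x) corners of overlapping tiles that fit fully inside the image."""
    ys = range(0, height - tile + 1, stride)
    xs = range(0, width - tile + 1, stride)
    return [(y, x) for y in ys for x in xs]

origins = tile_origins(448, 448)
print(len(origins))   # 25 (5 positions per axis: 0, 56, 112, 168, 224)
print(origins[:3])    # [(0, 0), (0, 56), (0, 112)]
```

With a stride of a quarter tile, neighbouring tiles overlap by three quarters, so each pixel away from the border is seen in several tiles.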
The evaluation metric was the F-beta score with beta = 0.5.
A single model reached a CV score of around 63.
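For reference, F-beta with beta = 0.5 weights precision more heavily than recall. A minimal sketch on binary masks follows; the function name `fbeta_score` and the small `eps` guard against division by zero are choices made for this example, not necessarily what the notebooks use.

```python
import numpy as np

def fbeta_score(pred_mask, true_mask, beta=0.5, eps=1e-9):
    """F-beta score between two binary masks, computed from pixel counts."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    tp = np.sum(pred & true)    # ink predicted and present
    fp = np.sum(pred & ~true)   # ink predicted but absent
    fn = np.sum(~pred & true)   # ink missed
    b2 = beta ** 2
    return float((1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp + eps))

# Precision 1.0, recall 0.5: F0.5 = 1.25 * 0.5 / (0.25 * 1.0 + 0.5) = 0.8333...
print(round(fbeta_score([1, 0, 0, 0], [1, 1, 0, 0]), 4))  # 0.8333
```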
Used a weighted combination of Dice loss, Tversky loss, and BCE loss.
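A weighted sum of the three losses can be sketched in NumPy as below. The weights and the Tversky `alpha`/`beta` values are placeholders, since the values actually used are not given here, and the training code itself would use differentiable PyTorch losses rather than NumPy.

```python
import numpy as np

def tversky_loss(prob, target, alpha=0.5, beta=0.5, eps=1e-7):
    """1 - Tversky index; alpha weights false positives, beta false negatives."""
    tp = np.sum(prob * target)
    fp = np.sum(prob * (1 - target))
    fn = np.sum((1 - prob) * target)
    return float(1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps))

def dice_loss(prob, target, eps=1e-7):
    """Dice loss, which is the Tversky loss with alpha = beta = 0.5."""
    return tversky_loss(prob, target, alpha=0.5, beta=0.5, eps=eps)

def bce_loss(prob, target, eps=1e-7):
    """Binary cross-entropy on probabilities, clipped to avoid log(0)."""
    prob = np.clip(prob, eps, 1 - eps)
    return float(-np.mean(target * np.log(prob) + (1 - target) * np.log(1 - prob)))

def combined_loss(prob, target, weights=(1.0, 1.0, 1.0), alpha=0.7, beta=0.3):
    """Weighted sum of Dice, Tversky and BCE (weights are hypothetical)."""
    w_dice, w_tversky, w_bce = weights
    return (w_dice * dice_loss(prob, target)
            + w_tversky * tversky_loss(prob, target, alpha, beta)
            + w_bce * bce_loss(prob, target))

target = np.array([1.0, 0.0, 1.0, 0.0])
good = np.array([0.9, 0.1, 0.8, 0.2])
bad = np.array([0.2, 0.9, 0.1, 0.8])
print(combined_loss(good, target) < combined_loss(bad, target))  # True
```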

fig: the first image is the actual mask, the second is the predicted mask, and the third is the predicted mask after applying a threshold of 0.4

Inference Time

Ensembles

Ensembled the different trained models, starting with an average ensemble.
Ensembling the models increased the F-beta score from 65 to 74.
Tried weighted ensembles too.
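Both variants amount to a (weighted) mean over per-model probability maps, followed by a threshold. The sketch below uses made-up 2 x 2 predictions and hypothetical weights; only the 0.4 threshold comes from the figure caption above.

```python
import numpy as np

def ensemble(predictions, weights=None):
    """Average per-model probability maps; optional weights give a weighted mean."""
    stacked = np.stack(predictions, axis=0)
    return np.average(stacked, axis=0, weights=weights)

# Made-up probability maps from two models.
p1 = np.array([[0.2, 0.8], [0.9, 0.1]])
p2 = np.array([[0.4, 0.6], [0.5, 0.3]])

mean_pred = ensemble([p1, p2])                       # plain average
weighted_pred = ensemble([p1, p2], weights=[3, 1])   # hypothetical weights
binary = (mean_pred >= 0.4).astype(np.uint8)         # threshold of 0.4

print(binary.tolist())  # [[0, 1], [1, 0]]
```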

TTA

Performed TTA (test-time augmentation) during the final inference.
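Which augmentations were used is not stated here. As a sketch of the general idea, the example below assumes horizontal and vertical flips: each augmented input is predicted, the prediction is flipped back, and the results are averaged. `predict` is a hypothetical callable that maps an image to a probability map of the same height and width.

```python
import numpy as np

def predict_with_tta(predict, image):
    """Average predictions over flipped copies of the input (assumed augmentations)."""
    # Each pair is (augment, undo); flips are their own inverse.
    transforms = [
        (lambda a: a, lambda a: a),
        (np.fliplr, np.fliplr),
        (np.flipud, np.flipud),
    ]
    outputs = [undo(predict(augment(image))) for augment, undo in transforms]
    return np.mean(outputs, axis=0)

# Stand-in "model" that just scales its input, to check the flips are undone.
image = np.arange(12, dtype=float).reshape(3, 4)
result = predict_with_tta(lambda a: a * 0.5, image)
print(np.allclose(result, image * 0.5))  # True
```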

8 Channels average score

Took 8 channels, predicted the output for channels 1-3, 4-6, and 6-8, and averaged the predictions.
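The channel grouping can be sketched as three slices of an 8-channel stack, with the three predictions averaged. The channel-first layout `(C, H, W)` and the `predict` callable are assumptions for this example; the groups follow the ranges listed above, so channel 6 appears in two of them.

```python
import numpy as np

# Channels 1-3, 4-6 and 6-8 (1-indexed) as half-open 0-indexed slices.
CHANNEL_GROUPS = [(0, 3), (3, 6), (5, 8)]

def predict_channel_groups(predict, volume):
    """Predict on each 3-channel group of a (C, H, W) stack and average the outputs."""
    outputs = [predict(volume[start:stop]) for start, stop in CHANNEL_GROUPS]
    return np.mean(outputs, axis=0)

# Synthetic stack: channel c is filled with the value c.
volume = np.stack([np.full((2, 2), c, dtype=float) for c in range(8)])

# Stand-in "model": mean over the 3 input channels.
averaged = predict_channel_groups(lambda x: x.mean(axis=0), volume)
print(averaged.shape)  # (2, 2)
```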

arushikabansal / 3d_image_segmentation
