3d_image_segmentation's Introduction

3D_image_segmentation_via_deep_CNN

Description:

  • These models were trained for the Kaggle competition and won a bronze medal.

  • Given a fragment, this segmentation model detects the ink.

  • The output of the model is a binary mask in which 1 marks the presence of ink and 0 marks no ink.

  • This mask is then converted into RLE (run-length encoding).

    RLE

This example shows the letter "E". Its RLE encoding is: "11 7 20 1 23 1 26 1 29 1 32 1 35 1 38 1 44 1"
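The encoding step can be sketched as follows. This is a minimal illustration, assuming the common Kaggle convention of row-major flattening with 1-indexed start positions; the function names `rle_encode` and `rle_decode` are illustrative and not taken from the repository.

```python
import numpy as np

def rle_encode(mask):
    """Encode a binary mask as 'start length start length ...' (1-indexed starts)."""
    pixels = np.asarray(mask, dtype=np.uint8).flatten()  # row-major order (assumption)
    padded = np.concatenate([[0], pixels, [0]])
    # Value changes alternate between the start of a run and the position just past its end.
    changes = np.where(padded[1:] != padded[:-1])[0] + 1
    starts, ends = changes[0::2], changes[1::2]
    return " ".join(f"{s} {e - s}" for s, e in zip(starts, ends))

def rle_decode(rle, shape):
    """Rebuild a binary mask of the given (height, width) from an RLE string."""
    flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    values = [int(v) for v in rle.split()]
    for start, length in zip(values[0::2], values[1::2]):
        flat[start - 1 : start - 1 + length] = 1
    return flat.reshape(shape)

mask = np.array([[0, 1, 1],
                 [0, 0, 1]])
encoded = rle_encode(mask)
print(encoded)  # 2 2 6 1
```

Each pair in the string is a start pixel followed by a run length, so "2 2 6 1" means two ink pixels starting at pixel 2 and one ink pixel at pixel 6.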

Dataset:

  • The dataset consists of 3D X-ray scans of detached fragments of ancient papyrus scrolls.

  • The training data had 3 fragments.

  • Each fragment comes as slices from the 3D X-ray surface volume. Each file contains one greyscale slice along the z-direction, and each fragment has 65 slices. Stacked together, the slices give width * height * 65 voxels per fragment.

    The 65 channels

The 65 slices of the first fragment

  • The ink labels were provided as a binary mask in which 1 marks the presence of ink.

    labels
  • A mask of each fragment was also provided, showing where the fragment contains data.

    The 65 channels
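The data layout described in this section can be sketched with synthetic arrays. The fragment size, the random values, and the helper name `stack_slices` are hypothetical stand-ins; loading the real slice files is not shown.

```python
import numpy as np

NUM_SLICES = 65  # slices per fragment, as stated in the dataset description

def stack_slices(slices):
    """Stack a list of 2D greyscale slices into a (height, width, depth) volume."""
    return np.stack(slices, axis=-1)

# Synthetic stand-ins for the real slice images (hypothetical 4 x 6 fragment).
height, width = 4, 6
rng = np.random.default_rng(0)
slices = [rng.random((height, width), dtype=np.float32) for _ in range(NUM_SLICES)]

volume = stack_slices(slices)
print(volume.shape)  # (4, 6, 65)
print(volume.size)   # 1560 voxels = 4 * 6 * 65

# Fragment mask: 1 where scan data exists. Broadcasting over the depth axis
# zeroes out every voxel that lies outside the fragment.
fragment_mask = np.zeros((height, width), dtype=np.uint8)
fragment_mask[:, :3] = 1
masked_volume = volume * fragment_mask[..., None]
```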

Contents

  • Training_notebooks: Contains the notebooks used for training. They are described in more detail in the training readme.
  • Inference_notebooks: Contains the notebooks used for inference. They are described in more detail in the inference readme.
  • tiles_extracted: The notebook that reduces the dataset by merging the volume scans with the mask. This removes tiles that are unnecessary for training the model.
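The idea behind the tile reduction can be sketched as below. This is not the notebook's code: the helper `tile_origins` and the rule "keep a tile if any mask pixel inside it is set" are assumptions made for illustration, and the tiny tile size is chosen only to keep the example readable.

```python
import numpy as np

def tile_origins(mask, tile_size, stride):
    """Return (row, col) top-left corners of tiles that overlap the fragment mask."""
    height, width = mask.shape
    origins = []
    for row in range(0, height - tile_size + 1, stride):
        for col in range(0, width - tile_size + 1, stride):
            # Skip tiles that contain no fragment data at all (assumed rule).
            if mask[row:row + tile_size, col:col + tile_size].any():
                origins.append((row, col))
    return origins

# Toy 8 x 8 fragment mask with data only in the top-left quadrant.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[:4, :4] = 1

kept = tile_origins(mask, tile_size=4, stride=4)
print(kept)  # [(0, 0)]
```

With a smaller stride the tiles overlap, so more of them touch the fragment and are kept.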

Training

  • Trained multiple U-Net models using segmentation-models-pytorch.

  • Trained models with the following backbones:

    1. mit_b2, mit_b3, mit_b4, mit_b5
    2. VGG19
    3. resnet50, resnet34
    4. efficientnet_b3
    5. resnest
    6. regnety, etc.
  • A few of them can be found in the repo.

  • Trained a model using the stratified K-fold technique.

  • During training, the fragments were split into 224 by 224 tiles with a stride of 224//4.

  • Trained for 15 epochs, holding out one of the 3 fragments for cross-validation.

  • The evaluation metric was the F-beta score with beta = 0.5.

  • A single model reached a CV score of around 63.

  • Used a weighted combination of Dice loss, Tversky loss, and BCE loss.
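The arithmetic of a weighted Dice + Tversky + BCE loss can be sketched in plain NumPy. The weights and the Tversky `alpha`/`beta` values are not given in this README, so the defaults below are hypothetical; real training would need a differentiable version in the training framework rather than this illustration.

```python
import numpy as np

EPS = 1e-7  # guards against division by zero and log(0)

def dice_loss(prob, target):
    intersection = (prob * target).sum()
    return 1.0 - (2.0 * intersection + EPS) / (prob.sum() + target.sum() + EPS)

def tversky_loss(prob, target, alpha=0.5, beta=0.5):
    # alpha weights false positives, beta weights false negatives (hypothetical values).
    tp = (prob * target).sum()
    fp = (prob * (1 - target)).sum()
    fn = ((1 - prob) * target).sum()
    return 1.0 - (tp + EPS) / (tp + alpha * fp + beta * fn + EPS)

def bce_loss(prob, target):
    prob = np.clip(prob, EPS, 1 - EPS)
    return float(-(target * np.log(prob) + (1 - target) * np.log(1 - prob)).mean())

def combined_loss(prob, target, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three losses; the weights are placeholders."""
    w_dice, w_tversky, w_bce = weights
    return (w_dice * dice_loss(prob, target)
            + w_tversky * tversky_loss(prob, target)
            + w_bce * bce_loss(prob, target))

target = np.array([1.0, 0.0, 1.0, 0.0])
good = np.array([0.9, 0.1, 0.8, 0.2])
bad = np.array([0.2, 0.9, 0.1, 0.8])
print(combined_loss(good, target) < combined_loss(bad, target))  # True
```

With `alpha = beta = 0.5` the Tversky term reduces to the Dice term, so the Tversky loss only adds information when false positives and false negatives are weighted differently.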

fig: the first image is the actual mask, the second is the predicted mask, and the third is the predicted mask after applying a threshold of 0.4
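Thresholding the predicted probabilities and scoring the result with the F-beta metric (beta = 0.5) can be sketched as follows. The helper name `fbeta_score` and the toy arrays are illustrative; only the 0.4 threshold and the beta value come from this README.

```python
import numpy as np

def fbeta_score(pred_mask, true_mask, beta=0.5):
    """F-beta over all pixels; beta < 1 weights precision more than recall."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    tp = np.logical_and(pred, true).sum()
    fp = np.logical_and(pred, ~true).sum()
    fn = np.logical_and(~pred, true).sum()
    beta_sq = beta ** 2
    denom = (1 + beta_sq) * tp + beta_sq * fn + fp
    return float((1 + beta_sq) * tp / denom) if denom > 0 else 0.0

probs = np.array([[0.9, 0.5],
                  [0.3, 0.1]])   # toy model probabilities
truth = np.array([[1, 0],
                  [1, 0]])       # toy ground-truth ink mask

pred = probs > 0.4               # threshold from the figure caption
score = fbeta_score(pred, truth)
print(score)  # 0.5
```

Here one pixel is a true positive, one a false positive, and one a false negative, giving 1.25 / 2.5 = 0.5.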

Inference Time

Ensembles

  • Ensembled the different trained models, starting with a simple average ensemble.
  • Ensembling the models increased the F-beta score from 65 to 74.
  • Tried weighted ensembles too.
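Average and weighted ensembling of per-model probability maps can be sketched like this. The function name `ensemble`, the toy predictions, and the weights are hypothetical; the README does not state which weights were used.

```python
import numpy as np

def ensemble(predictions, weights=None):
    """Combine per-model probability maps; weights=None gives a plain average."""
    stacked = np.stack(predictions, axis=0)  # shape: (num_models, height, width)
    return np.average(stacked, axis=0, weights=weights)

pred_a = np.array([[0.2, 0.8]])  # toy output of model A
pred_b = np.array([[0.4, 0.6]])  # toy output of model B

mean_pred = ensemble([pred_a, pred_b])                     # [[0.3, 0.7]]
weighted_pred = ensemble([pred_a, pred_b], weights=[3, 1])  # [[0.25, 0.75]]
```

A weighted ensemble lets stronger models contribute more; one reasonable way to pick the weights is by each model's validation score.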

TTA

  • Performed TTA (test-time augmentation) during the final inference.
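A minimal sketch of test-time augmentation, assuming flips as the augmentations (the README does not say which ones were used). `predict_fn` and `toy_model` are hypothetical stand-ins for a trained model that maps a (height, width, channels) tile to a (height, width) probability map.

```python
import numpy as np

def predict_with_tta(predict_fn, image):
    """Average predictions over the identity, a horizontal flip and a vertical flip."""
    # Each entry pairs a forward transform with the inverse that undoes it
    # on the prediction, so all outputs are aligned before averaging.
    transforms = [
        (lambda x: x, lambda y: y),
        (np.fliplr, np.fliplr),
        (np.flipud, np.flipud),
    ]
    outputs = []
    for forward, inverse in transforms:
        outputs.append(inverse(predict_fn(forward(image))))
    return np.mean(outputs, axis=0)

def toy_model(x):
    # Stand-in for a real model: averages the channels of each pixel.
    return x.mean(axis=-1)

rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))
tta_pred = predict_with_tta(toy_model, image)
print(tta_pred.shape)  # (4, 5)
```

The toy model is unaffected by flips, so its TTA output equals its plain output; a real network is not, which is why averaging the aligned predictions can smooth its errors.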

8 Channels average score

  • Took 8 channels, predicted the output separately for channels 1-3, 4-6, and 6-8, and averaged the predictions.
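The channel-group averaging can be sketched as follows, assuming the channel ranges are 1-indexed and inclusive, so the last two groups share channel 6. `predict_fn` and `toy_model` are hypothetical stand-ins for a model that takes a 3-channel input.

```python
import numpy as np

# Channels 1-3, 4-6 and 6-8 (1-indexed) as 0-indexed half-open slices.
CHANNEL_GROUPS = [(0, 3), (3, 6), (5, 8)]

def predict_channel_groups(predict_fn, volume):
    """Run the model on each 3-channel group and average the predictions."""
    preds = [predict_fn(volume[..., start:stop]) for start, stop in CHANNEL_GROUPS]
    return np.mean(preds, axis=0)

def toy_model(x):
    # Stand-in for a real 3-channel model: averages the channels of each pixel.
    return x.mean(axis=-1)

# Toy 2 x 2 volume with 8 channels, where channel k holds the value k.
volume = np.broadcast_to(np.arange(8, dtype=float), (2, 2, 8))
avg_pred = predict_channel_groups(toy_model, volume)
print(avg_pred.shape)  # (2, 2)
```

In this toy case the group means are 1, 4 and 6, so every pixel of `avg_pred` is 11 / 3.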

Contributors

  • vishak-bhat30
