HandDetection_MaskRCNN

This project adapts the PyTorch Mask R-CNN codebase [https://github.com/tcnshonen/pytorch-mask-rcnn] for hand detection.

How to run the notebook

  1. Download the preprocessed datasets from this link.
  2. Open the notebook in either mybinder or Google Colab (https://colab.research.google.com/github/satyajeetmaharana/HandDetection_MaskRCNN/blob/master/Mask%20R%20CNN%20Model%20for%20Hand%20Detection_Notebook.ipynb).

The Mask R-CNN model is trained with the EgoHands Dataset [http://vision.soic.indiana.edu/projects/egohands/].

The work has three main steps: (i) a dataloader that processes the data, (ii) a script that trains the model, and (iii) an evaluation script that evaluates the trained model.

Dataset

The EgoHands dataset contains 4,800 labeled images in total, taken from 48 Google Glass videos. The ground-truth labels are polygon hand segmentations, provided as MATLAB files. You need to convert them to masks and compute the minimal bounding box of each mask. Some images contain no hands at all, and you may want to omit them. The original dataset has no train/val split, so you need to create one yourself.
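
The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the code used in this repository: the function names (`polygon_to_mask`, `mask_to_bbox`, `split_indices`), the 20% validation fraction, and the fixed seed are all assumptions made for the example. It assumes each polygon has already been read from the MATLAB file into a list of (x, y) vertices (for instance with `scipy.io.loadmat`), and it rasterizes with plain NumPy using the even-odd rule at pixel centers. A library routine such as `cv2.fillPoly` would serve the same purpose.

```python
import numpy as np


def polygon_to_mask(polygon, height, width):
    """Rasterize one polygon, given as (x, y) vertices, into a boolean mask.

    A pixel is set when its center lies inside the polygon (even-odd rule).
    """
    poly = np.asarray(polygon, dtype=float)
    ys, xs = np.mgrid[0:height, 0:width]
    px, py = xs + 0.5, ys + 0.5  # pixel-center coordinates
    inside = np.zeros((height, width), dtype=bool)
    x0, y0 = poly[-1]
    for x1, y1 in poly:
        if y1 != y0:  # horizontal edges never cross a horizontal ray
            # Rows whose center lies between the two edge endpoints.
            crosses = (y0 > py) != (y1 > py)
            # x-coordinate where the edge meets each row's center line.
            x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            # Toggle pixels whose rightward ray crosses this edge.
            inside ^= crosses & (px < x_at)
        x0, y0 = x1, y1
    return inside


def mask_to_bbox(mask):
    """Return the minimal box (x_min, y_min, x_max, y_max), inclusive pixel
    indices, or None when the mask is empty (an image without hands)."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])


def split_indices(n_items, val_fraction=0.2, seed=0):
    """Shuffle item indices and split them into (train, val) lists."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_items)
    n_val = int(round(n_items * val_fraction))
    return order[n_val:].tolist(), order[:n_val].tolist()


# Usage on a toy rectangle in a 5 x 6 image.
mask = polygon_to_mask([(1, 1), (4, 1), (4, 3), (1, 3)], height=5, width=6)
bbox = mask_to_bbox(mask)
train_ids, val_ids = split_indices(10)
```

Images whose masks are all empty yield `None` from `mask_to_bbox`, which gives a simple way to filter them out before splitting. Because the frames come from 48 videos, splitting by video rather than by frame is an alternative worth considering if correlation between frames of the same video is a concern.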

Model

The complete Mask R-CNN architecture is in model.py. To better adapt Mask R-CNN to EgoHands, you are free to modify model.py and apply any techniques from the research literature. model.py also provides a train_model function. You can use it to train the model, or write your own training pipeline.
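
This README does not state which metric the evaluation script uses. As one plausible building block, the sketch below computes the intersection-over-union (IoU) between a predicted mask and a ground-truth mask. The function name `mask_iou` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not taken from this repository.

```python
import numpy as np


def mask_iou(pred, target):
    """Intersection-over-union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treating this as full agreement is a convention.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


# Two 2 x 2 squares that overlap in one column of a 4 x 4 image.
a = np.zeros((4, 4), dtype=bool)
b = np.zeros((4, 4), dtype=bool)
a[0:2, 0:2] = True
b[0:2, 1:3] = True
iou = mask_iou(a, b)  # intersection 2 pixels, union 6 pixels
```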
