har-using-vitpose-and-lstm's Introduction

Human Action Recognition using Detectron2 and LSTM

This repository contains the code for the blog post "Human Action Recognition using Detectron2 and LSTM".

This is an application built to show how human action classification can be done using 2D Pose Estimation and LSTM RNN machine learning models.

2D pose estimation is done using Facebook AI Research's Detectron2. An LSTM model then classifies actions from the 2D pose estimation output over a sequence of consecutive video frames.
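The frame-by-frame flow described above can be sketched as a small rolling buffer that collects per-frame keypoints until there are enough frames to classify. This is only an illustrative sketch: the class name `PoseWindow`, the window length of 32 frames, and the flattened 34-value layout (17 keypoints, x and y each) are assumptions, not values taken from this repository.

```python
from collections import deque
from typing import Deque, List, Optional

WINDOW_SIZE = 32         # assumed number of consecutive frames per sample
FEATURES_PER_FRAME = 34  # assumed layout: 17 keypoints x (x, y)


class PoseWindow:
    """Keeps the most recent frames of flattened keypoints (hypothetical helper)."""

    def __init__(self, window_size: int = WINDOW_SIZE) -> None:
        # A deque with maxlen silently drops the oldest frame when full.
        self.frames: Deque[List[float]] = deque(maxlen=window_size)

    def push(self, keypoints: List[float]) -> Optional[List[List[float]]]:
        """Add one frame; return the full window once enough frames exist."""
        if len(keypoints) != FEATURES_PER_FRAME:
            raise ValueError(
                f"expected {FEATURES_PER_FRAME} values, got {len(keypoints)}"
            )
        self.frames.append(list(keypoints))
        if len(self.frames) == self.frames.maxlen:
            return list(self.frames)  # oldest frame first
        return None
```

Each returned window would then be handed to the LSTM as one input sequence; how often the application actually classifies (every frame or every N frames) is not specified here.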

We also provide the colab notebook:

LSTM training

We trained our own LSTM model on the dataset available at https://github.com/stuarteiffert/RNN-for-Human-Activity-Recognition-using-2D-Pose-Input.

Since we use Detectron2 for pose estimation, we retrofitted the dataset to match the Detectron2 output format before training our LSTM model. (The original dataset was created with the OpenPose library, which outputs 18 keypoints (or more) per person in the input image, while Detectron2 produces only 17 keypoints.)
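One way such a retrofit could look is a fixed index remapping that drops the OpenPose neck point and reorders the rest into the COCO 17-keypoint order that Detectron2 uses. The sketch below assumes each dataset row is a flat list of 36 values (x, y for 18 OpenPose COCO-model keypoints); the function name and the row layout are assumptions, and the repository's actual conversion may differ.

```python
from typing import List, Sequence

# OpenPose (COCO model) order:
#   0 nose, 1 neck, 2 r_shoulder, 3 r_elbow, 4 r_wrist, 5 l_shoulder,
#   6 l_elbow, 7 l_wrist, 8 r_hip, 9 r_knee, 10 r_ankle, 11 l_hip,
#   12 l_knee, 13 l_ankle, 14 r_eye, 15 l_eye, 16 r_ear, 17 l_ear
# COCO 17-keypoint order (used by Detectron2):
#   nose, l_eye, r_eye, l_ear, r_ear, l_shoulder, r_shoulder, l_elbow,
#   r_elbow, l_wrist, r_wrist, l_hip, r_hip, l_knee, r_knee, l_ankle, r_ankle
# Entry i is the OpenPose index that supplies COCO keypoint i (neck is dropped).
OPENPOSE_INDEX_FOR_COCO17 = [0, 15, 14, 17, 16, 5, 2, 6, 3, 7, 4, 11, 8, 12, 9, 13, 10]


def openpose18_to_coco17(flat: Sequence[float]) -> List[float]:
    """Convert one frame of 18 (x, y) pairs to 17 (x, y) pairs in COCO order."""
    if len(flat) != 36:
        raise ValueError(f"expected 36 values (18 keypoints), got {len(flat)}")
    out: List[float] = []
    for src in OPENPOSE_INDEX_FOR_COCO17:
        out.extend((flat[2 * src], flat[2 * src + 1]))
    return out
```

Applying this to every frame turns a 36-feature row into a 34-feature row, so the trained LSTM sees the same keypoint order at training time and at inference time.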

The model classifies each action into one of 6 categories:

  • JUMPING
  • JUMPING_JACKS
  • BOXING
  • WAVING_2HANDS
  • WAVING_1HAND
  • CLAPPING_HANDS
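As an illustration of how a classifier's raw scores could be turned into one of these labels, the sketch below applies a softmax and picks the most likely class. It assumes the class indices follow the order of the list above, which is not confirmed by this README; `decode_action` is a hypothetical helper name.

```python
import math
from typing import List, Sequence, Tuple

# Assumed index order: same as the category list above.
ACTIONS: List[str] = [
    "JUMPING",
    "JUMPING_JACKS",
    "BOXING",
    "WAVING_2HANDS",
    "WAVING_1HAND",
    "CLAPPING_HANDS",
]


def decode_action(logits: Sequence[float]) -> Tuple[str, float]:
    """Return the most likely action name and its softmax probability."""
    if len(logits) != len(ACTIONS):
        raise ValueError(f"expected {len(ACTIONS)} scores, got {len(logits)}")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ACTIONS[best], probs[best]
```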

A distinctive feature of this app is that it uses PyTorch libraries for both pose detection and the LSTM. Detectron2 is written in PyTorch. Our LSTM model is built with PyTorch and trained with pytorch-lightning, which keeps the training code concise.
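For readers unfamiliar with the setup, a minimal PyTorch sketch of an LSTM sequence classifier is shown below. It is not the model from this repository: the class name `ActionLSTM`, the hidden size of 64, the single LSTM layer, and the 32-frame, 34-feature input shape are all assumptions chosen for illustration, and the pytorch-lightning training wrapper is omitted.

```python
import torch
from torch import nn


class ActionLSTM(nn.Module):
    """Illustrative classifier: a sequence of keypoint frames in, 6 scores out."""

    def __init__(self, n_features: int = 34, hidden_size: int = 64, n_classes: int = 6) -> None:
        super().__init__()
        # batch_first=True means inputs are shaped (batch, frames, features).
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # h_n has shape (num_layers, batch, hidden_size).
        _, (h_n, _) = self.lstm(x)
        # Classify from the final hidden state of the last layer.
        return self.head(h_n[-1])


if __name__ == "__main__":
    model = ActionLSTM()
    window = torch.randn(2, 32, 34)  # 2 clips, 32 frames, 17 keypoints x (x, y)
    logits = model(window)
    print(logits.shape)  # torch.Size([2, 6])
```

In a pytorch-lightning setup, a module like this would typically be wrapped in a LightningModule that defines the training step and optimizer.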

We have trained our LSTM model and saved the best checkpoint at models/saved_model.ckpt; the application uses this checkpoint for inference when you run it.

[Figures: validation loss and validation accuracy plots]

Running the application

You need a machine with a GPU to run the application, so the easiest way is to run it on Google Colab. Use the Google Colab notebook provided below to run the application, then open a browser at the application URL printed while the notebook runs.

You will see two options for action classification:

  1. You can upload your own input video and click the 'Submit' button to see the action classification results. If you upload your own video, make sure it contains a single person performing one of the 6 actions listed above. Also upload only a short clip; otherwise processing might be very slow.
  2. You can use the sample video shown on the page and click the 'Submit' button to see the results. Thanks to my son, who was kind enough to perform a few actions and allow me to record him :).

When you click 'Submit', a progress bar shows the processing progress. Once it is done, you can download the output from the 'Download Result File' link. If you submitted the sample video from the page, you can see that our model classifies the actions reasonably well. The actions inferred from the video are shown on the output video.

Jupyter notebooks

We have provided Jupyter notebooks to run the application and to train the model:

  1. To run the application (for example in Google Colab), use human_action_classification.ipynb
  2. To train the LSTM model, use lstm_train.ipynb (You don't need this unless you want to retrain the model with your own dataset or experiment with it.)

har-using-vitpose-and-lstm's People

Contributors

ncchen99
