Domain and View-point Agnostic Hand Action Recognition

[Paper] [Supplementary video]

This repository contains the code to train and evaluate the work presented in the article Domain and View-point Agnostic Hand Action Recognition.

Motion representation model

@inproceedings{sabater2021domain,
  title={Domain and View-point Agnostic Hand Action Recognition},
  author={Sabater, Alberto and Alonso, I{\~n}igo and Montesano, Luis and Murillo, Ana C},
  booktitle={2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2021},
}

Pre-trained models

Download the desired models used in the paper and store them under ./pretrained_models/.

Cross-domain

Intra-domain

Data format

This project uses skeleton representations based on the 20 joints that SHREC17 and F-PHAB have in common.

In the F-PHAB dataset: Wrist, TPIP, TDIP, TTIP, IMCP, IPIP, IDIP, ITIP, MMCP, MPIP, MDIP, MTIP, RMCP, RPIP, RDIP, RTIP, PMCP, PPIP, PDIP, PTIP

In the SHREC17 dataset: Wrist, thumb_first_joint, thumb_second_joint, thumb_tip, index_base, index_first_joint, index_second_joint, index_tip, middle_base, middle_first_joint, middle_second_joint, middle_tip, ring_base, ring_first_joint, ring_second_joint, ring_tip, pinky_base, pinky_first_joint, pinky_second_joint, pinky_tip

The minimal 7-joint skeleton representation proposed in the paper uses the skeleton joints at indices 0, 8, 3, 7, 11, 15, 19, which correspond to Wrist, middle_base, thumb_tip, index_tip, middle_tip, ring_tip, pinky_tip.
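The index selection described above can be sketched in plain Python. This is only an illustration, not the repository's DataGenerator: the function name to_minimal_skeleton and the nested-list layout (frames, then joints, then x/y/z coordinates) are assumptions made for the example.

```python
# Joint names in the shared 20-joint order (SHREC17 naming, as listed above).
JOINT_NAMES = [
    "Wrist",
    "thumb_first_joint", "thumb_second_joint", "thumb_tip",
    "index_base", "index_first_joint", "index_second_joint", "index_tip",
    "middle_base", "middle_first_joint", "middle_second_joint", "middle_tip",
    "ring_base", "ring_first_joint", "ring_second_joint", "ring_tip",
    "pinky_base", "pinky_first_joint", "pinky_second_joint", "pinky_tip",
]

# Indices of the 7-joint minimal skeleton, in the order given in the text.
MINIMAL_JOINT_INDICES = [0, 8, 3, 7, 11, 15, 19]


def to_minimal_skeleton(sequence):
    """Keep only the 7 minimal joints of every frame.

    Assumption (hypothetical layout): `sequence` is a list of frames,
    each frame a list of 20 joints, each joint an (x, y, z) tuple.
    """
    minimal = []
    for frame in sequence:
        if len(frame) != len(JOINT_NAMES):
            raise ValueError("expected 20 joints per frame, got %d" % len(frame))
        minimal.append([frame[i] for i in MINIMAL_JOINT_INDICES])
    return minimal


# Names of the selected joints, useful as a sanity check on the indices.
minimal_names = [JOINT_NAMES[i] for i in MINIMAL_JOINT_INDICES]
```

The selection keeps the order of the index list, so the middle finger base comes second, right after the wrist, rather than in anatomical order.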

Skeleton representations

The original skeleton files can be converted to the 20-joint format with the scripts for SHREC17 and F-PHAB, and stored under ./datasets/. The minimal 7-joint format is obtained later by the DataGenerator.

The F-PHAB data splits used for the evaluation are located under ./dataset_scripts/F_PHAB/paper_tables_annotations/.

Python dependencies

The project was tested with the following dependencies:

  • python 3.6
  • tensorflow 2.3.0
  • Keras 2.3.1
  • keras-tcn 3.1.0
  • scikit-learn 0.22.2
  • scipy 1.4.1
  • pandas 1.0.3

Cross-domain action recognition evaluation

Run action_recognition_evaluation.py to perform the cross-domain evaluation reported in the paper. The script loads the skeleton action sequences, augments them, generates sequence embeddings, and performs KNN classification. The final results report the accuracy computed both with and without motion reference set augmentation.

Use the following flags to select the datasets to evaluate: --eval_fphab, --eval_msra
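The final classification step, KNN over sequence embeddings, can be illustrated with a short standard-library sketch. It is not the code in action_recognition_evaluation.py: the function names, the Euclidean distance, the value of k, and the toy 2-D embeddings are all assumptions chosen for the example.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(query, ref_embeddings, ref_labels, k=1):
    """Label a query embedding by majority vote among its k nearest references."""
    ranked = sorted(
        zip(ref_embeddings, ref_labels),
        key=lambda pair: squared_distance(query, pair[0]),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(queries, true_labels, ref_embeddings, ref_labels, k=1):
    """Fraction of queries whose predicted label matches the true label."""
    hits = sum(
        knn_predict(q, ref_embeddings, ref_labels, k) == y
        for q, y in zip(queries, true_labels)
    )
    return hits / len(queries)


# Toy reference set: two embeddings per action (hypothetical values).
reference = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
reference_labels = ["grab", "grab", "swipe", "swipe"]

# Toy query embeddings with their ground-truth actions.
queries = [[0.2, 0.1], [4.8, 5.5]]
query_labels = ["grab", "swipe"]

accuracy = knn_accuracy(queries, query_labels, reference, reference_labels, k=1)
```

In this picture, augmenting the motion reference set amounts to adding more labeled embeddings to the reference list before the nearest-neighbor search.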

To reproduce the results in Table III of the paper, download the cross-domain models and run the following commands:

python action_recognition_evaluation.py --path_model ./pretrained_models/xdom_last_descriptor --loss_name mixknn_train5_val1 --eval_fphab

python action_recognition_evaluation.py --path_model ./pretrained_models/xdom_summarization --loss_name mixknn_best --eval_fphab --eval_msra

Note that, because the evaluation involves random operations, the final results may differ slightly from those reported in the paper.
