
Vision Robotic Arm Gesture Recognition

What is this project about?

This repository contains a ROS workspace for controlling a 7 DOF robot (Panda by Franka Emika) or a 2 DOF robot (RRbot) with a webcam and the operator's body movements.

Demo Panda

Short_demo.mp4

Demo RRbot

Short_demo_rrbot.mp4

Why this project?

This project was developed for the Smart Robotics course taught at the University of Modena and Reggio Emilia (UNIMORE) in the second semester of the 2021/2022 academic year.

This project is a proof of concept that combines computer vision algorithms with robot simulation and control, using Python, OpenCV, Mediapipe, ROS, and Gazebo.

Remotely controlling a robot can be useful in dangerous environments, for medical assistance, and for simulation and testing. Vision-based control of anthropomorphic robots offers a simpler way to perform remote operations.

Project Team

How this repository is organized

NOTE: every folder contains a README.md file with the instructions for installing ROS and setting up the workspaces correctly.

  • In the Demo_1_RRbot_Control folder we implemented a first version of the project with the simpler 2 DOF robot.
  • In the Demo_2_Panda_Control folder we implemented a second version with the more complex Panda robot, controlled with both the hand aperture and the pose angles.
  • The project_presentation folder contains a PowerPoint file with a brief description of the overall work.
  • The images and videos folders contain all the media used for this repository and for the presentation.

High Level Architecture

Project Architecture

  1. The video stream of the operator is captured with a webcam/camera.
  2. Each frame is passed to the vision_arm_control package of the ROS workspace.
  3. The vision script detects the hand and pose keypoints, then computes the hand aperture and the pose angles.
  4. The hand aperture and angles are sent to the Panda robot in the Gazebo simulation via a pub/sub ROS node (a minimal sketch follows this list).
  5. The robot is simulated and controlled in near real time thanks to the Panda Simulator interface packages.
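
The publishing step can be summarized with a short sketch. The topic name, message layout, and the detect_features helper below are illustrative placeholders, not the actual names used in the vision_arm_control package:

```python
import cv2
import rospy
from std_msgs.msg import Float64MultiArray


def detect_features(frame):
    # placeholder for the HandDetector/PoseDetector calls described below
    return 50.0, (90.0, 90.0, 90.0)


rospy.init_node("vision_arm_control")
# hypothetical topic name and message layout: aperture first, then the angles
pub = rospy.Publisher("/vision_arm_control/commands", Float64MultiArray, queue_size=1)

cap = cv2.VideoCapture(0)  # webcam stream of the operator
rate = rospy.Rate(30)      # publish at roughly the camera frame rate

while not rospy.is_shutdown():
    ok, frame = cap.read()
    if not ok:
        continue
    aperture, angles = detect_features(frame)
    pub.publish(Float64MultiArray(data=[aperture] + list(angles)))
    rate.sleep()

cap.release()
```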

Vision Algorithm and Package

For the vision part of the project, we used the OpenCV and Mediapipe libraries. We created HandDetector and PoseDetector modules to extract the hand aperture and the pose angles from a video stream. The main.py script captures the video footage and uses a ROS publisher/subscriber to interact with the robot.
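
As a rough idea of what the detector modules do, here is a minimal Mediapipe hand-keypoint loop (the class interfaces in the repository differ; this is only a sketch):

```python
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Mediapipe expects RGB input, while OpenCV captures BGR frames
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        # each landmark has normalized x, y coordinates plus a relative depth z
        print(lm[8].x, lm[8].y)  # index fingertip
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
```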

Hand Detector and computing the hand aperture

The hand detector class is composed of the following methods:

hand_det_class

Hand detector demo

Hand Aperture pipeline

  1. The four fingertip keypoints (thumb excluded) are located, along with the base of the hand (the midpoint between the wrist and the thumb base).
  2. The median point of the four fingertips is computed.
  3. The L2 distance between the base of the hand and the median fingertip point is computed.
  4. The distance is then normalized by the palm size and remapped to the range 0 to 100 (see the sketch after this list).
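
A sketch of the pipeline above, assuming the standard Mediapipe hand landmark indices and using the wrist-to-middle-knuckle distance as the palm size; the normalization bounds are illustrative, not values taken from the repository:

```python
import numpy as np

WRIST, THUMB_CMC, MIDDLE_MCP = 0, 1, 9
TIPS = [8, 12, 16, 20]  # index, middle, ring and pinky fingertips


def hand_aperture(lm):
    """lm: (21, 2) or (21, 3) array of hand keypoints."""
    base = (lm[WRIST] + lm[THUMB_CMC]) / 2.0   # base of the hand (step 1)
    median_tip = lm[TIPS].mean(axis=0)         # median fingertip point (step 2)
    dist = np.linalg.norm(median_tip - base)   # L2 distance (step 3)
    palm = np.linalg.norm(lm[MIDDLE_MCP] - lm[WRIST])  # palm size
    # normalize by the palm size and remap to [0, 100] (step 4);
    # the [0.5, 2.0] bounds are an assumption, tuned by hand here
    return float(np.interp(dist / palm, [0.5, 2.0], [0, 100]))
```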

Pose Detector and computing the pose angles

The pose detector class is composed of the following methods:

pose_det_class

Pose detector demo

Pose Angles pipeline

To extract the angle between a set of three keypoints (e.g. the elbow angle given pose keypoints 12, 14 and 16), we use the properties of the scalar product between two vectors.

  1. We create two vectors with the same origin at the central keypoint, where we want to measure the angle.
  2. We use the following formula to compute the angle between those two vectors: $\phi = \arccos\left(\frac{a \cdot b}{|a||b|}\right)$
  3. This method works with both 2D and 3D keypoints, for more robustness (see the sketch after this list).
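
The formula translates directly into code; this sketch works for both 2D and 3D keypoints:

```python
import numpy as np


def joint_angle(p_start, p_center, p_end):
    """Angle (in degrees) at p_center, from 2D or 3D keypoints."""
    a = np.asarray(p_start) - np.asarray(p_center)  # first vector from the central keypoint
    b = np.asarray(p_end) - np.asarray(p_center)    # second vector from the central keypoint
    cos_phi = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # clip guards against floating-point values slightly outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos_phi, -1.0, 1.0))))


# e.g. the elbow angle from shoulder (12), elbow (14) and wrist (16) keypoints
print(joint_angle([0.9, 0.2], [0.8, 0.5], [0.6, 0.4]))
```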

ROS and Robot Simulation Package

For the robot simulation and control part of the project, we used ROS with Gazebo. We created ad hoc ROS nodes with a publisher and a subscriber on various topics to control the RRbot, while we leveraged the Panda Simulator package for the Panda arm control.

Both demos use a pub/sub mechanism, but in the case of the Panda control package the publishing is done with a list of joint values.
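
The subscriber side can be sketched as follows, assuming the same hypothetical topic and message layout as in the architecture sketch above (one aperture value followed by the joint angles):

```python
import rospy
from std_msgs.msg import Float64MultiArray


def on_command(msg):
    aperture, angles = msg.data[0], msg.data[1:]
    # the demo would map these values to joint targets and forward them to
    # the robot controllers (e.g. the Panda Simulator interfaces)
    rospy.loginfo("aperture=%.1f angles=%s", aperture, list(angles))


rospy.init_node("panda_command_listener")
rospy.Subscriber("/vision_arm_control/commands", Float64MultiArray, on_command)
rospy.spin()
```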

Panda Arm control

Focusing on the second and final demo of this project, we move the robot as follows (a mapping sketch follows the list):

  • Hand aperture for the sixth joint
  • Right-elbow angle for the fourth joint
  • Left-elbow angle for the second joint
  • Left-shoulder angle for the first joint

The other three joints are fixed.
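
A sketch of this operator-to-joint mapping; the angle ranges and joint limits below are illustrative, not values taken from the repository:

```python
import numpy as np


def remap(value, src, dst):
    """Linearly remap value from the src range to the dst joint range."""
    return float(np.interp(value, src, dst))


def panda_joint_targets(aperture, r_elbow, l_elbow, l_shoulder):
    """Build the 7-element joint command list; joints 3, 5 and 7 stay fixed."""
    q = [0.0] * 7                                     # fixed joints keep a neutral value
    q[0] = remap(l_shoulder, [0, 180], [-2.8, 2.8])   # joint 1 <- left-shoulder angle
    q[1] = remap(l_elbow,    [0, 180], [-1.7, 1.7])   # joint 2 <- left-elbow angle
    q[3] = remap(r_elbow,    [0, 180], [-3.0, 0.0])   # joint 4 <- right-elbow angle
    q[5] = remap(aperture,   [0, 100], [0.0, 3.7])    # joint 6 <- hand aperture
    return q
```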

Future Improvements

  • Gesture complexity and usability: increase the set of possible gestures by adding different types of movements and improving usability.
  • Test the prototype on a high-end machine, then port it to a real robot.
  • Split each node onto a standalone machine, connected to the ROS master remotely.
  • Improve the connectivity of the solution by exploiting cameras and devices that can be connected remotely.


vision-robotic-arm-gesture-recognition's Issues

DEMO_2_PANDA ROSLAUNCH ERROR

RLException: [rrbot_world.launch] is neither a launch file in package [rrbot_gazebo] nor is [rrbot_gazebo] a launch file name
The traceback for the exception was written to the log file

I am unable to launch the Gazebo roslaunch file: the build is successful and Demo 1 works perfectly fine. The only problem arises when I run roslaunch rrbot_gazebo rrbot_world.launch for Demo 2. This is the same line mentioned in the steps.
