
mmphego / computer-pointer-controller


Deep learning based Gaze Detection model to control the mouse pointer of your computer.

Home Page: https://blog.mphomphego.co.za/blog/2020/07/18/How-I-deployed-a-Computer-Pointer-Controller-using-Gaze-Estimation.html

License: GNU General Public License v3.0



Computer Pointer Controller

Details
Programming Language: Python 3.6+
Intel OpenVINO ToolKit: 2020.2.120
Docker (Ubuntu OpenVINO pre-installed): mmphego/intel-openvino
Hardware Used: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
Device: CPU

In this project, the third of three, I used an Intel® OpenVINO gaze estimation model to control the mouse pointer of my computer. The model estimates the direction of the user's gaze, and the application moves the mouse pointer accordingly. The project demonstrates how to run multiple models on the same machine and coordinate the flow of data between them.

How It Works

I used the Inference Engine API from Intel's OpenVINO Toolkit to build the project.

The gaze estimation model used requires three inputs:

  • The head pose
  • The left eye image
  • The right eye image.

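To get these inputs, the project uses the three other OpenVINO models below. Each model, including the gaze estimation model itself, is wrapped in a class that derives from a shared Base class:

  • Face detection. Implementation: class Face_Detection(Base)
  • Facial landmarks detection. Implementation: class Facial_Landmarks(Base)
  • Head pose estimation. Implementation: class Head_Pose_Estimation(Base)
  • Gaze estimation. Implementation: class Gaze_Estimation(Base)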
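As a rough illustration of how the eye images can be derived from the facial landmarks, here is a minimal sketch. It assumes the landmarks model returns normalized (x, y) pairs relative to the face crop, with the first two pairs being the eyes. The function name `eye_boxes` and the `half_size` parameter are hypothetical and are not taken from the project's code.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def eye_boxes(landmarks: List[float], face_w: int, face_h: int,
              half_size: int = 30) -> Tuple[Box, Box]:
    """Return square crop boxes around the first two landmarks.

    Assumption: `landmarks` is a flat list (x0, y0, x1, y1, ...) with values
    in [0, 1] relative to the face crop, and points 0 and 1 are the eyes.
    """
    def box(cx_norm: float, cy_norm: float) -> Box:
        # Convert the normalized centre to pixel coordinates.
        cx, cy = int(cx_norm * face_w), int(cy_norm * face_h)
        # Clamp the box so the crop never leaves the face image.
        return (max(cx - half_size, 0), max(cy - half_size, 0),
                min(cx + half_size, face_w), min(cy + half_size, face_h))

    first_eye = box(landmarks[0], landmarks[1])
    second_eye = box(landmarks[2], landmarks[3])
    return first_eye, second_eye
```

Each box can then be used to slice the face image, and the two crops are passed to the gaze estimation model together with the head pose.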

Project Pipeline

The application coordinates the flow of data from the input, through the different models, and finally to the mouse controller. The flow of data looks like this:

[Image: project pipeline diagram]
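The data flow described above can be sketched in a few lines. This is only an outline under stated assumptions: `run_pipeline` and its callable parameters are hypothetical stand-ins for the model wrappers and the mouse controller, not the project's actual interfaces.

```python
def run_pipeline(frame, detect_face, find_eyes, estimate_head_pose,
                 estimate_gaze, move_mouse=None):
    """Pass one frame through the four stages and return the gaze vector.

    Every stage is injected as a callable (hypothetical signatures), so the
    control flow can be read and tested without OpenVINO installed.
    """
    face = detect_face(frame)
    if face is None:
        # No face in this frame: skip it, nothing to estimate.
        return None

    # Landmarks and head pose both work on the face crop.
    left_eye, right_eye = find_eyes(face)
    head_pose = estimate_head_pose(face)

    # The gaze model needs all three inputs at once.
    gaze = estimate_gaze(left_eye, right_eye, head_pose)

    if move_mouse is not None:
        # Only the x and y components drive the pointer.
        move_mouse(gaze[0], gaze[1])
    return gaze
```

Injecting the stages as callables keeps the ordering explicit: face detection gates everything else, and the gaze model runs last because it depends on the outputs of the other models.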

Demo

[Video: demo]

Project Set Up and Installation

Directory Structure

tree && du -sh
.
├── LICENSE
├── main.py
├── models
│   ├── face-detection-adas-binary-0001.bin
│   ├── face-detection-adas-binary-0001.xml
│   ├── gaze-estimation-adas-0002.bin
│   ├── gaze-estimation-adas-0002.xml
│   ├── head-pose-estimation-adas-0001.bin
│   ├── head-pose-estimation-adas-0001.xml
│   ├── landmarks-regression-retail-0009.bin
│   └── landmarks-regression-retail-0009.xml
├── README.md
├── requirements.txt
├── resources
└── src
    ├── __init__.py
    ├── input_feeder.py
    ├── model.py
    └── mouse_controller.py

3 directories, 16 files
37M	.

Setup and Installation

There are two (2) ways of running the project:

  1. Download and install the Intel OpenVINO Toolkit.

    • After you've cloned the repo, install the dependencies using this command: pip3 install -r requirements.txt
  2. Run the project in the Docker image into which I have baked Intel OpenVINO and the dependencies.

    • Run: docker pull mmphego/intel-openvino


For this project I used the latter method.

Models Used

I have already downloaded the models, which are located in ./models/. Should you wish to download the models yourself, run:

MODEL_NAME="<name of model to download>"
docker run --rm -ti \
--volume "$PWD":/app \
mmphego/intel-openvino \
bash -c "\
    /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/downloader.py \
    --name ${MODEL_NAME}"

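Models used in this project:

  • face-detection-adas-binary-0001
  • head-pose-estimation-adas-0001
  • landmarks-regression-retail-0009
  • gaze-estimation-adas-0002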

Application Usage

$ python main.py -h

usage: main.py [-h] -fm FACE_MODEL -hp HEAD_POSE_MODEL -fl
               FACIAL_LANDMARKS_MODEL -gm GAZE_MODEL [-d DEVICE]
               [-pt PROB_THRESHOLD] -i INPUT [--out] [-mp [{high,low,medium}]]
               [-ms [{fast,slow,medium}]] [--enable-mouse] [--show-bbox]
               [--debug] [--stats]

optional arguments:
  -h, --help            show this help message and exit
  -fm FACE_MODEL, --face-model FACE_MODEL
                        Path to an xml file with a trained model.
  -hp HEAD_POSE_MODEL, --head-pose-model HEAD_POSE_MODEL
                        Path to an IR model representative for head-pose-model
  -fl FACIAL_LANDMARKS_MODEL, --facial-landmarks-model FACIAL_LANDMARKS_MODEL
                        Path to an IR model representative for facial-
                        landmarks-model
  -gm GAZE_MODEL, --gaze-model GAZE_MODEL
                        Path to an IR model representative for gaze-model
  -d DEVICE, --device DEVICE
                        Specify the target device to infer on: CPU, GPU, FPGA
                        or MYRIAD is acceptable. Sample will look for a
                        suitable plugin for device specified (Default: CPU)
  -pt PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD
                        Probability threshold for detections
                        filtering(Default: 0.8)
  -i INPUT, --input INPUT
                        Path to image, video file or 'cam' for Webcam.
  --out                 Write video to file.
  -mp [{high,low,medium}], --mouse-precision [{high,low,medium}]
                        The precision for mouse movement (how much the mouse
                        moves). [Default: low]
  -ms [{fast,slow,medium}], --mouse-speed [{fast,slow,medium}]
                        The speed (how fast it moves) by changing [Default:
                        fast]
  --enable-mouse        Enable Mouse Movement
  --show-bbox           Show bounding box and stats on screen [debugging].
  --debug               Show output on screen [debugging].
  --stats               Verbose OpenVINO layer performance stats [debugging].
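For readers who want to see how such a command line could be declared, here is a minimal argparse sketch reconstructed from the help text above. It is an approximation rather than the project's actual main.py: the function name `build_parser` is hypothetical, and the help strings are omitted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Approximate the options listed in the help output (a sketch only)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # The four model paths and the input are shown as required in the usage line.
    parser.add_argument("-fm", "--face-model", required=True)
    parser.add_argument("-hp", "--head-pose-model", required=True)
    parser.add_argument("-fl", "--facial-landmarks-model", required=True)
    parser.add_argument("-gm", "--gaze-model", required=True)
    parser.add_argument("-i", "--input", required=True)
    # Optional settings with the defaults stated in the help text.
    parser.add_argument("-d", "--device", default="CPU")
    parser.add_argument("-pt", "--prob_threshold", type=float, default=0.8)
    parser.add_argument("-mp", "--mouse-precision", nargs="?",
                        choices=["high", "low", "medium"], default="low")
    parser.add_argument("-ms", "--mouse-speed", nargs="?",
                        choices=["fast", "slow", "medium"], default="fast")
    # Boolean switches default to False and become True when present.
    for flag in ("--out", "--enable-mouse", "--show-bbox", "--debug", "--stats"):
        parser.add_argument(flag, action="store_true")
    return parser
```

Calling `build_parser().parse_args()` with only the five required options yields the defaults listed in the help output, for example device CPU and a probability threshold of 0.8.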

Usage Example

To run the application, run the following command (assuming you have Docker installed):

xhost +;
docker run --rm -ti \
--volume "$PWD":/app \
--env DISPLAY=$DISPLAY \
--volume=$HOME/.Xauthority:/root/.Xauthority \
--volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
--device /dev/video0 \
mmphego/intel-openvino \
bash -c "\
    source /opt/intel/openvino/bin/setupvars.sh && \
    python main.py \
        --face-model models/face-detection-adas-binary-0001 \
        --head-pose-model models/head-pose-estimation-adas-0001 \
        --facial-landmarks-model models/landmarks-regression-retail-0009 \
        --gaze-model models/gaze-estimation-adas-0002 \
        --input resources/demo.mp4 \
        --debug \
        --show-bbox \
        --enable-mouse \
        --mouse-precision low \
        --mouse-speed fast"
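The --mouse-precision and --mouse-speed options control how far and how quickly the pointer moves for a given gaze vector. The sketch below shows one way such a mapping could work. The scale and duration values, the sign convention for the y axis, and the function name `mouse_move` are all illustrative assumptions and are not taken from src/mouse_controller.py.

```python
# Hypothetical lookup tables: pixels moved per unit of gaze, and the
# duration of each move in seconds. The real values may differ.
PRECISION = {"high": 100, "medium": 500, "low": 1000}
SPEED = {"fast": 1, "medium": 5, "slow": 10}


def mouse_move(gaze_x: float, gaze_y: float,
               precision: str = "low", speed: str = "fast"):
    """Translate a gaze vector into a relative pointer move.

    Returns (dx, dy, duration). Assumption: a positive gaze y points up,
    while screen y grows downwards, so the y component is negated.
    """
    scale = PRECISION[precision]
    return gaze_x * scale, -gaze_y * scale, SPEED[speed]
```

The returned offsets and duration could then be handed to a GUI automation library to perform the relative move.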

Packaging the Application

We can use the Deployment Manager present in OpenVINO to create a runtime package from our application. These packages can be easily sent to other hardware devices to be deployed.

To deploy the application to various devices using the Deployment Manager, follow the steps below.

Note: Choose from the devices listed below.

DEVICE='cpu' # or gpu, vpu, gna, hddl
docker run --rm -ti \
--volume "$PWD":/app \
mmphego/intel-openvino bash -c "\
  python /opt/intel/openvino/deployment_tools/tools/deployment_manager/deployment_manager.py \
  --targets ${DEVICE} \
  --user_data /app \
  --output_dir . \
  --archive_name computer_pointer_controller_${DEVICE}"

OpenVino API for Layer Analysis

The --stats flag queries per-layer performance counters to show which layers are the most time consuming:

xhost +;
docker run --rm -ti \
--volume "$PWD":/app \
--env DISPLAY=$DISPLAY \
--volume=$HOME/.Xauthority:/root/.Xauthority \
--volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
--device /dev/video0 \
mmphego/intel-openvino \
bash -c "\
    source /opt/intel/openvino/bin/setupvars.sh && \
    python main.py \
        --face-model models/face-detection-adas-binary-0001 \
        --head-pose-model models/head-pose-estimation-adas-0001 \
        --facial-landmarks-model models/landmarks-regression-retail-0009 \
        --gaze-model models/gaze-estimation-adas-0002 \
        --input resources/demo.mp4 \
        --stats"
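Once per-layer counters are available, finding the most time-consuming layers is a small sorting step. The sketch below assumes the counters arrive as a dictionary keyed by layer name, with a `status` string and a `real_time` value per layer, which is the shape I understand the Inference Engine Python API to return from an inference request's performance counters. The function name `slowest_layers` is hypothetical.

```python
def slowest_layers(perf_counts: dict, top: int = 3) -> list:
    """Return the `top` slowest executed layers as (name, real_time) pairs.

    Assumption: `perf_counts` maps layer name -> dict with at least the
    keys "status" and "real_time".
    """
    # Layers that were optimized out or never ran carry no useful timing.
    executed = {
        name: stats for name, stats in perf_counts.items()
        if stats.get("status") == "EXECUTED"
    }
    # Sort by measured time, slowest first.
    ranked = sorted(executed.items(),
                    key=lambda item: item[1]["real_time"], reverse=True)
    return [(name, stats["real_time"]) for name, stats in ranked[:top]]
```

Running this on the counters of each of the four models gives a quick view of where inference time is spent.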

Edge Cases

  • Multiple People Scenario: If multiple people appear in the video frame, the application uses only one face and reports results for that face, even though several people were detected.
  • No Face Detection: The application skips the frame and informs the user.
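Both edge cases can be handled in one selection step. The sketch below assumes each detection is a row of the form [image_id, label, confidence, x_min, y_min, x_max, y_max], and it keeps the highest-confidence face. Which face the project actually keeps is not stated, so that choice and the function name `pick_face` are assumptions.

```python
def pick_face(detections, threshold: float = 0.8):
    """Return the box of the most confident face, or None if there is none.

    Assumption: each row is
    [image_id, label, confidence, x_min, y_min, x_max, y_max].
    """
    # Drop detections below the probability threshold.
    candidates = [row for row in detections if row[2] >= threshold]
    if not candidates:
        # No face found: the caller should skip this frame.
        return None
    # Several faces found: keep only the most confident one.
    best = max(candidates, key=lambda row: row[2])
    return tuple(best[3:7])
```

A caller would skip the frame and log a message when `pick_face` returns None, and otherwise continue with the single returned box.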

Future Improvement

  • Intel® DevCloud: Benchmark the application on various devices to ensure optimum performance.
  • Intel® VTune™ Profiler: Profile my application and locate any bottlenecks.
  • Gaze estimations: We could revisit the logic of determining and calculating the coordinates as it is a bit flaky.
  • Lighting condition: We might use HSV based pre-processing steps to minimize error due to different lighting conditions.
