Veremin

Veremin is a video theremin based on PoseNet and the brainchild of John Cohn.

It builds upon the PoseNet Camera Demo and modifies it to allow you to make music by moving your hands/arms in front of a web camera.

PoseNet is used to predict the location of your wrists within the video. The app takes the predictions and converts them to tones in the browser or to MIDI values which get sent to a connected MIDI device.
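For illustration only, the snippet below sketches how wrist positions can be read from a PoseNet single-pose estimate using recent releases of the @tensorflow-models/posenet package; the function and threshold values are placeholders, not Veremin's actual code.

    // Minimal sketch: load PoseNet and read wrist positions from each video frame.
    // Assumes a <video> element that is already streaming the web camera.
    import * as posenet from '@tensorflow-models/posenet'

    async function trackWrists (video) {
      const net = await posenet.load() // downloads the model weights

      async function onFrame () {
        const pose = await net.estimateSinglePose(video, { flipHorizontal: true })
        // keypoints are named body parts, each with a position and a confidence score
        const leftWrist = pose.keypoints.find(k => k.part === 'leftWrist')
        const rightWrist = pose.keypoints.find(k => k.part === 'rightWrist')
        if (leftWrist.score > 0.5 || rightWrist.score > 0.5) {
          console.log(leftWrist.position, rightWrist.position) // map these to tones or MIDI values
        }
        requestAnimationFrame(onFrame)
      }
      onFrame()
    }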

Browsers must allow access to the webcam and support the Web Audio API. Optionally, to integrate with a MIDI device the browser will need to support the Web MIDI API (e.g., Chrome browser version 43 or later).
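As a rough illustration, support for these APIs can be checked up front; this is a generic capability check, not the app's own detection logic.

    // Generic capability check (illustrative only)
    const hasWebcam = !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia)
    const hasWebAudio = !!(window.AudioContext || window.webkitAudioContext)
    const hasWebMidi = typeof navigator.requestMIDIAccess === 'function'

    if (!hasWebcam || !hasWebAudio) {
      console.warn('A web camera and the Web Audio API are required')
    }
    if (!hasWebMidi) {
      console.log('Web MIDI API not available: tones will be generated in the browser')
    }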

If you would like to use the pose estimation to control another device, you can enable MQTT to publish the data to an MQTT broker (one that supports WebSockets). Other devices or applications can then subscribe to receive the positional data.
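For example, with the Paho JavaScript client the positional data could be published over WebSockets roughly as follows; the broker host, port, topic, and payload shape are placeholders, not Veremin's actual configuration (older Paho releases expose Paho.MQTT.Client, newer ones Paho.Client).

    // Illustrative sketch: publish pose data to an MQTT broker over WebSockets
    const client = new Paho.MQTT.Client('broker.example.com', 9001, 'veremin-demo')

    client.connect({
      onSuccess: () => {
        const payload = JSON.stringify({ rightWrist: { x: 320, y: 180 }, angle: 42 }) // hypothetical payload
        const message = new Paho.MQTT.Message(payload)
        message.destinationName = 'veremin/pose' // hypothetical topic
        client.send(message)
      }
    })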

Watch the video

Featured tools & technologies

  • PoseNet - a machine learning model which allows for real-time human pose estimation in the browser
  • TensorFlow.js - a JavaScript library for training and deploying ML models in the browser and on Node.js
  • Web MIDI API - an API supporting the MIDI protocol, enabling web applications to enumerate and select MIDI input and output devices on the client system and send and receive MIDI messages
  • Web Audio API - a high-level Web API for processing and synthesizing audio in web applications
  • Tone.js - a framework for creating interactive music in the browser
  • MQTT - a lightweight publish/subscribe messaging protocol for communicating with IoT devices
  • WebSocket API - an interface for sending messages to a server and receiving event-driven responses without having to poll the server
  • Paho JavaScript Client - an MQTT client library written in JavaScript that uses WebSockets to connect to an MQTT broker

Live demo

To see Veremin in action without installing anything, simply visit:

https://ibm.biz/veremin

For best results, you may want to use the Chrome browser and have a MIDI synthesizer (hardware or software) connected. See the Using the app section below for more information.

Steps

Follow one of the options below to deploy your own instance of Veremin.

Deploy to IBM Cloud

Pre-requisites: an IBM Cloud account and the IBM Cloud CLI (the ibmcloud command) installed.

To deploy to the IBM Cloud, from a terminal run:

  1. Clone the veremin repository locally:

    $ git clone https://github.com/vabarbosa/veremin
    
  2. Change to the directory of the cloned repo:

    $ cd veremin
    
  3. Log in to your IBM Cloud account:

    $ ibmcloud login
    
  4. Target a Cloud Foundry org and space:

    $ ibmcloud target --cf
    
  5. Push the app to IBM Cloud:

    $ ibmcloud cf push
    

    Deploying can take a few minutes.

  6. View the app with a browser at the URL listed in the output.

    Note: Depending on your browser, you may need to access the app using the https protocol instead of http.

Run locally

To run the app locally:

  1. From a terminal, clone the veremin repository locally:

    $ git clone https://github.com/vabarbosa/veremin
    
  2. Point your web server to the cloned repo directory (/veremin)

    For example:

    • using the Web Server for Chrome extension (available from the Chrome Web Store)

      1. Go to your Chrome browser's Apps page (chrome://apps)
      2. Click on the Web Server
      3. From the Web Server, click CHOOSE FOLDER and browse to the cloned repo directory
      4. Start the Web Server
      5. Make note of the Web Server URL(s) (e.g., http://127.0.0.1:8887)
    • using the Python HTTP server module

      1. From a terminal shell, go to the cloned repo directory
      2. Depending on your Python version, enter one of the following commands:
        • Python 2.x: python -m SimpleHTTPServer 8080
        • Python 3.x: python -m http.server 8080
      3. Once started, the Web Server URL should be http://127.0.0.1:8080
  3. From your browser, go to the Web Server's URL

Using the app

At a minimum, your browser must allow access to the web camera and support the Web Audio API.

In addition, if your browser supports the Web MIDI API, you can connect a MIDI synthesizer to your computer. If you do not have a MIDI synthesizer, you can download and run a software synthesizer such as SimpleSynth.
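As a rough sketch of how a note could reach a connected synthesizer, the Web MIDI API can be used along these lines; the note and velocity values are placeholders.

    // Send a note to the first available MIDI output (illustrative values)
    navigator.requestMIDIAccess().then(midi => {
      const output = [...midi.outputs.values()][0]
      if (!output) return // no synthesizer connected
      output.send([0x90, 60, 100]) // note on: middle C, velocity 100
      setTimeout(() => output.send([0x80, 60, 0]), 500) // note off after 500 ms
    })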

If your browser does not support the Web MIDI API or no (hardware or software) synthesizer is detected, the app defaults to using the Web Audio API to generate tones in the browser.
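As an illustration of such an in-browser fallback, a tone can be generated with Tone.js along these lines (shown with the Tone.js v14-style API; older releases use toMaster() instead of toDestination(), and the note and duration here are placeholders).

    // Play a short tone in the browser when no MIDI device is available
    import * as Tone from 'tone'

    const synth = new Tone.Synth().toDestination()

    async function playFallbackNote () {
      await Tone.start() // audio playback must be started from a user gesture
      synth.triggerAttackRelease('C4', '8n') // placeholder note and duration
    }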

Publishing to an MQTT broker over WebSockets is also possible. You can configure which broker messages are sent to. Selected keypoints returned by the PoseNet model, along with some additional computed values (e.g., distance, angle), are sent to the broker.

Open your browser and go to the app URL. Depending on your browser, you may need to access the app using the https protocol instead of http. You may also have to accept the browser's prompt to allow access to the web camera. Once access is allowed, the PoseNet model is loaded (this may take a few seconds).

After the model is loaded, the video stream from the web camera will appear with an overlay showing the skeletal and joint information detected by PoseNet. The overlay also includes two adjacent zones/boxes. When your wrists are detected within these zones, you should hear some sound.

  • Move your right hand/arm up and down (in the right zone) to generate different notes.
  • Move your left hand/arm left and right (in the left zone) to adjust the velocity of the notes (a rough sketch of this mapping follows below).
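Purely as an illustration, a mapping from normalized wrist positions to MIDI values could look something like this; the ranges and scaling are hypothetical, not Veremin's exact math.

    // Hypothetical mapping from normalized positions (0..1 within each zone) to MIDI values
    function toMidiValues (rightHandY, leftHandX) {
      const note = Math.round(36 + (1 - rightHandY) * (84 - 36)) // higher hand -> higher note
      const velocity = Math.round(leftHandX * 127) // further right -> louder
      return { note, velocity }
    }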

Click on the Controls icon (top right) to open the control panel. In the control panel you can change the MIDI device (if more than one is connected), configure PoseNet settings, set what is shown in the overlay, enable MQTT, and configure additional options. More information about the control panel options is available here.

Links
