
tinycog's Introduction

TinyCog

A collection of speech, vision, and movement functionalities aimed at small or toy robots on embedded systems, such as the Raspberry Pi computer. High-level reasoning, language understanding, language generation and movement planning are provided by OpenCog.

The current hardware platform requires an RPI3 computer, a Pi Camera V2 and a USB Microphone; other sensor/detector components are planned.

The current software functions include face detection, emotion recognition, gesture analysis, speech-to-text and text-to-speech subsystems. All high-level function is provided by OpenCog, and specifically by the Ghost scripting system: Ghost is able to process sensory input and provide coordinated chatbot and movement abilities.

Setup

Everything here is meant to run on an RPI3 computer; one can also compile everything on a standard Linux desktop computer.

A fully prepped raspbian image is available here

Use xzcat to write the image to the target device as shown here, replacing sdX with your device.

    xzcat oc-debian-stretch-arm64.img.xz | sudo dd of=/dev/sdX

When you first boot this image and log in with the default credentials, it automatically expands the filesystem to fill the entire / partition and then reboots. This is a 64-bit Debian Stretch OS for the RPI3, which means that the PiCamera driver is not available; a USB camera should be used with this image.

The default credentials:

    Username: oc
    Password: opencog

The image contains the OpenCog version that was current when the image was built, along with other libraries such as OpenCV (3.4) and dlib (19.15). To see the OpenCog commit versions, use pkg-config:

    pkg-config --variable=cogutil opencog      # shows the cogutil commit version
    pkg-config --variable=atomspace opencog    # shows the atomspace commit version
    pkg-config --variable=opencog opencog      # shows the opencog commit version

Known issue with this image: there is no driver for the PiCamera, because it is not available as a 64-bit binary.

Install

Need to have these whether on desktop or rpi

Use cmake for building. The default build mode is Debug; set CMAKE_BUILD_TYPE to Release to disable debug mode. For the emotion recognition service, set the variable SERVER_ADDRESS to "34.216.72.29:6205".

    cd TinyCog    # adjust to the path of your TinyCog checkout
    mkdir build
    cd build
    cmake ..      # optionally append: -DCMAKE_BUILD_TYPE=Release -DSERVER_ADDRESS="34.216.72.29:6205"
    make

Testing

  • To test the sensors from the guile shell, run the following from within the build dir. It opens the camera and shows a live view with markings for the sensors.
    $ ./TestDrRoboto.scm
  • To test from a video file instead of a camera, run it the following way:
    $ ./TestDrRoboto.scm -- <video_file_path>

Running

    $ guile -l dr-roboto.scm
  • In another terminal, connect to port 5555 via telnet to input speech
    $ telnet localhost 5555
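
For scripted input, the telnet session above can be approximated with a few lines of Python. This is a minimal sketch, assuming the listener on port 5555 accepts plain text lines the way an interactive telnet session would send them; `send_utterance` is a hypothetical helper name and is not part of TinyCog.

```python
import socket


def send_utterance(text, host="localhost", port=5555, timeout=5.0):
    """Send one line of text to the input port (assumed to be line-based).

    Returns the number of bytes sent.
    """
    # Telnet clients terminate lines with CRLF, so mimic that here.
    payload = (text.rstrip("\r\n") + "\r\n").encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)
```

With dr-roboto.scm running, calling `send_utterance("hello")` would play the role of typing `hello` into the telnet session.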

Implementation

Overall Description

The dr-roboto.cpp file is compiled to a guile extension, which is loaded by the scheme file dr-roboto.scm. This guile extension is written in C++ and its main job is to open the camera and run the sensors. When the scheme program loads the extension, the first thing it does is send the address of its atomspace to the extension so that the two can share an atomspace. Then the sensors are started: a loop running in a separate thread that collects information and places it in the atomspace. Most sensory values are stored as Atomspace Values in the following format:

    Value
        ConceptNode "position"
        ConceptNode "face_x"
        FloatValue X Y
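
The flow described above, a sensor thread that keeps writing its latest readings into a store shared with the behavior code, can be illustrated with a small Python model. This is only a sketch under assumptions: the real extension is C++ writing into an OpenCog atomspace, whereas here a lock-protected dict stands in for the atomspace and a tuple of floats stands in for a FloatValue. `SharedStore`, `sensor_loop`, `run_sensors_for` and `fake_face_detector` are hypothetical names, and the detector returns made-up numbers.

```python
import threading
import time


class SharedStore:
    """Stand-in for the shared atomspace: (node, key) -> tuple of floats."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def set_value(self, node, key, floats):
        with self._lock:
            self._values[(node, key)] = tuple(float(f) for f in floats)

    def get_value(self, node, key):
        with self._lock:
            return self._values.get((node, key))


def fake_face_detector(frame_index):
    """Hypothetical detector: returns a made-up face centre for each frame."""
    return (100.0 + frame_index, 80.0)


def sensor_loop(store, stop_event, period=0.01):
    """Collect readings until asked to stop, overwriting the latest value."""
    frame_index = 0
    while not stop_event.is_set():
        x, y = fake_face_detector(frame_index)
        # Mirrors the "position" / "face_x" / FloatValue X Y layout above.
        store.set_value("position", "face_x", (x, y))
        frame_index += 1
        time.sleep(period)


def run_sensors_for(store, seconds):
    """Start the loop in a background thread, let it run, then stop it."""
    stop_event = threading.Event()
    worker = threading.Thread(
        target=sensor_loop, args=(store, stop_event), daemon=True
    )
    worker.start()
    time.sleep(seconds)
    stop_event.set()
    worker.join()
```

The behavior side only ever calls `get_value`, so it always sees the most recent reading and never blocks on the camera.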

The scheme program dr-roboto.scm includes the behavior/behavior.scm code, which contains a very small model of an OpenCog behavior tree. The behavior is a looking behavior. It first checks whether there is a face in view. If there is only one, it looks at that one. If there are several, it checks whether one of them is smiling; if not, it checks whether any of them has a non-neutral facial expression; if not, it looks at one of the faces at random. If there are no faces in view, it looks at the salient point. The behavior tree calls functions that simply check the atomspace for the information they require.

The behavior/behavior.scm file also loads the ghost scripts located here. When the behavior program wants to command the Professor Einstein robot of Hanson Robotics, it calls functions defined in cmd_einstein.scm. This program connects to the Professor Einstein robot through its socket API and sends it commands. The fuctions.scm file contains utility functions used by other scheme source files, such as converting ghost results (a list of WordNodes) to a single string to be spoken by the robot, and mapping values between the image dimensions and the robot's pan/tilt limits.
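
The looking behavior and the pixel-to-pan/tilt mapping described above can be sketched as plain functions. This is an illustration, not the project's code: the actual logic is written in Scheme and reads its inputs from the atomspace. The `Face` record, its expression labels, the 640x480 image size, the pan/tilt limits and the sign conventions are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Face:
    x: float                      # face centre in pixels
    y: float
    expression: str = "neutral"   # assumed labels, e.g. "smile", "surprise"


def choose_look_target(faces, salient_point, rng=random):
    """Return the (x, y) pixel position to look at, in the order given in the text."""
    if not faces:
        return salient_point                  # no faces: look at the salient point
    if len(faces) == 1:
        return (faces[0].x, faces[0].y)       # exactly one face: look at it
    smiling = [f for f in faces if f.expression == "smile"]
    if smiling:
        chosen = smiling[0]
    else:
        expressive = [f for f in faces if f.expression != "neutral"]
        # Fall back to a random face when every expression is neutral.
        chosen = expressive[0] if expressive else rng.choice(faces)
    return (chosen.x, chosen.y)


def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] to [out_min, out_max], clamping."""
    value = max(in_min, min(in_max, value))
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)


def pixel_to_pan_tilt(point, image_size=(640, 480),
                      pan_limits=(-45.0, 45.0), tilt_limits=(-20.0, 20.0)):
    """Convert a pixel position into pan/tilt angles (limits are illustrative)."""
    x, y = point
    width, height = image_size
    pan = map_range(x, 0, width, pan_limits[0], pan_limits[1])
    tilt = map_range(y, 0, height, tilt_limits[0], tilt_limits[1])
    return pan, tilt
```

For example, `pixel_to_pan_tilt(choose_look_target(faces, salient_point))` yields the angles for the selected target; with the assumed limits, the image centre maps to a pan and tilt of zero.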

Sensors

The sensors are a camera and microphones. The camera is used for face detection -> face landmarks -> facial expression and emotion, and also for hand detection -> convexity defects -> finger count and gestures. The microphone is used for STT and sound source localization. Some of the sensor programs, such as face, hand and voice activity detection, can run on the Pi without much stress on the hardware, but other functionalities like emotion and speech recognition should be implemented as services from a server, possibly from singnet. Currently, STT is implemented using pocketsphinx. It's not ideal but can be used for a very limited range of commands and simple conversation.
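
The step from convexity defects to a finger count is not spelled out here. One common heuristic, shown below as an assumption rather than as TinyCog's actual method, treats each sufficiently deep defect whose angle at the valley point is below roughly 90 degrees as the gap between two fingers. The sketch takes defects as (start, end, far, depth) tuples with points and depths in pixels, such as could be derived from the output of OpenCV's `convexityDefects`; the function names and thresholds are illustrative.

```python
import math


def angle_at(far, start, end):
    """Angle in degrees at `far` between the rays to `start` and `end`."""
    ax, ay = start[0] - far[0], start[1] - far[1]
    bx, by = end[0] - far[0], end[1] - far[1]
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0:
        return 180.0  # degenerate defect: treat as flat so it is never counted
    cos_angle = max(-1.0, min(1.0, (ax * bx + ay * by) / norm))
    return math.degrees(math.acos(cos_angle))


def count_fingers(defects, min_depth=20.0, max_angle=90.0):
    """Estimate raised fingers from (start, end, far, depth) defect tuples."""
    gaps = sum(
        1
        for start, end, far, depth in defects
        if depth >= min_depth and angle_at(far, start, end) < max_angle
    )
    # With no gaps, a fist and a single raised finger look the same to this
    # heuristic, so report 0; otherwise n gaps separate n + 1 fingers.
    return min(gaps + 1, 5) if gaps else 0
```

Shallow or wide defects, which usually come from the wrist or palm outline, are filtered out by the two thresholds.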

Act

For Hanson Robotics' Professor Einstein robot, the cmd_einstein.scm file contains the code necessary to command it. The robot should act as well as sense: it must speak and move around. Speech synthesis uses festival; the code is in act/audio. Movement was originally intended to use SPI communication with the hardware, but that has changed. However, the SPI interface is in comm/spi.

Behavior

ToDo

  • Improve STT
  • Ghost rules
  • Stories for a specific identity we need the robot to have

tinycog's People

Contributors

amaneth, dagiopia, eskender-b, geta-meko, kidist-abraham, linas, ngeiswei, noahbliss, yantrabuddhi

tinycog's Issues

State grpc as a dependency

We should state grpc as a dependency in the README, so that nobody has to discover that they need to install grpc only after hitting a build error.

Description of intent, contents

Can you please create a README file that describes:

  • the purpose and goals of this project.
  • what sort of software will be found here / is supposed to be included here,
  • what the software actually does
  • what is currently in here (vs what is planned for the future)
  • how it is used
  • any special or unusual hardware required

List of features ready to be reviewed - Demo(V1)

Going into our first major demo, we should list all the things that are done and ready for consumption, so that we can see how far we are from a first release.

So what is the status of:

  • Ghost Integration
  • Speech to Text integration
  • Basic hardware requirement specification for speech recognition to run on raspberry pi. Something akin to recommended software to capture audio from the environment.
  • Text To Speech integration
  • Basic hardware requirements for decent audio output.
  • Cross-compile toolchain stability tests. Something similar to continuous integration tools that lets us evaluate the validity of the toolchain.
  • Benchmarks for timing tests for the different visual, audio and NLP libraries incorporated in this software package.
  • Also check on button startup to run the demo.
  • Check on button deploy on an empty memory card. Do we have an image people can download and run on their setup?
  • Make our images publicly available for ease of consumption by various people across the globe.
  • List the software installed on our image. Most people may need to inquire about what packages we have installed in our toolchain, so we should list all packages that are available in the images we provide.

For now, these are the things that are on top of my mind. In time we will see if there are additional things to be added to our first demo.
