Unity-MS-SpeechSDK

This sample Unity project demonstrates speech recognition (speech-to-text) using the new Microsoft Speech Service (currently in Preview) over WebSockets. The Microsoft Speech Service is part of Microsoft Azure Cognitive Services. This is a work in progress.

  • Unity version: 2018.2.5f1
  • Speech Service version: 0.6.0 (Preview)
  • Target platforms tested: Unity Editor/Mono, Windows Desktop x64, Android, UWP (to be tested: iOS)

Implementation Notes

  • This sample uses the Speech Service WebSocket protocol to interact with the Speech Service and generate speech recognition hypotheses in real-time.
  • This sample is compatible with both the new Cognitive Services Speech Service (Preview) and the classic Bing Speech API. The default and recommended approach is the new service.
  • You will need an Azure Cognitive Services account to use this sample: Create an account here.
  • If you see any API keys in the code, they are either trial keys that will expire soon or temporary keys that may be invalidated, so please use your own. You can get a trial key for Bing Speech or the new Speech Service here. A free tier is available that allows 5,000 transactions per month, at a rate of 20 per minute.
  • This sample supports two methods to perform speech recognition:
      • METHOD #1: The UI Canvas button Start Recognizer from File uploads a speech audio file to perform the recognition. You can use the buttons Start Recording & Stop to first record an audio file. The sample uses the same audio file for recording and recognition upload by default.
      • METHOD #2: The UI Canvas button Start Recognition from Microphone uses the default microphone to record the user's voice and send audio packets for real-time recognition. The service automatically detects silences and issues an end of speech event to stop the microphone recording automatically.
  • The speech recognition results are posted to the UI Canvas Text label and to the Unity Debug Console window in real time, as the audio packets are received by the Speech Service.
  • The SpeechManager also includes support for client-side silence detection, which has configurable parameters. Thanks to my colleague Jared Bienz for this feature.
  • NOTE: This project contains incomplete artifacts in progress.
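
To illustrate the idea behind client-side silence detection, here is a minimal, language-neutral sketch written in Python. It is not the SpeechManager code from this project (which is C#): the class name `SilenceDetector`, the parameters `threshold` and `max_silent_frames`, and the RMS-based approach are all assumptions chosen for illustration. It assumes audio frames arrive as lists of float samples in the range [-1.0, 1.0].

```python
import math
from dataclasses import dataclass, field


@dataclass
class SilenceDetector:
    """Hypothetical detector: reports silence after a run of quiet frames."""
    threshold: float = 0.02      # assumed RMS level below which a frame is "silent"
    max_silent_frames: int = 3   # assumed number of consecutive silent frames to wait for
    _silent_run: int = field(default=0, init=False, repr=False)

    def feed(self, frame):
        """Process one frame of samples; return True once enough silence has accumulated."""
        if not frame:
            return False  # empty frames are ignored and do not change the counter
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms < self.threshold:
            self._silent_run += 1
        else:
            self._silent_run = 0  # any loud frame resets the silent run
        return self._silent_run >= self.max_silent_frames


# Usage: two consecutive quiet frames are required before silence is reported.
detector = SilenceDetector(threshold=0.02, max_silent_frames=2)
loud = [0.5, -0.5, 0.4, -0.4]
quiet = [0.001, -0.001, 0.0, 0.001]
results = [detector.feed(f) for f in (loud, quiet, loud, quiet, quiet)]
print(results)
```

In a sketch like this, raising `threshold` makes the detector more tolerant of background noise, while raising `max_silent_frames` makes it wait longer before treating a pause as the end of speech.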

Contributors

  • activenick
