customdpp's Introduction

Custom Display-Post-Processing Private Preview

Overview

The display format of Automatic Speech Recognition output is critical to downstream tasks, and one size doesn't fit all. Custom Display-Post-Processing (aka "Custom DPP") lets users define their own lexical-to-display format rules to improve speech recognition quality on top of the Microsoft Azure Custom Speech service.

Features and capabilities

Supported features

  • Add rewrite rules to replace certain phrases in the output with others, for capitalization, error correction, etc.
  • Add profanity rules to mask or remove certain words from the output.
  • Add advanced inverse-text-normalization (ITN) rules to format the output with certain display patterns.

Supported languages and locales

✅ en-US ✅ en-GB ✅ en-CA ✅ en-IN ✅ de-DE ✅ it-IT ✅ es-ES ✅ es-MX ✅ fr-CA ✅ fr-FR ✅ zh-CN ✅ ja-JP ✅ ko-KR ✅ nb-NO ✅ nl-NL (support for more locales is coming in later versions).

Learn more about Speech service supported languages and locales

Supported service regions

✅ West US ✅ East US ✅ Central US ✅ West Central US ✅ North Europe ✅ Central India

Learn more about Speech service supported regions

Prerequisites

  • Set up your Azure account and Speech service account, and find your Speech service key and region (Learn more from Try the Speech service for free).
  • Set up your Custom Speech project and get the custom model ID. Navigate to Speech Studio -> Custom Speech -> your custom speech project -> "Train custom models" tab, open one of your custom speech models, and copy the Model ID value from the card at the top (Learn more from Custom Speech overview).

Get started with Custom DPP CLI

The Custom DPP command-line utility presents easy-to-use commands for display format customization.

  • See the Get Started document to learn how to download the CLI tool and use it to upload, evaluate, and deploy your custom display format models.
  • See the Concept document for the basic concepts of Microsoft display post-processing.

Supported operations

The general format of Custom DPP CLI commands is: cdpp [command] [arguments] --[flag-name] [flag-value]

Available Commands:

  • init - Create and initialize a project, scaffolding it with template files
  • push - Push the custom model files or test set to the Microsoft Speech server
  • eval - Evaluate the latest model pushed to the Microsoft Speech server
  • deploy - Deploy a custom DPP model and bind it to a custom speech model
  • get - Get detailed information about a model, test set, evaluation, or deployment from the Microsoft Speech server
  • check - Check the format and grammar of the custom rule datasets (rewrite, profanity, and ITN)
  • config - Configure the subscription key or service region of a project, or get the current project settings
  • delete - Delete a model, test set, evaluation, or deployment from the Microsoft Speech server
  • fetch - Fetch model files or a test set from the Microsoft Speech server
  • list - List summary information about the subscription's models, test sets, evaluations, or deployments from the Microsoft Speech server
  • version - Show version information and the supported regions and locales
  • help - Help about any command
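
The general command format above can be illustrated with a small Python helper that assembles an argument list. This is only a sketch: `build_cdpp_command` is a hypothetical helper, not part of the CLI, and the README does not document individual flags, so none are assumed here. The only concrete invocation shown, `cdpp init` followed by a project name, follows the usage seen in the issue report on this page; `my_project` is a placeholder.

```python
import shlex

# Subcommands copied from the "Available Commands" list above.
KNOWN_COMMANDS = {
    "init", "push", "eval", "deploy", "get", "check",
    "config", "delete", "fetch", "list", "version", "help",
}


def build_cdpp_command(command, arguments=(), flags=None):
    """Return an argv list shaped like:
    cdpp [command] [arguments] --[flag-name] [flag-value]
    """
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unknown cdpp command: {command!r}")
    argv = ["cdpp", command, *arguments]
    # Flags are passed through as given; their names are not validated
    # because the README does not list them.
    for name, value in (flags or {}).items():
        argv += [f"--{name}", str(value)]
    return argv


# "my_project" is a placeholder project name.
print(shlex.join(build_cdpp_command("init", ["my_project"])))
```

The resulting list can be handed to `subprocess.run` once the real arguments and flags have been confirmed with `cdpp [command] -h`.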

Find help from your command prompt

For convenience, consider adding the Custom DPP CLI location to your system path. That way you can type cdpp from any directory on your system.

To see a list of commands, type cdpp -h and then press the ENTER key.

To learn about a specific command, include the name of the command (for example: cdpp push -h).

If you choose not to add Custom DPP CLI to your path, you'll have to change directories to the location of your cdpp executable and type cdpp or .\cdpp in Windows PowerShell command prompts.
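
When calling the tool from a script, a quick check of whether cdpp is reachable can save some confusion. The following is a minimal sketch using Python's standard `shutil.which`; `find_cdpp` and its `install_dir` parameter are hypothetical names introduced for illustration, and the executable name `cdpp` is taken from the commands above.

```python
import shutil


def find_cdpp(install_dir=None, name="cdpp"):
    """Return the full path of the executable, or None if it is not found.

    The system path is searched first. If that fails and install_dir is
    given, that directory is searched instead.
    """
    found = shutil.which(name)
    if found:
        return found
    if install_dir is not None:
        # shutil.which accepts an explicit search path via `path=`.
        return shutil.which(name, path=str(install_dir))
    return None


location = find_cdpp()
print(location or "cdpp is not on the system path; change to its directory or pass install_dir")
```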

Frequently asked questions

How do I obtain the default base DPP behavior as a baseline?

You can push a model with empty model files, edit and push your test cases, start an evaluation, and then download the evaluation logs to obtain the default base DPP outputs as a starting baseline. Refer to this how-to doc for more best practices.

Is the unified speech-to-text service supported?

Not in the private preview version. You must specify a custom speech model ID (from the Custom Speech service) to enable Custom DPP builders. The unified speech-to-text service without model customization can only benefit from Microsoft's built-in DPP base builders.

Are there any size limits for the Custom DPP rules?

Yes. In the private preview version, the maximum size is 1,000 lines of rules for the Rewrite and Profanity files, and 200 lines of rules for the ITN file.
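
The limits above can be checked locally before pushing. This is a minimal sketch under two assumptions that the FAQ does not confirm: the 1,000-line limit applies to the Rewrite file and the Profanity file separately, and only non-blank lines count as rules. The function names are hypothetical and unrelated to the cdpp check command.

```python
# Limits quoted in the FAQ answer above (private preview).
# Assumption: the 1,000-line limit applies to each file separately.
MAX_RULE_LINES = {"rewrite": 1000, "profanity": 1000, "itn": 200}


def count_rule_lines(text):
    """Count non-blank lines. Assumption: blank lines are not rules."""
    return sum(1 for line in text.splitlines() if line.strip())


def within_limit(kind, text):
    """Return True if a rule file of the given kind fits the preview limit."""
    return count_rule_lines(text) <= MAX_RULE_LINES[kind]


# Example with placeholder rule lines.
sample = "\n".join(f"rule {i}" for i in range(250))
print(within_limit("rewrite", sample), within_limit("itn", sample))
```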

How long does a Custom DPP model take to become effective on the Speech service after deployment?

After you deploy a Custom DPP model to the online Speech service (by running cdpp deploy), it may take 1-3 hours to take effect on your custom model and endpoint. After that, you can validate the end-to-end process from audio to display output text.

Also note that some service regions are still being rolled out during the Private Preview period, so please try again later if you encounter any deployment issues.

Is Bring-Your-Own-Storage (BYOS) supported?

Not in the private preview version. We are working to support this in later versions. Learn more about Speech service BYOS at Speech service encryption of data at rest.

customdpp's People

Contributors

archeraz, chrido-msft, liuwei-git, microsoftopensource, nkibre

customdpp's Issues

Error: 'invalid character 'T' looking for beginning of value' after input subscription key

Dear Team,

I'm trying to test Custom DPP for one of our STT models, but the step where I enter the subscription key to create a new DPP model is not working:

C:\Users\KarenSuarezPulido\Documents\Trabajo\SALT\Microsoft\CustomDPP>cdpp init test_de_kitt
input locale name of project test_de_kitt, type ? to list all support locales
locale name: de-de
input service region name of project test_de_kitt, type ? to list all available regions
service region: westeurope
input your Azure cognitive service subscription key at the region westeurope
subscription key: c0xxx0x000000x000x00xx00x000000x

**invalid character 'T' looking for beginning of value**
  1. There isn't a 'T' in our key.
  2. I'm 100% sure that the key is the right one.
  3. I tried putting the key in ' ' and " ", but then I get this: subscription key should be a 32 bytes string, try again

I would appreciate any help!
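
One thing that may be worth ruling out before retrying: westeurope does not appear in the supported service regions listed in the README. The following pre-flight check is only a sketch and does not reproduce what the CLI does internally. The region identifiers are assumed to be the lowercase, no-space forms of the listed regions, the 32-character key length comes from the CLI message quoted above, and `preflight` is a hypothetical helper.

```python
# Assumption: identifiers for the regions in "Supported service regions"
# (West US, East US, Central US, West Central US, North Europe, Central India).
SUPPORTED_REGIONS = {
    "westus", "eastus", "centralus",
    "westcentralus", "northeurope", "centralindia",
}


def preflight(region, key):
    """Return a list of problems found in the region and key inputs."""
    problems = []
    if region.strip().lower() not in SUPPORTED_REGIONS:
        problems.append(f"region {region!r} is not in the supported list")
    # Quotes are not stripped on purpose: a quoted key is 34 characters
    # long, which matches the "32 bytes string" message in the report.
    if len(key.strip()) != 32:
        problems.append("subscription key should be 32 characters long")
    return problems


# Placeholder key of the right length; the region is the one from the report.
print(preflight("westeurope", "0" * 32))
```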
