Git Product home page Git Product logo

dataflow-contact-center-speech-analysis's Introduction

Speech Analysis Framework

This repository contains the Speech Analysis Framework, a collection of components and code from Google Cloud that you can use to transcribe audio, and create a data pipeline workflow to display analytics of the transcribed audio files.

It can and:

  • Process uploaded audio files to Cloud Storage.
  • Enrich the processed audio files with Cloud Speech-to-Text and Cloud Natural Language APIs.
  • Write the enriched data to BigQuery.
  • Redact sensitive information with Cloud Data Loss Prevention.

You can:

  • Gain insights into quality metrics to track such as call silence, call duration, agent speaking time, user speaking time, and sentence heat maps.
  • Build visualizations reports
  • Examples of what the visualizations you can build:

Speech Analysis Framework Sentence Heat Map

Speech Analysis Framework Limitations:

  • The framework can only identify two callers on a stereo or mono audio file. This is a limitation within the Framework code not Cloud Speech-to-Text API.
  • The framework can only process .wav or .flac files. This is a limitation within the Framework code not Cloud Speech-to-Text API.

The process follows:

  1. An audio file is uploaded to Cloud Storage
  2. The Cloud Function is triggered on object.create
  3. The Cloud Function sends a long running job request to Cloud Speech-to-Text
  4. The Cloud Function then sends the job ID from Cloud Speech-to-Text with additional metadata to Cloud Pub/Sub
  5. The Cloud Dataflow job enriches the data, optionally redacts sensitive information and writes to BigQuery

To Learn More visit Visualize speech data with Speech Analysis Framework

How to install the Speech Analysis Framework

  1. Install the Google Cloud SDK

  2. Create a storage bucket for Dataflow Staging Files

gsutil mb gs://[BUCKET_NAME]/
  1. Through the Google Cloud Console create a folder named tmp in the newly created bucket for the DataFlow staging files

  2. Create a storage bucket for Uploaded Audio Files

gsutil mb gs://[BUCKET_NAME]/
  1. Create a BigQuery Dataset
bq mk [YOUR_BIG_QUERY_DATABASE_NAME]
  1. Create Cloud Pub/Sub Topic
gcloud pubsub topics create [YOUR_TOPIC_NAME]
  1. Enable Cloud Dataflow API
gcloud services enable dataflow
  1. Enable Cloud Speech-to-Text API
gcloud services enable speech
  1. Enable Cloud Natural Language API
gcloud services enable language.googleapis.com
  1. Enable DLP Optional
gcloud services enable dlp.googleapis.com
  1. Deploy the Google Cloud Function
  • In the cloned repo, go to the “saf-longrun-job-func” directory and deploy the following Cloud Function.
gcloud functions deploy safLongRunJobFunc --region=us-central1 --stage-bucket=[YOUR_UPLOADED_AUDIO_FILES_BUCKET_NAME] --runtime=nodejs8 --trigger-event=google.storage.object.finalize --trigger-resource=[YOUR_UPLOADED_AUDIO_FILES_BUCKET_NAME]
  1. Deploy the Cloud Dataflow Pipeline
  • python3 --version Python 3.7.8
  • In the cloned repo, go to “saf-longrun-job-dataflow” directory and deploy the Cloud Dataflow Pipeline. Run the commands below to deploy the dataflow job.
# Apple/Linux
python3 -m venv env
source env/bin/activate
pip3 install apache-beam[gcp]
pip3 install dateparser

or

# Windows
python3 -m venv env
env\Scripts\activate
pip3 install apache-beam[gcp]
pip3 install dateparser
  • The Dataflow job will create the BigQuery Table you listed in the parameters.
  • Please wait as it might take a few minutes to complete.
python3 saflongrunjobdataflow.py --project=[YOUR_PROJECT_ID] --input_topic=projects/[YOUR_PROJECT_ID]/topics/[YOUR_TOPIC_NAME] --runner=DataflowRunner --temp_location=gs://[YOUR_DATAFLOW_STAGING_BUCKET]/tmp --output_bigquery=[DATASET NAME].[TABLE] --requirements_file="requirements.txt"
  1. In the cloned repo, go to “sample-audio-files” to locate sample audio files to process by Speech Analysis Framework
  • For the [TOPIC_NAME], do not include the full path, just the name of the TOPIC
  • Choose true or false to run DLP. DLP will use all info types to scan the data.
# stereo wav audio sample
gsutil -h x-goog-meta-dlp:[true or false] -h x-goog-meta-callid:1234567 -h x-goog-meta-stereo:true -h x-goog-meta-pubsubtopicname:[TOPIC_NAME] -h x-goog-meta-year:2019 -h x-goog-meta-month:11 -h x-goog-meta-day:06 -h x-goog-meta-starttime:1116 cp [YOUR_FILE_NAME.wav] gs://[YOUR_UPLOADED_AUDIO_FILES_BUCKET_NAME]
# mono flac audio sample
gsutil -h x-goog-meta-dlp:[true or false] -h x-goog-meta-callid:1234567 -h x-goog-meta-stereo:false -h x-goog-meta-pubsubtopicname:[TOPIC_NAME] -h x-goog-meta-year:2019 -h x-goog-meta-month:11 -h x-goog-meta-day:06 -h x-goog-meta-starttime:1116 cp [YOUR_FILE_NAME.flac] gs://[YOUR_UPLOADED_AUDIO_FILES_BUCKET_NAME]
  1. After a few minutes you will be able to see the data in BigQuery.
  • Sample select statements that can be executed in the BigQuery console.
-- Order Natural Language Entities for all records
SELECT
  *
FROM (
  SELECT
    entities.name,
    entities.type,
    COUNT(entities.name) AS count
  FROM
    `[YOUR_PROJECT_ID].[YOUR_DATASET].[YOUR_TABLE]`,
    UNNEST(entities) entities
  GROUP BY
    entities.name,
    entities.type
  ORDER BY
    count DESC )
-- List word, start time, end time, speaker tag and confidence for all records
SELECT
  ARRAY(
  SELECT
    AS STRUCT word,
    startSecs,
    endSecs,
    speakertag,
    confidence
  FROM
    UNNEST(words)) transcript
FROM
  `[YOUR_PROJECT_ID].[YOUR_DATASET].[YOUR_TABLE]`
-- Search Transcript with a regular expression
SELECT
  transcript,
  fileid,
  callid,
  year,
  month,
  day,
  sentimentscore,
  magnitude,
  date,
  silencesecs
FROM
  `[YOUR_PROJECT_ID].[YOUR_DATASET].[YOUR_TABLE]`
WHERE
  (REGEXP_CONTAINS(transcript, '(?i) [YOUR_WORD]' ))

This is not an officially supported Google product

dataflow-contact-center-speech-analysis's People

Contributors

ruiscosta avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.