ibmstreams / streamsx.sttgateway

This toolkit does Speech To Text transcription using an external provider such as the IBM Watson STT cloud service.

Home Page: https://ibmstreams.github.io/streamsx.sttgateway/

License: Apache License 2.0

C++ 83.63% Java 4.38% Shell 11.89% HTML 0.10%

stream-processing ibm-streams speech-to-text stt watson-speech-to-text ibm-cloud ibm-cloud-private toolkit

streamsx.sttgateway's Introduction

STT Gateway toolkit for IBM Streams

Purpose

This toolkit is designed to ingest audio data either stored in files (.wav, .mp3 etc. for a batch workload) or streamed through a telephony infrastructure (for a real-time workload). It then transcribes that audio into text via the IBM Watson STT (Speech To Text) service running on the IBM public cloud or on the IBM Cloud Pak for Data (CP4D i.e. private cloud).

It provides the following two operators to realize that purpose.

IBMVoiceGatewaySource is a source operator that can be used to ingest speech data from the IBM Voice Gateway product v1.0.3.0 or higher. Such speech data comes from multiple live telephone conversations happening between different pairs of speakers, e.g. customers and call center agents.

WatsonSTT is an analytic operator that can be used to transcribe speech data into text either in real-time or in batch mode.

Architectural patterns enabled by this toolkit

For real-time speech to text transcription, the following architectural patterns are possible.

Your Telephony SIPREC-->IBM Voice Gateway-->IBM Streams<-->Watson Speech To Text on IBM Public Cloud
Your Telephony SIPREC-->IBM Voice Gateway-->IBM Streams<-->Watson Speech To Text on IBM Cloud Pak for Data (CP4D)
Your Telephony SIPREC-->IBM Voice Gateway-->IBM Streams<-->Watson Speech To Text engine embedded inside an IBM Streams operator

For batch (post call) speech to text transcription, the following architectural patterns are possible.

Speech data files in a directory-->IBM Streams<-->Watson Speech To Text on IBM Public Cloud
Speech data files in a directory-->IBM Streams<-->Watson Speech To Text on IBM Cloud Pak for Data (CP4D)
Speech data files in a directory-->IBM Streams<-->Watson Speech To Text engine embedded inside an IBM Streams operator

All-in-one Speech to text analytics, Call Recording and Call Replay

As described above, Speech To Text is the core feature of this toolkit. In addition, this toolkit enables call recording and call replay. It includes two real-world tested examples that show how to do live voice call recording and call replay from pre-recorded calls. Many other vendors provide proprietary, rigid black-box solutions for call recording at a hefty price tag with either a non-existent or a minimal call replay facility. This toolkit provides those two features for free, in a completely open and flexible manner, for users to benefit from them. That lets customers control where the recorded data gets stored in a standard Mu-Law format, and access and use that data for their other purposes. Combined, the IBM Voice Gateway, IBM Streams and IBM Watson Speech To Text offerings put the customer in the driver's seat to gather real-time intelligence from their voice infrastructure.

A visual description of this toolkit's architecture

Documentation

The official toolkit documentation with extensive details is available at this URL: https://ibmstreams.github.io/streamsx.sttgateway/
A file named sttgateway-tech-brief.txt, available in this toolkit's top-level directory, also provides a good amount of information about what this toolkit does, how it can be built and how it can be used in IBM Streams applications.
The official documentation for the IBM Voice Gateway product is available here
The official documentation for the IBM Watson Speech To Text service is available here

Requirements

There are certain important requirements that need to be satisfied in order to use the IBM Streams STT Gateway toolkit in Streams applications. Such requirements are explained below.

Note: This toolkit is not supported on Red Hat Enterprise Linux Workstation release 6.x

Note: This toolkit requires C++11 support.

Network connectivity to the IBM Watson Speech To Text (STT) service running either on the public cloud or on the Cloud Pak for Data (CP4D) is needed from the IBM Streams Linux machines where this toolkit will be used. The same is true to integrate with the IBM Voice Gateway product for the use cases involving speech data ingestion for live voice calls.
This toolkit uses WebSocket to communicate with the IBM Voice Gateway and the Watson STT service. A valid IAM access token is needed to use the Watson STT service on the public cloud, and a valid access token is needed to use the Watson STT service on CP4D. So, users of this toolkit must provide their public cloud STT service instance's API key or the CP4D STT service instance's access token when launching the Streams application(s) that depend on this toolkit. When the API key from the public cloud is used, a utility SPL composite named IAMAccessTokenGenerator, available in this toolkit, generates the IAM access token and subsequently refreshes it to keep it valid. A Streams application employing this toolkit can use that utility composite to generate the IAM access token needed in the public cloud. Please read more about the IAM access token here.
On the IBM Streams application development machine(s) (where the application code is compiled to create the application bundle), it is necessary to download and install the toolkit release bundle. The toolkit release bundle contains the necessary ant build script to download the required external libraries: boost, websocketpp and rapidjson. For the essential steps to meet this requirement, please refer to the above-mentioned documentation URL or the file named sttgateway-tech-brief.txt available in this toolkit's top-level directory.
On the IBM Streams application development machine(s) the following toolkits are required:

com.ibm.streamsx.inet version 2.3.6 or higher
com.ibm.streamsx.json version 1.4.6 or higher
com.ibm.streamsx.websocket version 1.0.6 or higher

On the IBM Streams application machines, please ensure that openssl and libcurl are installed, including openssl-devel and libcurl-devel. These are required by the streamsx.websocket and streamsx.inet toolkits that this toolkit depends on. They are also required by this toolkit to generate and refresh the IAM access token, which is a must for the STT service on public cloud, as well as for TLS support.
For the IBM Streams and the IBM Voice Gateway products to work together, certain configuration steps must be done in both products. For more details on that, please refer to this toolkit's documentation URL or the sttgateway-tech-brief.txt available in this toolkit's top-level directory.

External libraries used

boost 1.73.0
websocketpp 0.8.2
rapidjson 1.1.0

Example usage of this toolkit inside a Streams application

Here is a code snippet that shows how to invoke the WatsonSTT operator available in this toolkit with a subset of supported features:

use com.ibm.streamsx.sttgateway.watson::*;

/*
Invoke one or more instances of the WatsonSTT operator.
You can send the audio data to this operator all at once or 
you can send the audio data for the live-use case as it becomes
available from your telephony infrastructure.
Avoid feeding audio data coming from more than one data source into this 
parallel region which may cause erroneous transcription results.

NOTE: The WatsonSTT operator allows fusing multiple instances of
this operator into a single PE. This will help in reducing the 
total number of CPU cores used in running the application.
It is better to fuse only when there are up to a maximum of 
ten WatsonSTT operator instances. Anything more than that, it is 
better not to fuse them in order for the application logic to
work correctly.
*/
@parallel(width = $numberOfSTTEngines, 
partitionBy=[{port=ABC, attributes=[conversationId]}], broadcast=[AT])
(stream<STTResult_t> STTResult) as STT = WatsonSTT(AudioBlobContent as ABC; IamAccessToken, AccessTokenForCP4D as AT) {
   param
      uri: $sttUri;
      baseLanguageModel: $sttBaseLanguageModel;
			
   output
      STTResult: conversationId = conversationId, 
                 utteranceNumber = getUtteranceNumber(),
                 utteranceText = getUtteranceText(),
                 utteranceStartTime = getUtteranceStartTime(),
                 utteranceEndTime = getUtteranceEndTime(),
                 finalizedUtterance = isFinalizedUtterance(),
                 transcriptionCompleted = isTranscriptionCompleted(),
                 sttErrorMessage = getSTTErrorMessage();
}

A built-in example inside this toolkit can be compiled and launched with the default STT options to use the STT service on public cloud as shown below. The sample AudioFileWatsonSTT requires that the STT service connection details are provided as application configuration properties. To create the application configuration, you can use the following command.

streamtool mkappconfig --description 'connection configuration for IBM Cloud Watson stt service' \
	--property 'apiKey=<your api key>' \
	--property 'iamTokenURL=https://iam.cloud.ibm.com/identity/token' \
	--property 'url=<your stt instance uri>' \
	sttConnection

cd   streamsx.sttgateway/samples/AudioFileWatsonSTT
make
st  submitjob  -d  <YOUR_STREAMS_DOMAIN>  -i  <YOUR_STREAMS_INSTANCE>  output/com.ibm.streamsx.sttgateway.sample.watsonstt.AudioFileWatsonSTT.sab

The following IBM Streams job submission command shows how to override the default values with your own as needed for the various STT options:

cd   streamsx.sttgateway/samples/AudioRawWatsonSTT
make
st submitjob  -d  <YOUR_STREAMS_DOMAIN>  -i  <YOUR_STREAMS_INSTANCE> \
	output/com.ibm.streamsx.sttgateway.sample.watsonstt.AudioRawWatsonSTT.sab \
	-P sttApiKey=<YOUR_WATSON_STT_SERVICE_API_KEY> \
	-P sttBaseLanguageModel=en-US_NarrowbandModel \
	-P contentType="audio/wav" \
	-P filterProfanity=true \
	-P keywordsSpottingThreshold=0.294 \
	-P keywordsToBeSpotted="['country', 'learning', 'IBM', 'model']" \
	-P smartFormattingNeeded=true \
	-P wordAlternativesThreshold=0.251 \
	-P maxUtteranceAlternatives=5 \
	-P audioBlobFragmentSize=32768 \
	-P sttLiveMetricsUpdateNeeded=true \
	-P audioDir=<YOUR_AUDIO_FILES_DIRECTORY> \
	-P numberOfSTTEngines=50

The following is another way to run the same application to access the STT service on the IBM Cloud Pak for Data (CP4D). The STT URI shown below is for illustrative purposes only; you must use a valid STT URI obtained from your CP4D cluster.

st  submitjob  -d  <YOUR_STREAMS_DOMAIN>  -i  <YOUR_STREAMS_INSTANCE> \
	output/com.ibm.streamsx.sttgateway.sample.watsonstt.AudioFileWatsonSTT.sab \
	-P sttOnCP4DAccessToken=<YOUR_CP4D_STT_SERVICE_ACCESS_TOKEN> \
	-P sttUri=wss://b0610b07:31843/speech-to-text/ibm-wc/instances/1567608964/api/v1/recognize

If you are planning to ingest the speech data from live voice calls, then you can invoke the IBMVoiceGatewaySource operator as shown below.

(stream<BinarySpeech_t> BinarySpeechData as BSD) as VoiceGatewayInferface = 
 IBMVoiceGatewaySource() {
    logic
       state: {
          // Initialize the default TLS certificate file name if the 
          // user didn't provide his or her own.
          rstring _certificateFileName = 
             ($certificateFileName != "") ?
              $certificateFileName : getThisToolkitDir() + "/etc/ws-server.pem";
       }
				
       param
          tlsPort: $tlsPort;
          certificateFileName: _certificateFileName;
          initDelay: $initDelayBeforeSendingDataToSttEngines;
			
       // Get these values via custom output functions provided by this operator.
       output
          BSD: vgwSessionId = getIBMVoiceGatewaySessionId(),
          callStartDateTime = getCallStartDateTime(), 
          isCustomerSpeechData = isCustomerSpeechData(),
          vgwVoiceChannelNumber = getVoiceChannelNumber(),
          callerPhoneNumber = getCallerPhoneNumber(),
          agentPhoneNumber = getAgentPhoneNumber(),
          speechDataFragmentCnt = getTupleCnt(),
          totalSpeechDataBytesReceived = getTotalSpeechDataBytesReceived();
}

In addition to the code snippet shown above to invoke the IBMVoiceGatewaySource operator, one must add logic to allocate a dedicated WatsonSTT operator instance for each voice channel in a given call. A demo application available in this toolkit has that logic, which can be reused in any other application. That particular example can be compiled and launched as shown below to ingest speech data from the IBM Voice Gateway for seven concurrent voice calls and send it to the WatsonSTT operator running with most of the default STT options to use the STT service on public cloud.

cd   streamsx.sttgateway/samples/VoiceGatewayToStreamsToWatsonSTT
make
st  submitjob  -d  <YOUR_STREAMS_DOMAIN>  -i  <YOUR_STREAMS_INSTANCE> \
	output/com.ibm.streamsx.sttgateway.sample.watsonstt.VoiceGatewayToStreamsToWatsonSTT.sab \
	-P tlsPort=9443 \
	-P numberOfSTTEngines=14 \
	-P sttApiKey=<YOUR_WATSON_STT_SERVICE_API_KEY> \
	-P contentType="audio/mulaw;rate=8000"
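The allocation logic mentioned above (one dedicated WatsonSTT instance per voice channel of a call) is written in SPL in the demo application. As a language-neutral illustration only, here is a minimal Python sketch of the idea. The class and method names are hypothetical and not part of this toolkit, and it assumes two voice channels per call (customer and agent), which is why seven concurrent calls are paired with numberOfSTTEngines=14.

```python
class SttEngineAllocator:
    """Hands out one free engine index per (call session, voice channel)."""

    def __init__(self, number_of_engines):
        # Engines are identified by index 0..N-1; all start out free.
        self._free = list(range(number_of_engines))
        self._assigned = {}

    def engine_for(self, session_id, channel):
        """Return the engine dedicated to this channel, allocating on first use."""
        key = (session_id, channel)
        if key not in self._assigned:
            if not self._free:
                raise RuntimeError("no free STT engine for " + repr(key))
            self._assigned[key] = self._free.pop(0)
        return self._assigned[key]

    def release_call(self, session_id):
        """Return every engine used by a finished call to the free pool."""
        for key in [k for k in self._assigned if k[0] == session_id]:
            self._free.append(self._assigned.pop(key))


# Seven concurrent calls with two channels each use all 14 engines.
allocator = SttEngineAllocator(number_of_engines=14)
for call in range(7):
    allocator.engine_for("call-%d" % call, 1)
    allocator.engine_for("call-%d" % call, 2)
```

Releasing the engines when a call ends is what lets an eighth call be served later without raising numberOfSTTEngines.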

Special note: For those customers who are using the speech to text engine embedded in the com.ibm.streams.speech2text.watson::WatsonS2T operator, the following example is available as a reference application to exploit that operator in a real-time voice call analytics scenario. It can be compiled and executed as shown below. You have to replace the hardcoded paths and IP addresses to suit your environment.

cd   streamsx.sttgateway/samples/VoiceGatewayToStreamsToWatsonS2T
make
st submitjob -P tlsPort=9443 \
	-P vgwSessionLoggingNeeded=false \
	-P numberOfS2TEngines=80 \
	-P WatsonS2TConfigFile=/home/streamsadmin/toolkit.speech2text-v2.12.0/model/en_US.8kHz.general.diarization.low_latency.pset \
	-P WatsonS2TModelFile=$HOME/toolkit.speech2text-v2.12.0/model/en_US.8kHz.general.pkg \
	-P ipv6Available=false \
	-P writeTranscriptionResultsToFiles=true \
	-P sendTranscriptionResultsToHttpEndpoint=true \
	-P httpEndpointForSendingTranscriptionResults=http://172.30.105.11:9080 \
	-P callRecordingWriteDirectory=/homes/hny5/sen/call-recording-write \
	-P callRecordingReadDirectory=/homes/hny5/sen/call-recording-read \
	-P numberOfCallReplayEngines=15 \
	-C fusionScheme=legacy \
	output/com.ibm.streamsx.sttgateway.sample.watsons2t.VoiceGatewayToStreamsToWatsonS2T.sab

Examples that showcase this toolkit's features

There are many examples available in this toolkit that can be compiled and tested. A couple of them are generic real-world solutions running in production that can be customized and used when needed.

If you have no need for the call recording and call replay features, you can use the two examples below that end with the word Mini. They cut out the extra logic, resulting in fewer operators overall.

What's new

see: CHANGELOG.md

streamsx.sttgateway's Issues

Potential Resource Problem

The WatsonSTT operator queues incoming tuples independently of the operational state of the STT service. Thus, when the STT service is not operational, this queue may grow without bound.
I propose that the operator discards the oldest input tuples when a certain limit is reached.
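The drop-oldest policy proposed here can be sketched as follows. This is a Python illustration of the idea, not the operator's C++ code; the class name and the `dropped` counter are hypothetical.

```python
from collections import deque


class DropOldestQueue:
    """FIFO queue that discards the oldest entry once max_size is reached."""

    def __init__(self, max_size):
        self._items = deque(maxlen=max_size)
        self.dropped = 0  # how many entries were discarded so far

    def put(self, item):
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # a full deque evicts its leftmost item on append
        self._items.append(item)

    def get(self):
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


# Five fragments arrive while nothing is consumed; only the newest three stay.
queue = DropOldestQueue(max_size=3)
for fragment in ["a", "b", "c", "d", "e"]:
    queue.put(fragment)
```

Counting the discarded entries makes the data loss visible, for example as an operator metric, instead of silent.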

Re-Connect handling of WatsonSTT operator

When the number of unsuccessful connection attempts reaches the limit, the operator does not make any further attempts to connect to the STT service. This does not seem very useful.
For instance, if there is a temporary network issue and the limit is reached, the operator will not reconnect.
I would propose that the operator:

Still makes re-connection attempts when the max. unsuccessful connection attempts limit is reached.
Slows down the frequency of re-connection attempts when the max. unsuccessful connection attempts limit is reached (e.g. 1 attempt per minute).
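A minimal sketch of that retry policy, in Python for illustration only. The function name and the 5 second normal delay are assumptions; the 60 second slow delay corresponds to the "1 attempt per minute" suggestion.

```python
def reconnect_delay_seconds(failed_attempts, max_attempts,
                            normal_delay=5.0, slow_delay=60.0):
    """Delay before the next connection attempt.

    Below the limit the retries happen at the normal pace. At or past the
    limit the retries continue, but only once per slow_delay seconds.
    """
    if failed_attempts < max_attempts:
        return normal_delay
    return slow_delay


# With a limit of 3, the fourth and later retries are slowed down.
delays = [reconnect_delay_seconds(n, max_attempts=3) for n in range(5)]
```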

Remove Compiler Warning in WatsonSTT Operator

With the new boost library we see new warnings:

note: #pragma message: The practice of declaring the Bind placeholders (_1, _2, ...) in the global namespace is deprecated. Please use <boost/bind/bind.hpp> + using namespace boost::placeholders, or define BOOST_BIND_GLOBAL_PLACEHOLDERS to retain the current behavior.

Avoid boost cmake files in lib directory

Since boost version 1.70, a directory 'cmake' is generated in the lib directory. It should not be there, because it is not a library that is required at runtime. I don't know how to get rid of this directory.

Resource control in Watson STT Operator

When the STT service is not available or slow, or if a wrong access token is specified, do the input queues of the WatsonSTT operator grow without a limit?

WatsonSTT throws during shutdown

During shutdown coredumps are seen:
JVMDUMP039I Processing dump event "abort", detail "" at 2019/12/03 17:25:35 - please wait.
JVMDUMP032I JVM requested System dump using '/home/joergboe/.streams/var/Streams.sab_DLZ-StreamsDomain-StreamsInstance/c1d1a4db-aa0d-44c9-861c-3d66219bd9c9/currentWorkingDir/1/core.20191203.172535.4360.0001.dmp' in response to an event
JVMDUMP010I System dump written to /home/joergboe/.streams/var/Streams.sab_DLZ-StreamsDomain-StreamsInstance/c1d1a4db-aa0d-44c9-861c-3d66219bd9c9/currentWorkingDir/1/core.20191203.172535.4360.0001.dmp
JVMDUMP032I JVM requested Java dump using '/home/joergboe/.streams/var/Streams.sab_DLZ-StreamsDomain-StreamsInstance/c1d1a4db-aa0d-44c9-861c-3d66219bd9c9/currentWorkingDir/1/javacore.20191203.172535.4360.0002.txt' in response to an event
JVMDUMP010I Java dump written to /home/joergboe/.streams/var/Streams.sab_DLZ-StreamsDomain-StreamsInstance/c1d1a4db-aa0d-44c9-861c-3d66219bd9c9/currentWorkingDir/1/javacore.20191203.172535.4360.0002.txt
JVMDUMP032I JVM requested Snap dump using '/home/joergboe/.streams/var/Streams.sab_DLZ-StreamsDomain-StreamsInstance/c1d1a4db-aa0d-44c9-861c-3d66219bd9c9/currentWorkingDir/1/Snap.20191203.172535.4360.0003.trc' in response to an event
JVMDUMP010I Snap dump written to /home/joergboe/.streams/var/Streams.sab_DLZ-StreamsDomain-StreamsInstance/c1d1a4db-aa0d-44c9-861c-3d66219bd9c9/currentWorkingDir/1/Snap.20191203.172535.4360.0003.trc
JVMDUMP007I JVM Requesting JIT dump using '/home/joergboe/.streams/var/Streams.sab_DLZ-StreamsDomain-StreamsInstance/c1d1a4db-aa0d-44c9-861c-3d66219bd9c9/currentWorkingDir/1/jitdump.20191203.172535.4360.0004.dmp'
JVMDUMP010I JIT dump written to /home/joergboe/.streams/var/Streams.sab_DLZ-StreamsDomain-StreamsInstance/c1d1a4db-aa0d-44c9-861c-3d66219bd9c9/currentWorkingDir/1/jitdump.20191203.172535.4360.0004.dmp

The prepareToShutdown method calls wsClient->close unchecked but wsClient may be null.

Some of the Parameters of Operator WatsonSTT have no implementation

In Operator WatsonSTT the parameters contentType, baseModelVersion and customizationId have no implementation in the operator code. This should be corrected.

Remove Compiler Warning in include/boost/bind.hpp

With boost 1.73 we see a compiler warning in the include file boost/bind.hpp.

The underlying problem will be resolved in the next boost release, see:
boostorg/property_tree#51

As a workaround we use #define BOOST_BIND_GLOBAL_PLACEHOLDERS to suppress this message.

Samples main composite should be public

To provide a common spldoc for the samples, the main composite should be public. (Otherwise the spl-make-doc produces inconsistent output)

Reduce the number of parameters in WatsonSTT operator

The WatsonSTT operator has a lot of parameters and functions.
I think some parameters are redundant:

wordTimestampNeeded: This parameter depends on functions getUtteranceWords(), getUtteranceWordsStartTimes() and getUtteranceWordsEndTimes(). If one of these functions is requested, the parameter must be true. Otherwise it must be false.
wordConfidenceNeeded: This parameter depends on getUtteranceWordsConfidences()
identifySpeakers: This parameter depends on getUtteranceWordsSpeakers() and getUtteranceWordsSpeakersConfidences()
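The dependency described in this list could be resolved automatically instead of through parameters. The following Python sketch only illustrates that derivation; the mapping is taken from the list above, while the function name `derive_options` is hypothetical.

```python
# Which output functions imply which option (taken from the list above).
IMPLIED_BY = {
    "wordTimestampNeeded": {
        "getUtteranceWords",
        "getUtteranceWordsStartTimes",
        "getUtteranceWordsEndTimes",
    },
    "wordConfidenceNeeded": {"getUtteranceWordsConfidences"},
    "identifySpeakers": {
        "getUtteranceWordsSpeakers",
        "getUtteranceWordsSpeakersConfidences",
    },
}


def derive_options(used_output_functions):
    """Switch an option on exactly when one of its output functions is used."""
    used = set(used_output_functions)
    return {option: bool(functions & used)
            for option, functions in IMPLIED_BY.items()}


options = derive_options(["getUtteranceText", "getUtteranceWordsStartTimes"])
```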

Improve description of operator WatsonSTT parameter sttResultMode

The description of this parameter should explicitly mention that result mode 3 disables

maxUtteranceAlternatives,
wordAlternativesThreshold,
wordConfidenceNeeded,
wordTimestampNeeded,
identifySpeakers,
keywordsSpottingThreshold

Default Branch?

I think the default branch should be the master-branch.
All Pull-Requests should finally go into the master branch and not into the develop branch.

Add build.xml file

We must add a build.xml file with targets to enable automated production of the toolkit.
The build.xml file should also contain targets for automatic download and build of the required libraries and targets to produce a release archive.

The Eclipse CDT compiler settings are sporadically removed

The file :
com.ibm.streamsx.sttgateway/.settings/language.settings.xml

stores settings which are important for the Studio internal c++ indexer.
This file can not be stored, because it changes with different Studio versions.

When a release file is made (target 'ant release') this file is deleted. This should not happen. Reason:
In build file build.xml, the targets 'warn-untracked' and 'check-untracked' must use identical conditions, otherwise some ignored files may be deleted unintentionally.

Library dependencies

All toolkit artifacts should define the library dependencies in the xml model file.
The compilation of the samples should not require additional include directives for gcc except the C++11 directive.

Operator WatsonSTT: Output should use default outputs

The current implementation requires that the user writes up to 20 output functions.

The operator should assign these output functions automatically to output stream attributes when the output stream has an attribute with a certain name and a fitting type, e.g.:

If the output stream has the attribute utteranceText of type rstring, the output function getUtteranceText() should be assigned to this attribute.
The COF assignments of the operator invocation take precedence over the defaults.
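The proposed rule (match by attribute name and type, with explicit assignments taking precedence) can be sketched like this. It is a Python illustration only: the `DEFAULTS` table lists just two of the operator's functions, and the int32 type for utteranceNumber is an assumption.

```python
# Default attribute -> (expected type, output function). Only two entries are
# shown; the type given for utteranceNumber is an assumption.
DEFAULTS = {
    "utteranceText": ("rstring", "getUtteranceText"),
    "utteranceNumber": ("int32", "getUtteranceNumber"),
}


def resolve_assignments(output_schema, explicit_assignments):
    """output_schema maps attribute name -> type; explicit assignments win."""
    resolved = {}
    for name, attr_type in output_schema.items():
        if name in explicit_assignments:
            resolved[name] = explicit_assignments[name]
        elif name in DEFAULTS and DEFAULTS[name][0] == attr_type:
            resolved[name] = DEFAULTS[name][1] + "()"
    return resolved


# utteranceNumber has a non-fitting type here, so it gets no default.
schema = {
    "conversationId": "rstring",
    "utteranceText": "rstring",
    "utteranceNumber": "rstring",
}
assignments = resolve_assignments(schema, {"conversationId": "conversationId"})
```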

Clarify interaction of Custom Output Functions and Parameter STT Result Mode

In mode sttResultMode=1

getFullTranscriptionText is always empty
getConfidence is 0 except for the finalizedUtterance=true
getUtteranceStartTime and getUtteranceEndTime are correct
utteranceAlternatives is valid only for finalizedUtterance=true
utteranceWordsConfidences is available only when finalizedUtterance=true
utteranceWordsStartTimes and utteranceWordsEndTimes are correct
utteranceWordsSpeakers and utteranceWordsSpeakersConfidences are available for finalizedUtterance=true

In mode sttResultMode=2

getFullTranscriptionText is always empty
getConfidence is correct
utteranceAlternatives is ok
utteranceWordsStartTimes values appear multiple times in the list
utteranceWordsEndTimes are ok
utteranceWordsSpeakers and utteranceWordsSpeakersConfidences are ok

In mode sttResultMode=3

utteranceText is always empty
finalizedUtterance is always false
confidence is the value of the last utterance and not from the whole transcription
transcriptionCompleted is redundant
utteranceStartTime=0,utteranceEndTime=0
all other values are empty

Must fix

Avoid dependency to external commands

The IAMAccessTokenGenerator uses the curl command line tool to get/refresh the IAM access token. This imposes a dependency on an external command, which is unusual in most of the toolkits.
The HTTPRequest operator of the inet toolkit has the same functionality and is delivered with the Streams product.
Thus the usage of the HTTPRequest operator results in well-formed code and avoids external dependencies.
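Whichever client issues it, the token request is a plain HTTP POST, so no external command is needed in principle. The sketch below builds that request with Python's standard library purely for illustration; it is not the toolkit's SPL code. The URL is the iamTokenURL from the application configuration shown in the README, the form fields follow IBM Cloud's API key grant as understood here, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_iam_token_request(api_key, url=IAM_TOKEN_URL):
    """Build (but do not send) the POST request that trades an API key for a token."""
    body = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def parse_iam_token_response(raw_json):
    """Extract the token and its lifetime in seconds from the JSON response."""
    payload = json.loads(raw_json)
    return payload["access_token"], int(payload["expires_in"])


request = build_iam_token_request("<your api key>")
# Sending it would look like: urllib.request.urlopen(request).read()
```

The lifetime returned with the token is what a refresh loop would use to schedule the next request before the current token expires.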

Transcription Fails Sporadically

Sometimes the transcription of a file fails. The analysis shows a 'glare case': the STT service shuts down the connection after 30 seconds of silence following a successful transcription, and the WatsonSTT operator initializes a new conversation at just that moment. In this case the transcription is lost.

The STT service is initialized with the parameter "inactivity_timeout": -1, but this setting has no effect.

Trace:
Here the previous transcription seems to end:
13 Mar 2020 09:14:42.260+0100 [18664] WARN #splapptrc,J[0],P[0],STTResultStream[0],ws_receiver M[WatsonSTTImplReceiver.hpp:on_message:816] - Operator STTResultStream[0]-->Channel 0-->RE39 ignore speaker info because no oTuple with utterances is available. payload_:

Here the new file tuple comes in, immediately followed by a punctuation and the next file tuple:
13 Mar 2020 09:15:15.696+0100 [18659] INFO #splapptrc,J[0],P[0],STTResultStream[0],ws_sender M[WatsonSTTImpl.hpp:process_0:402] - Operator STTResultStream[0]-->Channel 0-->PR0 Start a new conversation number 2

The next file tuple comes in and is blocked because conversation 2 is not yet completed:
13 Mar 2020 09:15:15.717+0100 [18659] INFO #splapptrc,J[0],P[0],STTResultStream[0],ws_sender M[WatsonSTTImpl.hpp:process_0:402] - Operator STTResultStream[0]-->Channel 0-->PR0 Start a new conversation number 3

Conversation 2 fails:
13 Mar 2020 09:15:17.259+0100 [18664] ERROR #splapptrc,J[0],P[0],STTResultStream[0],ws_receiver M[WatsonSTTImplReceiver.hpp:on_message:607] - Operator STTResultStream[0]-->Channel 0-->RE25 STT error message=Session timed out.
13 Mar 2020 09:15:17.259+0100 [18664] WARN #splapptrc,J[0],P[0],STTResultStream[0],ws_receiver M[WatsonSTTImplReceiver.hpp:on_message:620] - Operator STTResultStream[0]-->Channel 0-->RE27 append error attribute and send error tuple
13 Mar 2020 09:15:17.348+0100 [18664] ERROR #splapptrc,J[0],P[0],STTResultStream[0],ws_receiver M[WatsonSTTImplReceiver.hpp:on_close:847] - Operator STTResultStream[0]-->Channel 0-->RE82 Fully established Websocket connection closed with the Watson STT service. wsState=error
13 Mar 2020 09:15:17.348+0100 [18664] INFO #splapptrc,J[0],P[0],STTResultStream[0],ws_receiver M[WatsonSTTImplReceiver.hpp:ws_init:362] - Operator STTResultStream[0]-->Channel 0-->RE10 (after run)

Conversation 3 is started with a connection attempt from closed state
13 Mar 2020 09:15:17.741+0100 [18659] INFO #splapptrc,J[0],P[0],STTResultStream[0],ws_sender M[WatsonSTTImpl.hpp:connect:598] - Operator STTResultStream[0]-->Channel 0-->CS5 Make a connection attempt number 1 from wsState=closed

Library update

The toolkit should update the libraries it uses:

boost 1.69.0 -> 1.73.0
websocketpp 0.8.1 -> 0.8.2

Clarify supported platforms

The documentation should clearly state the supported platforms of the toolkit.
In particular, the toolkit is not supported on RHEL 6 because its standard compiler lacks C++11 support.

Output Speaker Label Updates

Currently the operator outputs a speaker label list that fits the utterance word list of the current utterance. However, the STT service also delivers Speaker Label Updates, and there is currently no output function available to output them.

Test updates

Some tests use a non-existent function:

setError

use the correct function:

setFailure

Use a longer test case timeout.

change name of the default branch

as per guidelines the "main" branch shall no longer have the name "master"

set develop as default branch
delete master

Implement Globalization

Operator WatsonSTT make error logs more verbose in case of connection failure

In case of a connection error the operator should include the http response line and code in the error log

Kind of metrics in WatsonSTT operator

I wonder about the kind of metrics in WatsonSTT operator:

nFullAudioConversationsReceived: this metric starts with 0 and grows continuously, so it should be of type counter
nFullAudioConversationsTranscribed: this metric starts with 0 and grows continuously, so it should be of type counter
nSTTResultMode: this metric starts from the initial value and is never meant to change, so I think its nature is gauge

Warning during generation of spldoc-samples

The command
ant spldoc-samples
emits a lot of warnings when generating the doc for the samples VoiceGatewayToStreamsToWatsonSTT and VoiceGatewayToStreamsToWatsonS2T.
This was not the case in the previous release!

Add toolkit standard theme to gh-pages

We want to use a common theme for the toolkits, so this theme has to be added to the gh-pages here, too.

Correct tuple output logic

There should be no output tuples if non final utterances are received and parameter nonFinalUtterancesNeeded is false.

Use Streams Operator threads in Operator WatsonSTT

The operator should use the operator threads provided by Streams.
These threads are under the control of the Streams runtime, and using them removes the dependency on the boost thread library.

Use useful default parameters for Watson STT

The Watson STT operator should have useful default parameters for:

maxUtteranceAlternatives
wordAlternativesThreshold
keywordsSpottingThreshold

so that if the user has activated the appropriate output function, the function is active without the need for further configuration.

Tests are broken

INFO: **** START Case case='SamplesCompile' variant='VoiceGatewayToStreamsToWatsonSTT' in workdir /home/tktest01/workspace/toolkit.streamsx.sttgateway/tests/onprem/workdir/20200722-121809/StreamsxSttgateway/SamplesCompile/VoiceGatewayToStreamsToWatsonSTT pid 4047 START ********************
12:18:10 316902606 INFO: Run category pattern set match found: quick == quick
[exec] 12:18:10 322114571 INFO: 0 Case Preparation steps executed
[exec] 12:18:10 326127124 INFO: Execute Case Test Step: testStep
[exec] /home/tktest01/streams/4.3.1.0/samples/com.ibm.streamsx.sttgateway/VoiceGatewayToStreamsToWatsonSTT
[exec] 12:18:10 329963623 INFO: testStep -> echoExecuteInterceptAndSuccess: make all
[exec] build use env settings
[exec] /home/tktest01/streams/4.3.1.0/bin/sc -a --c++std=c++11 -x '-I /home/tktest01/streams/4.3.1.0/toolkits/com.ibm.streamsx.sttgateway/impl/include' -M com.ibm.streamsx.sttgateway.sample.watsonstt::VoiceGatewayToStreamsToWatsonSTT -t /home/tktest01/streams/4.3.1.0/toolkits/com.ibm.streamsx.sttgateway:/home/tktest01/streams/4.3.1.0/toolkits/com.ibm.streamsx.json:/home/tktest01/streams/4.3.1.0/toolkits/com.ibm.streamsx.inet:/home/tktest01/com.ibm.streamsx.websocket --data-dir data --output-dir output -C -j 1
[exec] rm -rf output
[exec] /home/tktest01/streams/4.3.1.0/bin/sc -a --c++std=c++11 -x '-I /home/tktest01/streams/4.3.1.0/toolkits/com.ibm.streamsx.sttgateway/impl/include' -M com.ibm.streamsx.sttgateway.sample.watsonstt::VoiceGatewayToStreamsToWatsonSTT -t /home/tktest01/streams/4.3.1.0/toolkits/com.ibm.streamsx.sttgateway:/home/tktest01/streams/4.3.1.0/toolkits/com.ibm.streamsx.json:/home/tktest01/streams/4.3.1.0/toolkits/com.ibm.streamsx.inet:/home/tktest01/com.ibm.streamsx.websocket --data-dir data --output-dir output -j 1
[exec] CDISP0727W WARNING: The /home/tktest01/com.ibm.streamsx.websocket input path is not a directory.
[exec] CDISP0385E ERROR: The VoiceGatewayToStreamsToWatsonSTT toolkit requires version [1.0.6,7.0.0] of the com.ibm.streamsx.websocket toolkit, but version [1.0.6,7.0.0] of the com.ibm.streamsx.websocket toolkit is not available.
[exec] CDISP0131E ERROR: Errors occurred while the toolkits were loading.
[exec] make: *** [distributed] Error 1
[exec] 12:18:12 647560780 ERROR: setFailure : 2 : returned from make
[exec] 12:18:12 653203210 INFO: 2 : returned from make
[exec] 12:18:12 666166526 INFO: 1 Case Test Step steps executed
[exec] 12:18:12 767555813 INFO: 0 Case Finalization steps executed
[exec] 12:18:12 809570984 INFO: kill 4427 :
[exec] 12:18:12 821576684 ERROR: **** FAILURE : 2 : returned from make ****
[exec] 12:18:12 824016442 INFO: **** END Case case='SamplesCompile' variant='VoiceGatewayToStreamsToWatsonSTT' FAILURE ********************
[exec] 12:18:12 828276693 INFO: **** Elapsed time : 00:00:02 state=finalization *****
[exec] 12:18:12 830960290 INFO: caseExitFunction
[exec] Result: FAILURE

Operator WatsonSTT should use common practice to detect the end of a Stream

The common practice in Streams is to detect the end of a window through a window punctuation marker. In the case of the WatsonSTT operator, the end of the window is the end of the conversation. The operator should therefore rely on a window punctuation marker and not on empty blobs.
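The difference can be illustrated with a small Python sketch in which a sentinel object stands in for the window punctuation. The names `WINDOW_MARKER` and `split_conversations` are hypothetical; in SPL the marker arrives as a punctuation, not as a data item.

```python
WINDOW_MARKER = object()  # stands in for an SPL window punctuation


def split_conversations(stream):
    """Group audio blobs into conversations, closing each one at a window marker."""
    fragments = []
    for item in stream:
        if item is WINDOW_MARKER:
            if fragments:
                yield b"".join(fragments)
                fragments = []
        else:
            fragments.append(item)  # an empty blob is ordinary data here


# The empty blob in the middle no longer ends the first conversation.
incoming = [b"ab", b"", b"cd", WINDOW_MARKER, b"ef", WINDOW_MARKER]
conversations = list(split_conversations(incoming))
```

In this sketch, data that arrives after the last marker is held back until the next marker, which mirrors how a conversation stays open until its punctuation is received.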

WatsonSTT operator needs a functional logging

The STT operator logs a lot of events as errors that are not errors, e.g.:

Operator STTResult[0]-->Channel 0-->Received new/refreshed access token
Channel 0-->Websocket connection attempt 1 to the Watson STT service.

We should use error logs only for critical events and not for normal events.

WatsonSTT operator multiple utteranceWordsStartTimes

In sttResultMode 2 the stt operator repeats the utteranceWordsStartTimes multiple times in the utteranceWordsStartTimes output list.

Samples should present results at console

Samples should present results at console and not as an error-log. This confuses the user to find the result of the sample between all the unnecessary error logs. see also #11