This is a deprecated Watson Text to Speech Service Demo. A link to the newly supported demo is below.

Home Page: https://www.ibm.com/demos/live/tts-demo/self-service/

License: Apache License 2.0


text-to-speech-nodejs's Introduction

# DEPRECATED

This demo and repo are no longer supported. You can find the newly supported Text to Speech demo at https://www.ibm.com/demos/live/tts-demo/self-service/.

🔊 Text to Speech Demo

A Node.js sample application that shows some of the IBM Watson Text to Speech service features.

Text to Speech is designed for streaming, low-latency synthesis of audio from text. It is the inverse of automatic speech recognition.

You can view a demo of this app.

Prerequisites

Sign up for an IBM Cloud account.
Download the IBM Cloud CLI.
Create an instance of the Text to Speech service and get your credentials:
- Go to the Text to Speech page in the IBM Cloud Catalog.
- Log in to your IBM Cloud account.
- Click Create.
- Click Show to view the service credentials.
- Copy the apikey value.
- Copy the url value.

Configuring the application

In the application folder, copy the .env.example file to a new file called .env:
```
cp .env.example .env
```
Open the .env file and add the service credentials that you obtained in the previous step.

Example .env file that configures the apikey and url for a Text to Speech service instance hosted in the US East region:
```
TEXT_TO_SPEECH_IAM_APIKEY=X4rbi8vwZmKpXfowaS3GAsA7vdy17Qh7km5D6EzKLHL2
TEXT_TO_SPEECH_URL=https://gateway-wdc.watsonplatform.net/text-to-speech/api
```
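
As a rough illustration of how these two settings could be checked before the app starts, here is a minimal sketch that parses `.env`-style text and reports which required keys are missing. The helper names `parseEnv` and `missingCredentials` are hypothetical and not part of the demo, which may load its configuration differently.

```javascript
// Hypothetical helpers for illustration; not part of the demo's source.
const REQUIRED_KEYS = ['TEXT_TO_SPEECH_IAM_APIKEY', 'TEXT_TO_SPEECH_URL'];

// Parse KEY=value lines, skipping blank lines and # comments.
function parseEnv(text) {
  const env = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue; // not a KEY=value line
    env[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return env;
}

// Return the names of required settings that are absent or empty.
function missingCredentials(env) {
  return REQUIRED_KEYS.filter((key) => !env[key]);
}
```

An empty result from `missingCredentials` means both values are present; it does not verify that the key or URL is valid for the service.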

Running locally

Install the dependencies
```
npm install
```
Run the application
```
npm start
```
View the application in a browser at localhost:3000

Deploying to IBM Cloud as a Cloud Foundry Application

Log in to IBM Cloud with the IBM Cloud CLI
```
ibmcloud login
```
Target a Cloud Foundry organization and space.
```
ibmcloud target --cf
```
Edit the manifest.yml file. Change the name field to something unique. For example, - name: my-app-name.
Deploy the application
```
ibmcloud app push
```
View the application online at the app URL, for example: https://my-app-name.mybluemix.net

Directory structure

.
├── app.js                      // express routes
├── config                      // express configuration
│   ├── error-handler.js
│   ├── express.js
│   └── security.js
├── manifest.yml
├── package.json
├── public                      // static resources
├── server.js                   // entry point
├── test                        // tests
└── views                       // react components

License

This sample code is licensed under Apache 2.0.

Contributing

See CONTRIBUTING.

Open Source @ IBM

Find more open source projects on the IBM GitHub page.

text-to-speech-nodejs's Issues

Prevent users from using two charsets with only one voice

Users can only use the charset associated with the voice.
For example, if they use a Spanish voice, they can only use Spanish characters. We need to add validation on the client side to prevent API calls with invalid characters.
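
A minimal sketch of such a client-side check, under one loud assumption: that the allowed set is ISO 8859-1 (Latin-1), that is, code points up to U+00FF. This is inferred from the "Please use only ISO 8859 characters" error quoted in other issues and is not confirmed by the service. The function name `findUnsupportedChars` is hypothetical, and a per-voice character set would need a stricter rule than this.

```javascript
// Assumption: "ISO 8859" means ISO 8859-1 (Latin-1), i.e. code points <= U+00FF.
// Returns the distinct characters in `text` that fall outside that range.
function findUnsupportedChars(text) {
  const unsupported = new Set();
  for (const ch of text) { // iterates by code point, not by UTF-16 unit
    if (ch.codePointAt(0) > 0xff) unsupported.add(ch);
  }
  return [...unsupported];
}
```

The client could skip the API call and show a message whenever the returned array is non-empty.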

Incorrect "please enter the text" message when using SSML

When the "Text" tab contains no text and the "SSML" tab is selected, the app incorrectly displays the error "Please enter the text you would like to synthesize in the text window." and does not submit the request.
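
One way to picture the fix is to validate the input of whichever tab is active. The sketch below assumes a hypothetical state shape (`activeTab`, `text`, `ssml`) and an invented SSML error message; neither is taken from the demo's code.

```javascript
// Hypothetical state shape: { activeTab: 'text' | 'ssml', text: string, ssml: string }.
function inputForActiveTab(state) {
  return state.activeTab === 'ssml' ? state.ssml : state.text;
}

// Returns an error message, or null when the active tab has content to submit.
function validationError(state) {
  if (inputForActiveTab(state).trim() !== '') return null;
  return state.activeTab === 'ssml'
    ? 'Please enter the SSML you would like to synthesize.' // invented wording
    : 'Please enter the text you would like to synthesize in the text window.';
}
```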

No error message when incorrect SSML input value is specified

Specifying an incorrect SSML input value and downloading the audio file seems to break it.
(I get an error message as expected when I click on 'Speak' instead).

Examples:

developer experience checklist

- Make sure it scales; it could end up on the front page of Reddit.
  - Mainly looking at page weight: shoot for 2-3 MB max and rendering in 5 seconds or less.
  - Test with dev tools throttling to 3G speeds and make sure things are still reasonable.
- Add Google Analytics.
- Blue-green deployment + Travis (see this).
- Testing + Travis (see this).
- security.js (helmet + express-rate-limitation) + ~~CSRF (see personality-insights and speech demos)~~
  - Skipping CSRF protection here because it doesn't work on GET, and we have to use GET to stream to <audio> elements. (It honestly doesn't make a lot of sense in general for unauthenticated apps, but we're employing it for the side effect of making scraping harder. But, again, it doesn't work on GET requests.)
- package.json should not specify node-engine so that Bluemix will always use the latest one.
- Google reCAPTCHA support (make sure design takes this into account when designing a demo).
  - Talk to design to add it for existing demos.
- Update Travis to send emails when there is a tag release.
- Bluemix deployment tracker and privacy notice.

First syllable is lost in created audio after it's downloaded

Issue: First syllable is lost in created audio after it's downloaded, for all en-US English voices, and some other voices.

Example text:
tomato, tomato, tomato.

Allison voice and most languages
After pressing "Download" and listening ...
Expected: tomato, tomato, tomato
Actual: mato, tomato, tomato (note: first syllable of first word is missing)

Alexandra

Service failing using a dedicated Bluemix account.

I am getting a challenge to enter ID information to pass through a Watson gateway when trying to use the demo application. In Bluemix Public, I am able to provide my user information and use the service; however, when I try to access the service using a Bluemix dedicated account, my credentials do not work. Assistance would be most appreciated.

Code 500 error

{"code":500,"error":"Block-scoped declarations (let, const, function, class) not yet supported outside strict mode"}

How can I fix this? Is it about the Node.js version?
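
This message is typical of older Node.js releases (4.x and 5.x, to the best of my knowledge) when a file uses `let`, `const`, or `class` without strict mode. The sketch below is only a heuristic built on that assumption; the function name and the version cutoff of 6 are mine, not the demo's.

```javascript
// Assumption: Node.js releases before 6.0 need strict mode for let/const/class.
// Accepts strings such as 'v4.2.1' or '12.0.0'.
function blockScopeNeedsStrictMode(nodeVersion) {
  const major = Number(nodeVersion.replace(/^v/, '').split('.')[0]);
  return major < 6;
}

// Report for the Node.js version that is running this script.
console.log(blockScopeNeedsStrictMode(process.version));
```

If it prints `true`, upgrading Node.js or adding `'use strict';` as the first statement of the affected file are the usual remedies.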

Redeploy the demo

Hi, German&James, I've updated the navigation link, service icon and favicon. Please help redeploy the demo :-)

es-US male?

http://181.135.63.86:3003/

Is it possible to have a male voice for Spanish (es-US)?

Drop down menu

We need to change the order in the drop-down box for the voices. It should be "Language (Country code): Name of voice", for example "English (US): Michael". This allows for easier sorting by language.
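
The proposed label order can be sketched as follows. The voice records and their field names (`language`, `country`, `name`) are sample data made up for illustration, not the service's response format.

```javascript
// Hypothetical voice records; the field names are assumptions for illustration.
const voices = [
  { language: 'English', country: 'US', name: 'Michael' },
  { language: 'Spanish', country: 'ES', name: 'Laura' },
  { language: 'English', country: 'GB', name: 'Kate' },
  { language: 'English', country: 'US', name: 'Allison' },
];

// Build a label in the form "Language (Country code): Name of voice".
function voiceLabel(voice) {
  return `${voice.language} (${voice.country}): ${voice.name}`;
}

// Because the language comes first, a plain string sort groups voices by language.
function sortedVoiceLabels(list) {
  return list.map(voiceLabel).sort();
}
```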

Language Translator V2 or V3?

Hello, is this demo already using the Language Translation V3 service?
Or is it still on V2?
I cannot find an entry for TTS V3 in the current Watson SDK.
Is there a way to bypass that?
Thank you

Copyright on output?

Thanks for providing!

One quick question: Is the output licensed in a certain way or is it open source as well?

TTS demo does not seem to be streaming

I'm not sure, but from the latency it looks like it is not streaming. I thought this was a recent change by Eric, but maybe not in the published version yet?

Failed with "Deploy to Bluemix"

When I click the "Deploy to Bluemix" button, these steps succeed:

- create project
- clone repository
- config pipeline

but it fails during deploy.

Play bar is not aligned with the other components

The play bar seems to be slightly misaligned on the left with the dropdown & download buttons.

Spanish is not working today

How can I test everything in Spanish, and only Spanish?

Valid characters are generating Language not supported. Please use only ISO 8859 characters

When using SSML:
<phoneme alphabet="ipa" ph="nˈaɪtɹəglɪsəɹɨn" ></phoneme> raises the error message: "Language not supported. Please use only ISO 8859 characters"

Investigate out of memory crashes

A long-running app can crash due to an out-of-memory error (with 768 MB).

Should investigate whether this is due to heavy simultaneous usage, a very large input, an ongoing memory leak, or something else.
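
One low-cost starting point for such an investigation is to log memory usage periodically and see whether it climbs steadily (suggesting a leak) or spikes with load. The sketch below uses Node's built-in `process.memoryUsage()`; the helper name and the one-minute interval are arbitrary choices, not part of the demo.

```javascript
// Convert the byte counts from process.memoryUsage() into a one-line summary in MB.
function formatMemoryUsage(usage) {
  const toMb = (bytes) => (bytes / (1024 * 1024)).toFixed(1);
  return `rss=${toMb(usage.rss)}MB heapUsed=${toMb(usage.heapUsed)}MB heapTotal=${toMb(usage.heapTotal)}MB`;
}

// Log once a minute; unref() keeps this timer from holding the process open.
const memoryTimer = setInterval(() => {
  console.log(new Date().toISOString(), formatMemoryUsage(process.memoryUsage()));
}, 60 * 1000);
memoryTimer.unref();
```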

Pause in the synthesis of "Consapevole"

There is a pause in the middle of the first word "Consapevole" when synthesizing the default Italian text.
This problem seems to only happen when using Firefox. Works ok on Chrome.
Downloading the audio works fine. I only see this problem when the audio is streamed.

Speech should start playing before it finishes downloading

The service supports streaming of audio; however, both Chrome and Firefox wait until the entire download is finished before they begin playing, creating a lot of lag. Our old demo did not have this bug (it started playing immediately, although the slider position was incorrect, since it didn't know the file size).

Audio not downloaded as "transcript.mp3" in Chrome

In Chrome, I see the option to download the audio here:

This always downloads the audio as 60aa3761-59f1-4519-baca-f029e9a0d8bc.mp3 instead of transcript.mp3 like when I download through this:

Should the GUID be removed?

"Speak" button is disabled in Safari

The "Speak" button in https://text-to-speech-demo.ng.bluemix.net/ is disabled in Safari.
It works fine in Firefox and Chrome.

"SyntaxError: Unexpected token : "on running node app.js

I'm getting an error while trying to start the node server:

"credentials": {
^
SyntaxError: Unexpected token :
at exports.runInThisContext (vm.js:73:16)
at Module._compile (module.js:443:25)
at Object.Module._extensions..js (module.js:478:10)
at Module.load (module.js:355:32)
at Function.Module._load (module.js:310:12)
at Function.Module.runMain (module.js:501:10)
at startup (node.js:129:16)
at node.js:814:3

My app.js looks like the following, and the username and password fields are filled in properly:

{
"credentials": {
"url": "https://stream.watsonplatform.net/text-to-speech/api",
"password": "passwordhere",
"username": "usernamehere"
}
}
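
The stack trace is consistent with bare JSON being loaded as JavaScript: at statement position, `{` opens a block, so `"credentials":` is not valid syntax. A minimal sketch of the difference, assuming the credentials are kept as JSON text and parsed rather than executed (the helper names are hypothetical):

```javascript
const credentialsJson = `{
  "credentials": {
    "url": "https://stream.watsonplatform.net/text-to-speech/api",
    "password": "passwordhere",
    "username": "usernamehere"
  }
}`;

// Returns true if the text compiles as a JavaScript function body.
function compilesAsScript(source) {
  try {
    new Function(source); // compiles only; the body is never run
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Parsing the same text as JSON works and yields the credentials object.
function loadCredentials(jsonText) {
  return JSON.parse(jsonText).credentials;
}
```

Here `compilesAsScript(credentialsJson)` is false while `loadCredentials(credentialsJson)` succeeds, which suggests keeping the JSON in its own file or assigning it to a variable inside app.js.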

Should have visual feedback while waiting for speech

Pressing "Speak" results in no user-visible feedback until the entire download is finished and speech begins. This can often take 10-20 seconds, resulting in a bad user experience. As soon as the Speak button is pressed there should be some visual indication that the request was sent and is waiting for the server.
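
The requested behavior can be modeled as a small state machine in which the UI shows a waiting indicator from the click until audio starts. All state and event names below are hypothetical and only illustrate the idea; the demo's components may be organized differently.

```javascript
// Hypothetical UI states and events; names are illustrative only.
const transitions = {
  idle: { SPEAK_CLICKED: 'loading' },
  loading: { AUDIO_STARTED: 'playing', REQUEST_FAILED: 'error' },
  playing: { AUDIO_ENDED: 'idle', SPEAK_CLICKED: 'loading' },
  error: { SPEAK_CLICKED: 'loading' },
};

// Move to the next state, ignoring events that do not apply to the current one.
function nextState(state, event) {
  const next = transitions[state] && transitions[state][event];
  return next || state;
}

// The waiting indicator is visible only between the click and the first audio.
function showSpinner(state) {
  return state === 'loading';
}
```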

ETA on non-supported tags?

I'm making a voice file that requires emphasis on some words and audio clips to be played, but neither of those features is supported. Do you know either A) when those features will become supported or B) what other programs I can use that do support those features?

Update this fragment so that it works

var tts_credentials = extend({
  url: 'https://stream.watsonplatform.net/text-to-speech/api',
  version: 'v1',
  username: 'xxxxxxx',
  password: 'xxxxx',
}, bluemix.getServiceCreds('text_to_speech'));

// Create the service wrappers
var textToSpeech = watson.text_to_speech(tts_credentials);

app.get('/synthesize', function(req, res) {
  var transcript = textToSpeech.synthesize(req.query);
  transcript.on('response', function(response) {
    if (req.query.download) {
      response.headers['content-disposition'] = 'attachment; filename=transcript.ogg';
    }
  });
  transcript.on('error', function(error) {
    console.log('Synthesize error: ', error);
  });
  transcript.pipe(res);
});

I need to update this fragment because I can't hear any speech.
Thanks

Text to Speech demo returns 500 error

Overview
I checked out the demo project from https://github.com/watson-developer-cloud/text-to-speech-nodejs.
I created a .env file and entered the url and apikey that I got from my IBM Cloud account.
Then I ran the app locally.
I opened it in a browser and clicked the Speak button.
I got a 500 error, with error messages like those in the screenshots.
After a few seconds the error message disappeared.
I clicked the Speak button again, and it ran correctly.
I noticed that the 500 error happens only the first time after starting the local server with "npm start".

Expected behavior
Run locally, then click the Speak button for the first time: the app runs correctly.

Actual behavior
Run locally, then click the Speak button for the first time: there is a 500 error and an error message.

How to reproduce
Check out the demo project from https://github.com/watson-developer-cloud/text-to-speech-nodejs.
Follow the README and run locally, then click the Speak button.

Screenshots
Before clicking the Speak button

After clicking the Speak button

Error message

Additional information:

OS: MacOS Mojave
Which version of Node are you using?: 12.0.0

Quotation marks not supported in text input

Attempting to use input like: "hello" (with the quotation marks) results in the error: "Language not supported. Please use only ISO 8859 characters"

Remove css_browser_selector.js from the dependencies

We need to remove css_browser_selector.js because it has a proprietary license.

Ogg Vorbis vs Ogg Opus

The docs (and my experience a while back) indicate that the service only supports Ogg containers with the Opus codec but the demo says "The audio is returned in the Ogg Vorbis format which..."

(FWIW, I'd love to see both codecs supported - some of the IoT hardware I wanted to do a demo with supports Ogg Vorbis but not Ogg Opus.)
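
For reference, a request can name the container and codec it wants explicitly. The sketch below builds a synthesize URL with Node's built-in `URL` class. The `accept` query parameter and the value `audio/ogg;codecs=opus` are as I recall them from the service documentation and should be checked there; the base URL is the README's example, and `buildSynthesizeUrl` is a hypothetical helper.

```javascript
// Hypothetical helper: builds a /v1/synthesize URL with an explicit audio format.
function buildSynthesizeUrl(baseUrl, { text, voice, accept }) {
  const url = new URL(`${baseUrl.replace(/\/$/, '')}/v1/synthesize`);
  url.searchParams.set('text', text);
  url.searchParams.set('voice', voice);
  url.searchParams.set('accept', accept); // assumed parameter name, see lead-in
  return url;
}

const exampleUrl = buildSynthesizeUrl(
  'https://gateway-wdc.watsonplatform.net/text-to-speech/api',
  { text: 'hello world', voice: 'en-US_AllisonVoice', accept: 'audio/ogg;codecs=opus' }
);
console.log(exampleUrl.toString());
```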

Speak button broken

In Firefox (40.0.3 - IBM CCK), if I visit http://text-to-speech-demo.mybluemix.net/ and click the red Speak button, it logs the following:

InvalidStateError: An attempt was made to use an object that is not, or is no longer, usable

The error appears to be coming from the audio.currentTime=0 line.

Chrome appears to work fine.

Local deployment without a Bluemix account?

Just curious whether this could be deployed as a local instance without a Bluemix account.

I'd appreciate it if somebody could point me to documentation of that kind. Thanks!

Synthesize stops when text contains invalid character(s)

TTS demo: for US English voices only, after pressing the "Speak" button the program begins to synthesize text to speech, but stops before the end when the text contains an invalid character.

Expected: For all voices, the program synthesizes all of the text to speech, whatever characters it contains.
Actual: For US English voices, the program stops synthesizing when the text contains an invalid character.

Note:
As per Radek, "This is a bug in the demopage; it works fine when communicating directly with https://stream-d.watsonplatform.net/text-to-speech/api/v1/synthesize?voice=en-US_AllisonVoice"

See details below:
When en-US_AllisonVoice, en-US_LisaVoice, or en-US_MichaelVoice is selected, and the text below is entered, and you press "Speak" the program stops speaking at the point "Stops speaking here" is stated in the text box.
And an audio file is successfully created containing only the part spoken out loud and not the entire text.
When any other voice is selected, the program works correctly and completes successfully by not stopping in the middle of the text.

Example text in text box:
Italian:
<phoneme alphabet="ipa" ph=".ʤe.ˈla.to"></phoneme> <phoneme alphabet="ibm" ph="1byan.0ko"></phoneme>

English:
<phoneme alphabet="ipa" ph="təˈmeɪ.ɾoʊ"></phoneme> <phoneme alphabet="ipa" ph="təˈmeɪ.ɾoʊ"></phoneme> <phoneme alphabet="ibm" ph=".0tx.1me.0Fo"></phoneme>

Castilian Spanish:
<phoneme alphabet="ipa" ph=".ˈθiŋ.ko"></phoneme> <phoneme alphabet="ibm" ph="1Tin.0ko"></phoneme>

Italian:
<phoneme alphabet="ipa" ph=".ʤe.ˈla.to"></phoneme> <phoneme alphabet="ibm" ph="1byan.0ko"></phoneme>

English:
<phoneme alphabet="ipa" ph="təˈmeɪ.ɾoʊ"></phoneme> <phoneme alphabet="ipa" ph="təˈmeɪ.ɾoʊ"></phoneme> <phoneme alphabet="ibm" ph=".0tx.1me.0Fo"></phoneme>

Castilian Spanish:
<phoneme alphabet="ipa" ph=".ˈθiŋ.ko"></phoneme> <phoneme alphabet="ibm" ph="1Tin.0ko"></phoneme> -- Stops speaking here

North American Spanish:
<phoneme alphabet="ipa" ph=".in.di.ˈβi.ðwo"></phoneme> <phoneme alphabet="ibm" ph=".0e.1DaD"></phoneme>

Brazilian Portuguese

French:

US English

UK English

German:

Alexandra

Deployment Tracker Service is discontinued

Please refer to the following document for details and migration instructions https://github.com/IBM-Bluemix/cf-deployment-tracker-client-node/wiki/migrating-to-the-new-metrics-tracker-client.

Support for MP3 output

Add support for MP3 output (doesn't have to be streamed). Currently, the broadest audio support in HTML is for MP3; Ogg is only supported by a few browsers. If we had MP3 support it would be easier to use this service in our product.

Add the ability to receive the Word timing metadata of a TTS stream

We currently use TTS for our product, but the quality of this is much better than anything else out there. We'd like to use this, but our product requires that we know the timings of when words (or phonemes) are said. One of the TTS products we currently use gives us this data and we use it. It'd be nice if this would export that metadata along with the audio file so we could use this service.

We currently parse this from the metadata in ID3v2 tags that are stored in Comment sections. We currently get something like this:

timed_phonemes
word start_in_ms end_in_ms amplitude
word ...
...

Not married to the format, but just to illustrate what type of information we'd need.
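
To make the illustrated format concrete, here is a minimal sketch that parses lines of the form `word start_in_ms end_in_ms amplitude` into objects. The function name, the output field names, and any sample values are hypothetical; the service does not necessarily emit this format.

```javascript
// Parse lines of the form "word start_in_ms end_in_ms amplitude".
// Lines with a different field count or non-numeric timings are skipped.
function parseTimedWords(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim().split(/\s+/))
    .filter((fields) => fields.length === 4)
    .map(([word, start, end, amplitude]) => ({
      word,
      startMs: Number(start),
      endMs: Number(end),
      amplitude: Number(amplitude),
    }))
    .filter((entry) => Number.isFinite(entry.startMs) && Number.isFinite(entry.endMs));
}
```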

TTS Demo -- "Too many requests, please try again later.", when I press "Speak" button too often

Issue: TTS Demo -- Too many requests, please try again later.
When I make about 5 updates to the text box in a minute, I get an error message: "Too many requests, please try again later."

or when I press the "Speak" button, too often (23 times in a row), I get an error message: "Too many requests, please try again later." (See attachment for a screen shot after this test).

Expected: No error message
Actual: I get an error message: "Too many requests, please try again later."

Alexandra
TTS Demo -- Too many requests, please try again later on June 28 2016.zip
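
For context, this kind of message usually comes from a request rate limiter on the server. The sketch below is a generic fixed-window limiter written for illustration; it is not the demo's implementation, and the `max` and `windowMs` values used with it are arbitrary, not the demo's actual settings.

```javascript
// Fixed-window limiter: at most `max` requests per `windowMs` for each key.
// `now` is injectable so the behavior can be exercised without waiting.
function createRateLimiter({ max, windowMs, now = Date.now }) {
  const windows = new Map(); // key -> { start, count }
  return function isAllowed(key) {
    const time = now();
    const current = windows.get(key);
    if (!current || time - current.start >= windowMs) {
      windows.set(key, { start: time, count: 1 }); // start a new window
      return true;
    }
    current.count += 1;
    return current.count <= max;
  };
}
```

With a limit of this kind, repeated presses of "Speak" inside one window are rejected once the count exceeds `max`, and requests are accepted again when the next window starts.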

Support for semi colon (;)

Based on a question in dAnswers: https://developer.ibm.com/answers/questions/186320/text-to-speech-doesnt-like-semi-colons.html

Having just tried out the text to speech demo (English) all was going well until it reached a semi colon (;) in the text.
It was a fair amount of text (several paragraphs) and I thought it might only read the first paragraph, so I deleted this character and tried again, and it did indeed read past that point and past all other candidate ending points, all the way to the end, successfully.
I can only therefore deduce that a semi-colon is not supported? Is this intentional or a bug? Are there any other characters that should be stripped out of the text?

Downloading an audio file with an error in the input text gives a "File not found" and crashes

This issue was marked as fixed & closed earlier, but I still see it - #57

Example:

Select "American English (en-US): Allison (female, expressive, transformable)"
Go to the "Expressive SSML"
Enter this text: <speak> Hello World
Click on Download

I see this (Firefox):

Let me know if you have any questions.
Thanks!

Demo throws error for valid Unicode representation of "E"

The demo page at reports error

Input contains unsupported IPA character 0X00000045

for SSML input

<speak version="1.0">
  <phoneme alphabet="ipa" ph="&#x02C8;&#x0045;s">s</phoneme>
</speak>

where ˈEs should correspond to 'Es
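
The value 0X00000045 in the error is code point U+0045, the Latin capital letter E. A small sketch that decodes the numeric character references in the `ph` attribute makes this visible; both helper names are hypothetical.

```javascript
// Decode hexadecimal (&#x..;) and decimal (&#..;) numeric character references.
function decodeNumericEntities(text) {
  return text
    .replace(/&#x([0-9a-fA-F]+);/g, (_, hex) => String.fromCodePoint(parseInt(hex, 16)))
    .replace(/&#([0-9]+);/g, (_, dec) => String.fromCodePoint(parseInt(dec, 10)));
}

// List the code points of a string in U+XXXX notation.
function codePoints(text) {
  return [...text].map(
    (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}
```

Applied to `&#x02C8;&#x0045;s`, the decoded string is `ˈEs`, with code points U+02C8, U+0045, and U+0073.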

Add GDPR Language to Stand Alone Demo Pages

Context:
GDPR will go into effect on May 25th; we need the demos to display the following language in order to make them compliant with the law:

This system is for demonstration purposes only and is not intended to process Personal Data. No Personal Data is to be entered into this system as it may not have the necessary controls in place to meet the requirements of the General Data Protection Regulation (EU) 2016/679.

For more information see: https://github.ibm.com/Watson/developer-experience/issues/4342
