wavenet-for-chrome's People

Contributors

dependabot[bot], evoludo, joakimnil, marcolivierbouch, pgmichael

wavenet-for-chrome's Issues

Shortcut downloads a 0-byte file

Using the shortcut Ctrl+Shift+E downloads an empty 0-byte file. Text is selected when the keys are pressed.
Running from the context menu works fine.

Google Chrome
Version 87.0.4280.66 (Official Build) (64-bit)

Audio - Invalid argument

I get an error:

Failed to synthesize text
Request contains an invalid argument.

for SSML like:


Audio MP3 inclusions worked fine just yesterday, but not today.

I tried a couple of hosting services and WAV format as well, with the same result.

Any ideas?

Replicating Wavenet-for-Chrome in Google TTS NodeJS

This is brilliant, so handy.

Just a query regarding bitrate and overall quality. When using "Download as MP3" the quality from Wavenet-for-Chrome far exceeds what I am getting out of using the NodeJS version of Google Text-to-Speech.

https://github.com/pgmichael/wavenet-for-chrome/blob/master/js/background.js

I take it you are using LINEAR16 to create the better quality bitrate and then writing it as an MP3?

Here is my code:

    // Imports this snippet needs
    const textToSpeech = require('@google-cloud/text-to-speech');
    const fs = require('fs');
    const util = require('util');

    // Wrapped in an async function so that `await` is valid
    async function synthesizeToFile(myMessage, langcode, voicename) {
      // Create the Google Text-to-Speech client
      const client = new textToSpeech.TextToSpeechClient();

      const request = {
        input: { text: myMessage },
        // Select the language and the voice name
        voice: { languageCode: langcode, name: voicename },
        // Select the type of audio encoding
        audioConfig: { audioEncoding: 'MP3' },
      };

      // Perform the Text-to-Speech request
      const [response] = await client.synthesizeSpeech(request);
      // Write the binary audio content to the temp directory
      const writeMP3File = util.promisify(fs.writeFile);
      await writeMP3File('/tmp/tts.mp3', response.audioContent, 'binary');
      console.log('MP3 audio content written to temp');
    }

This is what I get from this Node code:
Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s

Can you help edit this code so I get the same top quality as I get when I use your plugin? (I use NodeJS for the bigger files)
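
One thing that may be worth trying: the `audioConfig` object also accepts a `sampleRateHertz` field, and the `LINEAR16` encoding returns uncompressed 16-bit PCM with a WAV header instead of low-bitrate MP3. The sketch below only builds the request object. `buildRequest` and its option names are hypothetical helpers, and whether the extension itself uses `LINEAR16` is not confirmed anywhere in this thread.

```javascript
// Hypothetical helper (not part of the Google client library):
// builds a synthesizeSpeech request with a configurable encoding.
function buildRequest(text, languageCode, voiceName, options = {}) {
  // Assumption: default to uncompressed PCM when no encoding is given
  const { audioEncoding = 'LINEAR16', sampleRateHertz } = options;
  const audioConfig = { audioEncoding };
  // Only set the sample rate when the caller asks for one
  if (sampleRateHertz !== undefined) {
    audioConfig.sampleRateHertz = sampleRateHertz;
  }
  return {
    input: { text },
    voice: { languageCode, name: voiceName },
    audioConfig,
  };
}

const request = buildRequest('Hello there', 'en-US', 'en-US-Wavenet-D');
console.log(JSON.stringify(request.audioConfig));
```

The resulting object can be passed to `client.synthesizeSpeech(request)` in the code above. With `LINEAR16` the content would be better saved with a `.wav` extension; turning it into a higher-bitrate MP3 would then be a separate step with a tool such as ffmpeg.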

SSML not working in sandbox

I tried using the Sandbox with SSML. Unfortunately, the tags are not recognized and are always read. Whether I use <script></script> or encode it like &lt;script&gt;, it's always read out loud.

Mixed ffprobe outputs over last 2-3 weeks MP3 creation

I have downloaded MP3s created from SSML for several projects, to add to videos that we then build with ffmpeg. Has something changed over the last month? The ffprobe output for each batch of work seems to have changed (three times, by the looks of it). This has led to some of the videos not playing properly on certain devices.

Work done on 19th Jan, ffprobe output:

Input #0, wav, from 'download (65).mp3':
  Duration: 00:00:18.04, bitrate: 384 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 24000 Hz, 1 channels, s16, 384 kb/s

Work done on 23rd Jan, ffprobe output:

Input #0, wav, from 'download (0).mp3':
  Duration: 00:00:35.94, bitrate: 384 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 24000 Hz, 1 channels, s16, 384 kb/s

Work done on 31st Jan, ffprobe output:

Input #0, ogg, from 'download (0).mp3':
  Metadata:
    TLEN            : 37003
  Duration: 00:00:37.00, start: 0.000000, bitrate: 23 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, mono, fltp

Running it today, 19th Feb, now produces yet another set of file attributes!

Input #0, mp3, from 'download (1).mp3':
  Duration: 00:00:25.30, start: 0.000000, bitrate: 32 kb/s
    Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s

But BTW, this plugin is brilliant! Just a bit of an issue with using these with video creation (ffmpeg) for me.

PS - Just for additional reference, this is the output from a file I created myself in June 2019 via Google TTS in Node:

Input #0, wav, from 'download.mp3':
  Duration: 00:00:33.06, bitrate: 384 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 24000 Hz, 1 channels, s16, 384 kb/s
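
The ffprobe outputs above report wav, ogg, and mp3 containers for files that all carry an `.mp3` extension. A minimal sketch of checking the real container before feeding a file to ffmpeg is shown below. It looks only at the well-known leading bytes of each format; `sniffAudioContainer` is a hypothetical helper, not something the extension provides.

```javascript
// Hypothetical helper: guess the audio container from the first bytes of a file.
// `bytes` is a Uint8Array (a Node Buffer also works).
function sniffAudioContainer(bytes) {
  const ascii = (start, length) =>
    String.fromCharCode(...bytes.slice(start, start + length));

  // WAV: "RIFF" at offset 0 and "WAVE" at offset 8
  if (ascii(0, 4) === 'RIFF' && ascii(8, 4) === 'WAVE') return 'wav';
  // Ogg (used for Opus and Vorbis): "OggS" capture pattern
  if (ascii(0, 4) === 'OggS') return 'ogg';
  // MP3 with an ID3v2 tag in front
  if (ascii(0, 3) === 'ID3') return 'mp3';
  // Bare MPEG audio frame: starts with 11 set sync bits
  if (bytes.length >= 2 && bytes[0] === 0xff && (bytes[1] & 0xe0) === 0xe0) {
    return 'mp3';
  }
  return 'unknown';
}

// Example with a hand-built WAV header prefix
const wavHeader = Uint8Array.from('RIFF\0\0\0\0WAVEfmt ', (c) => c.charCodeAt(0));
console.log(sniffAudioContainer(wavHeader)); // wav
```

In Node, the bytes could come from `fs.readFileSync(path)`, and the result could decide whether the file needs renaming or re-encoding first.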

Failed to synthesize text - new error experienced

"Trying to acquire 1 unit; currently 1500 units in use; maximum of 1500 units allowed. Server: /bns/jt/borg/jt/bns/cloud-ml-tts-composer/prod-standard-voice-global.tts-composer/4"

I tried using smaller chunks of text, but it didn't help.

"Queue mode"

Add an option to enable "queue mode" which changes the behaviour when activating the extension. Normally, if it is already speaking, it will interrupt the previous speech with whatever new selection is sent. Instead, in this mode, it will just keep queueing requests back to back. If the user stops playback, it will jump to the next queued selection of text, and so on. This is good for sites that have a lot of crap between sections of text, so this way we could queue up all the readable text on the site without having to listen to it read out URLs and image sources.
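
The requested behaviour could be sketched roughly as follows. This is only an illustration of the idea, not the extension's code: `SpeechQueue` is a hypothetical class, and `speak` is an assumed callback that starts playback of one piece of text.

```javascript
// Hypothetical sketch of "queue mode".
class SpeechQueue {
  constructor(speak) {
    this.speak = speak;   // assumed callback: starts playback of one text
    this.pending = [];    // selections waiting their turn
    this.current = null;  // text being spoken right now, or null when idle
  }

  // In queue mode a new selection never interrupts; it is appended instead.
  enqueue(text) {
    this.pending.push(text);
    if (this.current === null) this.next();
  }

  // Call this when playback ends, or when the user stops the current item.
  next() {
    this.current = this.pending.length > 0 ? this.pending.shift() : null;
    if (this.current !== null) this.speak(this.current);
  }
}

const spoken = [];
const queue = new SpeechQueue((text) => spoken.push(text));
queue.enqueue('first');   // starts immediately
queue.enqueue('second');  // waits behind 'first'
queue.next();             // user stops 'first', so 'second' starts
console.log(spoken);
```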

SSML not working as of version 10 update

As of today, 7/26/23, SSML is no longer working.

Example:
<speak> <prosody pitch="low">Absolutely!</prosody> The power of the Horde is <emphasis>undeniable</emphasis>! For the Horde! </speak>

Adding the extension

Greetings,

Unfortunately, I cannot add Wavenet from the Chrome Store, nor manually through the extension developer mode. It gives some sort of certificate error.

Also, because I have this extension on Chromium as well, I get a different error there stating that the API key is not valid, which is wrong; after copy-pasting the key several times, it works.

This extension is very handy, so I would appreciate it if you could fix the issue.

Thank you very much.

Is it possible to read ssml?

Thanks for making this. It's been of great help, but is it possible to read SSML? Right now it reads the syntax.
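
For context, the Text-to-Speech API's synthesis input takes either a `text` field or an `ssml` field (the error message quoted in the "Increase quotas" issue on this page mentions both `input.text` and `input.ssml`). Markup sent in the `text` field would presumably be treated as plain text, which may be why the tags are read aloud. A minimal sketch of choosing the field, assuming SSML is always wrapped in `<speak>`; `buildInput` is a hypothetical helper:

```javascript
// Hypothetical helper: pick the input field based on whether the
// selection looks like an SSML document.
function buildInput(selection) {
  const trimmed = selection.trim();
  // Assumption: SSML always starts with <speak ...> and ends with </speak>
  const looksLikeSsml =
    /^<speak[\s>]/.test(trimmed) && /<\/speak>$/.test(trimmed);
  return looksLikeSsml ? { ssml: trimmed } : { text: selection };
}

console.log(buildInput('<speak>Hello <break time="200ms"/> world</speak>'));
console.log(buildInput('Hello world'));
```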

Mp3 Download?

Pg, I love your work, it has been a blessing for me and my family, but it would be perfect if it had the "download as mp3" functionality.

Anyways, cheers and thank you again.

external use

How can I use the extension from my own local scripts, inside my own tabs?

`npm run build` does not work on Ubuntu 18.04

Hi, I love this package and I use it every day! It changed my life. I wanted to modify it this morning, but I ran into the following error when I tried to run `npm run watch`.

Note

  • Ubuntu 18.04
  • node v8.10.0
  • npm v3.5.2

My Error

npm ERR! Linux 4.15.0-140-generic
npm ERR! argv "/usr/bin/node" "/usr/bin/npm" "run" "build"
npm ERR! node v8.10.0
npm ERR! npm  v3.5.2

npm ERR! Invalid version: "5"
npm ERR! 
npm ERR! If you need help, you may report this error at:
npm ERR!     <https://github.com/npm/npm/issues>

npm ERR! Please include the following file with any support request:
npm ERR!     /home/conor/Dropbox/07_liquidity/18_wavenet_for_chrome/wavenet-for-chrome/extension/npm-debug.log

oh and here is the npm-debug.log

(py383) ➜  extension git:(master) ✗ cat npm-debug.log 
0 info it worked if it ends with ok
1 verbose cli [ '/usr/bin/node', '/usr/bin/npm', 'run', 'build' ]
2 info using npm@3.5.2
3 info using node@v8.10.0
4 verbose stack Error: Invalid version: "5"
4 verbose stack     at Object.fixVersionField (/usr/share/npm/node_modules/normalize-package-data/lib/fixer.js:191:13)
4 verbose stack     at /usr/share/npm/node_modules/normalize-package-data/lib/normalize.js:32:38
4 verbose stack     at Array.forEach (<anonymous>)
4 verbose stack     at normalize (/usr/share/npm/node_modules/normalize-package-data/lib/normalize.js:31:15)
4 verbose stack     at final (/usr/share/npm/node_modules/read-package-json/read-json.js:338:5)
4 verbose stack     at then (/usr/share/npm/node_modules/read-package-json/read-json.js:113:5)
4 verbose stack     at /usr/share/npm/node_modules/read-package-json/read-json.js:300:12
4 verbose stack     at /usr/share/npm/node_modules/graceful-fs/graceful-fs.js:76:16
4 verbose stack     at FSReqWrap.readFileAfterClose [as oncomplete] (fs.js:511:3)
5 verbose cwd /home/conor/Dropbox/07_liquidity/18_wavenet_for_chrome/wavenet-for-chrome/extension
6 error Linux 4.15.0-140-generic
7 error argv "/usr/bin/node" "/usr/bin/npm" "run" "build"
8 error node v8.10.0
9 error npm  v3.5.2
10 error Invalid version: "5"
11 error If you need help, you may report this error at:
11 error     <https://github.com/npm/npm/issues>
12 verbose exit [ 1, true ]
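
The stack trace points at `fixVersionField` in `normalize-package-data`, which suggests this npm is rejecting the `version` field of a `package.json` because `"5"` is not a full `MAJOR.MINOR.PATCH` string. The check below is a simplified illustration of that rule, not npm's actual implementation; `isFullSemver` is a hypothetical helper and the regex covers only the common cases.

```javascript
// Hypothetical, simplified check for a MAJOR.MINOR.PATCH version string
// with optional pre-release and build suffixes.
function isFullSemver(version) {
  return /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/.test(version);
}

console.log(isFullSemver('5'));     // false
console.log(isFullSemver('5.0.0')); // true
```

If that is indeed the cause, changing the version to something like `5.0.0` in a local checkout might get past this error. That is an untested assumption.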

Add KB shortcut to read or read whole page

On some online book-reading sites, e.g. O'Reilly.com, when I select text and right-click to reach the read option, the page deselects the text and hides the menu option.

Could we have an option to read the main text of the body, or the ability to highlight text and then trigger reading via a keyboard shortcut?

For reference, the read aloud chrome plugin allows this (read whole page) but doesn't support the GoogleWavenet voices ;)

Increase quotas

I always get the same message, even though the sentences are quite short.


Either input.text or input.ssml is longer than the limit of 5000 bytes. This limit is different from quotas. To fix, reduce the byte length of the characters in this request, or consider using the Long Audio API: https://cloud.google.com/text-to-speech/docs/create-audio-text-long-audio-synthesis.


I read the article on Google but I don't understand what I have to do.

How can I increase that limit??

On my API I see all these quotas:

Count of requests for Neural2 voices per minute
All requests per minute
Count of requests for Long Audio Synthesis per minute
Count of requests for querying Long Audio Synthesis operations per minute
Count of requests for Studio voices per minute

What exactly is the process? Can anybody help me with this? Thanks!
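
Note that the error message itself says this limit is different from quotas, so raising the quotas listed above would presumably not help. The limit is also counted in bytes rather than characters, so accented or non-Latin text reaches it sooner. One possible workaround is to split the text into several requests. The sketch below is only an illustration: `byteLength` and `splitByByteLimit` are hypothetical helpers, and it splits on whitespace without any awareness of sentences or SSML tags.

```javascript
const encoder = new TextEncoder();

// UTF-8 size of a string; the limit applies to bytes, not characters.
function byteLength(text) {
  return encoder.encode(text).length;
}

// Hypothetical helper: pack whitespace-delimited tokens into chunks
// of at most maxBytes UTF-8 bytes each.
function splitByByteLimit(text, maxBytes = 5000) {
  const chunks = [];
  let current = '';
  let currentBytes = 0;

  const flush = () => {
    if (current) chunks.push(current);
    current = '';
    currentBytes = 0;
  };
  const append = (piece, size) => {
    if (currentBytes + size > maxBytes) flush();
    current += piece;
    currentBytes += size;
  };

  // The capture group keeps the whitespace, so chunks.join('') restores the input.
  for (const token of text.split(/(\s+)/)) {
    const size = byteLength(token);
    if (size <= maxBytes) {
      append(token, size);
    } else {
      // A single oversized token is split code point by code point.
      for (const ch of token) append(ch, byteLength(ch));
    }
  }
  flush();
  return chunks;
}

const sample = 'héllo wörld '.repeat(1000);
const chunks = splitByByteLimit(sample);
console.log(sample.length, byteLength(sample), chunks.length);
```

Each chunk would then go into its own request. For SSML input this sketch is not enough, since every chunk would need its own `<speak>` wrapper and tags must not be cut in half.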

Download not working when SSML used

Thank you for the great extension which saves a ton of time.

I am facing an issue with downloading, using both the sandbox and the right-click method.
Here is the scenario.

The content below throws an error.
Studio O is selected as the voice.

<speak>
In a lush jungle, a thankful tiger and a cheerful monkey were close friends. <break time="200ms" /> They often explored the jungle together, admiring its beauty.
</speak>

When I remove the break time, the download works fine.
So the below content works.

<speak>
In a lush jungle, a thankful tiger and a cheerful monkey were close friends. They often explored the jungle together, admiring its beauty.
</speak>

Also, there are no issues when I press "Read Aloud" with either piece of content. Only downloading is the problem.

Voices missing for Nederlands / Dutch

Your plugin is fantastic. The only thing missing is the voices for Dutch (Nederlands): Wavenet A/B/C/D etc. It would be awesome if you could add them!

Shortcuts not working for me

Hi,

Thanks for the awesome extension!

None of the shortcuts configured via chrome://extensions/shortcuts work for me. Additionally, there is no configurable shortcut to pause the audio.

I'm using Chrome browser production build on Mac OS 10.15.2.

Let me know if I can provide any additional info.

Highlight text when reading

This is an advanced feature but has the potential to really differentiate this extension from the rest.

Android users enjoy a Google Assistant feature called "read it". What is great about it, apart from the free WaveNet voices, is that it highlights the text so you can read while listening. This is a game-changer and significantly boosts understanding!

I would love to have this feature here! This extension is the freaking best, and I know you @pgmichael or others can add this feature!

Seemingly random skipping of sentences when reading over 5000 characters.

I've noticed that sometimes portions of sentences, and sometimes entire sentences, get completely skipped when reading more than 5000 characters. My guess is that it has something to do with the regex that handles splitting the text, but I can't be sure of that. Either way, it doesn't happen with every body of text. I think it happens more in bodies with odd characters sprinkled in, but I've also had the last sentence just get skipped with no explanation. The easiest way to test would be to select all the text on various sites and see if it reads everything.
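
To illustrate how a splitting regex could cause this kind of symptom, here is a sketch with a made-up pattern. It is not the extension's actual regex, which is not shown in this report; `splitLossy`, `splitLossless`, and `droppedCharacters` are hypothetical names. The first pattern only matches runs that end in `.`, `!`, or `?`, so a trailing sentence without punctuation silently disappears.

```javascript
// Hypothetical lossy splitter: every match must end in ., ! or ?
function splitLossy(text) {
  return text.match(/[^.!?]+[.!?]+/g) || [];
}

// Variant that also keeps leading punctuation and an unterminated tail.
function splitLossless(text) {
  return text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
}

// Round-trip check: how many characters did the splitter lose?
function droppedCharacters(text, chunks) {
  return text.length - chunks.join('').length;
}

const sample = 'First sentence. Second one! A trailing fragment without punctuation';
console.log(splitLossy(sample));
console.log(droppedCharacters(sample, splitLossy(sample)) > 0);      // true
console.log(droppedCharacters(sample, splitLossless(sample)) === 0); // true
```

Running a round-trip check like `droppedCharacters` over the real splitter with text from an affected page would be one way to confirm or rule out this guess.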

Firefox version?

I'm sure this is a bit of a reach, but due to Chrome's impending removal of Manifest V2, I am beginning to start transitioning to Firefox. However, this extension doesn't seem to have a similar equivalent on Firefox. I'm not sure how simple it is to port this extension to Firefox, but is that in the cards? Even if it's just this current version and never gets updated, it would help me a great deal with my transition, as most of my other extensions are available. I'm sure I'm not the only person thinking this, but I thought I'd put it out there. I use this extension liberally and it would be a huge loss for me to not be able to use it on Firefox. Thank you for your consideration.

Google Docs

I can't seem to play or download scripts through Google Docs in Chrome. I set the shortcuts to be global in Chrome as well, but no luck. Any ideas?

It's showing API key expired

My account is working well and I only registered yesterday. After facing this issue I deleted the old API key and generated a new one, but it still shows the same error.

I love this app.

I love, love this app. I struggle to read and having Wavenet as an option is fantastic. Though I think that by the time I've listened to a novel it may cost the same as buying the audio version. LOL

Anyhow, I would love to see this ported out of Chrome into a desktop app that could automatically read text copied to the clipboard. It would be even more useful if it displayed and highlighted the text in a pop-up window.

Thank you for your work!

Add options for other audio formats

Google WaveNet produces the best results when choosing WAV as the output file format. Would it be possible to add an option for selecting the output format? It would be much appreciated!

Kind regards,
Michael Smilauer

Add a word filter list

Add a user-customizable list of words or text cues to find and replace with different text before sending the query to WaveNet. This could allow us to fix bad pronunciation of commonly spoken words.
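
A rough sketch of what such a filter could look like, assuming the list is stored as `{ find, replace }` pairs. The names `applyWordFilters` and `escapeRegExp` and the storage shape are all hypothetical, not part of the extension.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: apply each find/replace pair to the text,
// matching whole words only and ignoring case.
function applyWordFilters(text, filters) {
  return filters.reduce((result, { find, replace }) => {
    const pattern = new RegExp(`\\b${escapeRegExp(find)}\\b`, 'gi');
    // A function replacement keeps "$" in the replacement text literal.
    return result.replace(pattern, () => replace);
  }, text);
}

const filters = [
  { find: 'SQL', replace: 'sequel' },
  { find: 'GIF', replace: 'jif' },
];
console.log(applyWordFilters('SQL and gif files, but not SQLite', filters));
// sequel and jif files, but not SQLite
```

The `\b` word boundaries mean that entries starting or ending with punctuation (for example `C++`) would not match as written; a real implementation would need a different boundary rule for those.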
