gexgd0419 / naturalvoicesapiadapter

Make Azure natural TTS voices accessible to any SAPI 5-compatible application.

License: MIT License

C++ 81.35% C 18.65% CMake 0.01%

microsoft-tts sapi5 text-to-speech tts tts-engine microsoft-cognitive-services speech-synthesis natural-voice

naturalvoicesapiadapter's Introduction

NaturalVoiceSAPIAdapter

Click here for the Chinese documentation.

An SAPI 5 text-to-speech (TTS) engine that can utilize the natural/neural voices provided by the Azure AI Speech Service, including:

- Installable natural voices for Narrator on Windows 11
- Online natural voices from Microsoft Edge's Read Aloud feature
- Online natural voices from the Azure AI Speech Service, if you have a proper subscription key

Any program that supports SAPI 5 voices can use those natural voices via this TTS engine.

See the wiki pages for some more technical information.

System Requirements

Minimum tested platform: Windows XP SP3, and Windows XP Professional x64 Edition SP2.

Minimum platform that supports local Narrator voices: Windows 7 RTM, x86 32/64-bit.

Minimum platform that supports installing Narrator voices via Microsoft Store: Windows 10, build 17763.

How can I install Narrator natural voices on Windows 11?

Go to System Settings > Accessibility > Narrator, scroll down to Narrator's voice, then click the Add button for Add natural voices.

If your system isn't new enough to have this option, see the methods below.

I'm using Windows XP/Vista/7/8/10. Can I use the Narrator natural voices from Windows 11?

Windows XP/Vista: Unfortunately local Narrator voices are not supported on those platforms. But online voices, including Edge and Azure voices, still work.

Windows 10 (build 17763 or above): You can choose and install Windows 11 Narrator voices using these Microsoft Store links.

Windows 7/8/10 (before build 17763), or if you can't use the Microsoft Store:

1. Copy the Microsoft Store link of a Windows 11 Narrator voice from here.
2. Use store.rg-adguard.net to get a link to download the MSIX file of the voice.
3. Prepare a parent folder to hold the voice folders. Make sure its path contains only ASCII characters.
4. Unzip the MSIX file (as if it were a ZIP file) into a subfolder of that parent folder. You can keep multiple voice subfolders in the same parent folder. Make sure each subfolder's name contains only ASCII characters.
5. Set the parent folder as "Local voice path" in the installer.

Do not put anything other than voice subfolders inside this parent folder, or voice loading may fail.
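The manual steps above can also be scripted. The following is a minimal Python sketch, assuming (as the steps describe) that the downloaded MSIX file can be opened as an ordinary ZIP archive. The function name and the folder names in the usage note are illustrative and are not part of this project.

```python
import zipfile
from pathlib import Path


def extract_voice_package(msix_path, parent_dir, subfolder_name):
    """Unzip one voice MSIX into parent_dir/subfolder_name and return that path."""
    target = Path(parent_dir).resolve() / subfolder_name
    # The steps above ask for ASCII-only paths, so refuse anything else up front.
    if not str(target).isascii():
        raise ValueError(f"path contains non-ASCII characters: {target}")
    target.mkdir(parents=True, exist_ok=True)
    # Treat the MSIX file as a ZIP archive, exactly as the manual step does.
    with zipfile.ZipFile(msix_path) as package:
        package.extractall(target)
    return target
```

A call such as `extract_voice_package("voice.msix", r"C:\NarratorVoices", "voice1")` (hypothetical names) would create one voice subfolder; `C:\NarratorVoices` would then be the folder to set as "Local voice path".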

Windows 10's Narrator doesn't support natural voices directly, but it does support SAPI 5 voices. So you can make Windows 11 Narrator voices work on Windows 10 via this engine.
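The platform rules in this section can be summarized as a small lookup. The sketch below is only an illustration: it assumes the usual Windows NT version numbers (XP is 5.1 or 5.2, Vista 6.0, Windows 7 6.1, Windows 8 6.2, Windows 10 and 11 both report 10.0, with Windows 11 starting at build 22000). The function name and the returned strings are made up for this example.

```python
def narrator_voice_install_method(major, minor, build):
    """Map a Windows NT version to the install route described in this section."""
    if (major, minor) < (6, 1):
        # Windows XP (5.1 / 5.2) and Vista (6.0): local Narrator voices
        # are not supported, only the online voices work.
        return "online voices only"
    if major < 10 or build < 17763:
        # Windows 7 / 8 / 8.1, or Windows 10 before build 17763.
        return "manual MSIX extraction"
    if build < 22000:
        # Windows 10, build 17763 or above.
        return "Microsoft Store links"
    # Windows 11 reports version 10.0 with build 22000 or later.
    return "Settings > Accessibility > Narrator, or Microsoft Store links"
```

On Windows, the three arguments correspond to the `major`, `minor`, and `build` fields of Python's `sys.getwindowsversion()`.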

Will it work on future versions of Windows?

This engine uses some encryption keys extracted from system files to use the voices, so it's more like a hack than a proper solution.

As of now, Microsoft hasn't allowed third-party apps to use the Narrator/Edge voices, so this engine can stop working at any time, for example, after a system update.

Installation

1. Download the zip file from the Releases section.
2. Extract the files into a folder. Do not move, rename, or delete the files after installation. If you want to move or delete the files, uninstall first.
3. Run Installer.exe.
4. The "Installation Status" section tells you whether the 32-bit version and the 64-bit version have been installed.
   - The 32-bit version works with 32-bit programs, and the 64-bit version works with 64-bit programs.
   - On 64-bit systems, to make this work with every program (32-bit and 64-bit), you need to install both of them.
   - On 32-bit systems, the "64-bit" row will not be shown.
5. Click Install/Uninstall. Administrator permission is required.
6. Choose what kinds of voices you want to use. By default, local Narrator voices (if supported) and Microsoft Edge Read Aloud online voices are enabled.
   - Online voices require Internet access, and they can be slower and less stable. If you only want to use the local Narrator voices, you can uncheck "Enable Microsoft Edge online voices" and "Enable Azure online voices".
   - As there are many online voices, by default, only those in your preferred languages and in English (US) are included, to avoid cluttering the voice selection list. Click "Change..." to change what languages are included.
   - Azure voices require a subscription key (API key) and its region. Click "Set Azure key" to enter your key. You can visit Azure Portal, go to your speech service resource, then go to Resource Management > Keys and Endpoint to copy & paste the key and the region.
7. Close the Installer window to apply the changes. You can open the Installer again when you want to change something; changing the settings doesn't require reinstallation or administrator permission.

Or, you can use regsvr32 to register the DLL files manually.
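As a rough illustration of the manual route, the sketch below builds the two regsvr32 command lines needed on a 64-bit system. On 64-bit Windows the 64-bit regsvr32.exe lives in System32 and the 32-bit one in SysWOW64, so each DLL has to be registered with the matching tool. The DLL file name and the assumption that the DLLs sit in the x86 and x64 folders are placeholders; check the extracted files for the real names.

```python
import ntpath


def regsvr32_commands(install_dir, dll_name, system_root=r"C:\Windows", os_is_64bit=True):
    """Return regsvr32 command lines (as argument lists) for the adapter DLLs."""
    if not os_is_64bit:
        # A 32-bit system has only one regsvr32 and only the 32-bit DLL applies.
        return [[ntpath.join(system_root, "System32", "regsvr32.exe"),
                 ntpath.join(install_dir, "x86", dll_name)]]
    return [
        # 32-bit regsvr32 registers the 32-bit DLL for 32-bit programs.
        [ntpath.join(system_root, "SysWOW64", "regsvr32.exe"),
         ntpath.join(install_dir, "x86", dll_name)],
        # 64-bit regsvr32 registers the 64-bit DLL for 64-bit programs.
        [ntpath.join(system_root, "System32", "regsvr32.exe"),
         ntpath.join(install_dir, "x64", dll_name)],
    ]
```

Each returned list can be run from an elevated prompt, for example with `subprocess.run(cmd, check=True)`. Appending `/u` to a command unregisters the DLL instead.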

For advanced users, here's a list of this program's configurable registry values.
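If the Azure voices do not appear after entering a key, it can help to check the key and region pair outside the installer. The sketch below prepares a request to the Azure Speech REST endpoint that lists voices for a region. It is independent of this engine and only a sanity check under the assumption that your resource uses the standard regional endpoint; the function name is illustrative.

```python
import urllib.request


def voices_list_request(region, key):
    """Build (but do not send) a request for the Azure TTS voice list."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/voices/list"
    # The subscription key from "Keys and Endpoint" goes into this header.
    return urllib.request.Request(url, headers={"Ocp-Apim-Subscription-Key": key})
```

Sending it with `urllib.request.urlopen(voices_list_request(region, key))` needs Internet access and a valid key. An HTTP error at this point suggests that the key or region is wrong.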

Testing

You can use the TtsApplication.exe in folders x86 and x64 to test the engine.

It's a modified version of the TtsApplication in Windows-classic-samples, with an added Chinese translation and more detailed information for phoneme/viseme events.

Or, you can go to Control Panel > Speech (Windows XP), or Control Panel > Speech Recognition > Text to Speech (Windows Vista and later).

Libraries used

Microsoft.CognitiveServices.Speech.Extension.Embedded.TTS
websocketpp
ASIO (standalone version)
OpenSSL
nlohmann/json
YY-Thunks (for Windows XP compatibility)
spdlog

naturalvoicesapiadapter's Issues

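Can the response speed of the offline Narrator natural voices be improved?

Hello developer, I'm using the NVDA screen reader. When reading with the natural voices, it feels slightly unresponsive: there is a short delay between pressing a key and hearing the sound. This delay hurts the most in situations such as typing. Could this be optimized to improve the response speed further?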

Using the online models with pyttsx3 prints a bunch of logs, how to disable them?

So I just installed the online voices, and when I tried to use them with pyttsx3, it printed these logs.

How can I disable these logs?

[2024-08-06 16:07:18] [connect] Successful connection
[2024-08-06 16:07:19] [connect] WebSocket Connection [2620:1ec:c11::237]:443 v-2 "WebSocket++/0.8.2" /consumer/speech/synthesize/readaloud/edge/v1?TrustedClientToken=6A5AA1D4EAFF4E9FB37E23D68491D6F4 101
[2024-08-06 16:07:19] [frame_header] Dispatching write containing 2 message(s) containing 16 header bytes and 722 payload bytes
[2024-08-06 16:07:19] [frame_header] Header Bytes:
[0] (8) 81 FE 01 80 A8 44 1D 7F
[1] (8) 81 FE 01 52 BF 28 9A F6

[2024-08-06 16:07:19] [frame_payload] Payload Bytes:
[0] (384) [1] (masked binary payload, not printable as text; omitted)
[1] (338) [1] (masked binary payload, not printable as text; omitted)

[2024-08-06 16:07:19] [frame_header] Dispatching write containing 1 message(s) containing 6 header bytes and 2 payload bytes
[2024-08-06 16:07:19] [frame_header] Header Bytes:
[0] (6) 88 82 91 3E 00 37

[2024-08-06 16:07:19] [frame_payload] Payload Bytes:
[0] (2) [8] 92 D6

[2024-08-06 16:07:19] [control] Control frame received with opcode 8
[2024-08-06 16:07:19] [error] handle_read_frame error: asio.system:10054 (An existing connection was forcibly closed by the remote host.)
[2024-08-06 16:07:19] [info] asio async_shutdown error: asio.system:10054 (An existing connection was forcibly closed by the remote host.)
[2024-08-06 16:07:19] [disconnect] Disconnect close local:[1006,An existing connection was forcibly closed by the remote host.] remote:[1000]
[2024-08-06 16:07:34] [connect] Successful connection
[2024-08-06 16:07:35] [connect] WebSocket Connection [2620:1ec:c11::237]:443 v-2 "WebSocket++/0.8.2" /consumer/speech/synthesize/readaloud/edge/v1?TrustedClientToken=6A5AA1D4EAFF4E9FB37E23D68491D6F4 101
[2024-08-06 16:07:35] [frame_header] Dispatching write containing 2 message(s) containing 16 header bytes and 701 payload bytes
[2024-08-06 16:07:35] [frame_header] Header Bytes: 
[0] (8) 81 FE 01 80 46 FA D1 7D
[1] (8) 81 FE 01 3D D6 17 E4 F2

[2024-08-06 16:07:35] [frame_payload] Payload Bytes:
[0] (384) [1] (masked binary payload, not printable as text; omitted)
[1] (317) [1] (masked binary payload, not printable as text; omitted)

[2024-08-06 16:07:35] [frame_header] Dispatching write containing 1 message(s) containing 6 header bytes and 2 payload bytes
[2024-08-06 16:07:35] [frame_header] Header Bytes:
[0] (6) 88 82 78 59 43 06

[2024-08-06 16:07:35] [frame_payload] Payload Bytes:
[0] (2) [8] 7B B1

[2024-08-06 16:07:35] [control] Control frame received with opcode 8
[2024-08-06 16:07:35] [error] handle_read_frame error: asio.system:10054 (An existing connection was forcibly closed by the remote host.)
[2024-08-06 16:07:35] [info] asio async_shutdown error: asio.system:10054 (An existing connection was forcibly closed by the remote host.)
[2024-08-06 16:07:35] [disconnect] Disconnect close local:[1006,An existing connection was forcibly closed by the remote host.] remote:[1000]
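
The header and close-frame bytes in the log above can be decoded by hand. As a hedged illustration, the sketch below parses a WebSocket frame header as laid out in RFC 6455 and unmasks a close-frame payload, assuming the logged payload bytes are the masked bytes as sent on the wire. The function names are illustrative and are not part of websocketpp or this project.

```python
def parse_frame_header(header):
    """Decode a WebSocket frame header (RFC 6455, section 5.2)."""
    fin = bool(header[0] & 0x80)       # final-fragment flag
    opcode = header[0] & 0x0F          # 1 = text frame, 8 = close frame
    masked = bool(header[1] & 0x80)    # client-to-server frames are masked
    length = header[1] & 0x7F
    offset = 2
    if length == 126:                  # a 16-bit extended length follows
        length = int.from_bytes(header[2:4], "big")
        offset = 4
    elif length == 127:                # a 64-bit extended length follows
        length = int.from_bytes(header[2:10], "big")
        offset = 10
    mask_key = bytes(header[offset:offset + 4]) if masked else b""
    return {"fin": fin, "opcode": opcode, "length": length, "mask_key": mask_key}


def unmask(payload, mask_key):
    """XOR every payload byte with the 4-byte masking key, repeated cyclically."""
    return bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))


def close_code(header, payload):
    """Return the status code carried in a close frame."""
    info = parse_frame_header(header)
    if info["opcode"] != 8:
        raise ValueError("not a close frame")
    data = unmask(payload, info["mask_key"]) if info["mask_key"] else bytes(payload)
    return int.from_bytes(data[:2], "big")
```

With the logged bytes, `parse_frame_header(bytes.fromhex("81 FE 01 80 A8 44 1D 7F"))` reports a final text frame of length 384, which matches the `(384)` shown for the first payload. `close_code(bytes.fromhex("88 82 91 3E 00 37"), bytes.fromhex("92 D6"))` yields 1000, the normal-closure status code. Under the stated assumption, the client closed the connection normally after each request, and the 10054 errors were logged after that close frame was sent.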

Problem with adjusting what languages of the Edge voices are shown

Greetings,
In the installer, when the box for including the Microsoft Edge voices has been checked, there are two options. You can either show all supported languages or the current display language and American English. It seems that for some applications, there are too many voices or something when I choose to display all supported languages, because not all the voices show up in JAWS for example. Is there no way to customize exactly what languages are available? I checked the registry and searched for some sort of config file, but the only things I found were those two options. I assume that the installer automatically detects your display language and then adds English US, and everything else is basically hard coded. Is there no way of changing this? There are certain languages where I really need a speech synth, but I don't need all of them and the ones I need don't show up in JAWS.

With Microsoft Edge online voices, NVDA's continuous reading stops after the first sentence

Greetings, and thanks for your awesome efforts.
The problem is that NVDA can't use Microsoft Edge online voices in continuous reading. As continuous reading starts with NVDA+Down, the first sentence is read, but reading stops after that. So, in effect, continuous reading can't be used in NVDA With Microsoft Edge online voices. I've tested NVDA 2024.2 Release Candidate and NVDA 2024.3 alphas. It doesn't affect the offline natural voices. Can something be done to take care of this issue?

<HELP> Microsoft natural voices won't show up in Balabolka

I have Windows 11 and I've downloaded the "NATURALVOICESSAPIADAPTER" too.

I used the voices back when I had Windows 10, with the same software (Balabolka, the adapter, and natural voices from the Microsoft Store like Ryan and Guy).

But then I reset my PC because of some HDD issues, and the voices disappeared; only the online voices like Ryan Online show in Balabolka. But the natural voices like Ryan and Guy show in the Narrator natural voices list in Settings.

The voices also show up in the TTSdemo application; they just aren't showing up in

I've redownloaded Balabolka, the SAPI adapter, and the voices, with no luck.

offline voices not showing

After installation, the tester only found online voices (which crash after some time). I tried to move the installer and install only the offline voices, but then the tester only showed Zira and David (Mark nowhere to be found).

Also, Narrator shows every voice twice, and none of them (excluding Zira and David) work.

does ssml language work for multilingual voices?

For the online multilingual voices, Microsoft recommends using SSML to force the language when it is not detected correctly.
Does your project support that?
I could not make this work, for example, with the Brian multilingual voice selected in the TTS application supplied in the package, with "process XML" on:
<lang xml:lang="ro-RO"
ce zici de asasin?

it speaks French, not Romanian. I also tried enclosing the text in a element, setting its lang attribute, but no go.
Thanks.
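
For reference, a complete SSML document that uses the `lang` element has the shape produced by the sketch below. This only illustrates well-formed markup as accepted by the Azure Speech service; whether this engine forwards such markup to the online voices is not confirmed here. The voice name in the usage note is an assumed placeholder, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_lang_ssml(text, lang, voice, doc_lang="en-US"):
    """Wrap text in <speak><voice><lang xml:lang=...> so the language is forced."""
    ssml = (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(doc_lang)}>'
        f'<voice name={quoteattr(voice)}>'
        f'<lang xml:lang={quoteattr(lang)}>{escape(text)}</lang>'
        f'</voice></speak>'
    )
    # Parsing the result confirms every opened element is closed again.
    ET.fromstring(ssml)
    return ssml
```

For example, `build_lang_ssml("ce zici de asasin?", "ro-RO", "en-US-BrianMultilingualNeural")` returns a single well-formed string in which the `lang` element is both opened and closed around the Romanian text.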

natural voice sapi adapter api key

Hello.
The description of this tool says that all available voices can be used with an API key.
Unfortunately, I couldn't find a place where I can enter the API key.
I have an existing API key and I want to use the Azure voices directly, not the Edge voices.
Where do I enter the API key?

does not work in adobe acrobat

it seems that the added voice does not appear in adobe acrobat

please release new version

There have been many changes since May 10th, but the releases page has no new version.

Word boundary event not working for online voices

The word boundary event works well for the offline (Narrator) voices but isn't working properly for the online Edge voices. I'm pretty sure the online voices do send word boundary information as I have this working in an Edge extension that uses the voices directly.

Just a query: how did you make the mic animation in the SAPI5 TtsApplication?

i just found out that there was an application to test out the voices and i found out that the mic at the top-left was animated like wtf?

how in the world did you do that?

if possible can you point me to the specific code part or if you got it from somewhere else then please take me to the specific part of it, i want to implement it in python cause i am working on my final year project and i want to create something similar
