felixwaweru / elevenlabs-node

Eleven Labs text to speech package for NodeJS.

Home Page: https://www.npmjs.com/package/elevenlabs-node

License: MIT License

JavaScript 100.00%

ai ai-voice elevenlabs elevenlabs-api nodejs npm text-to-speech tts

elevenlabs-node's Issues

Passing from server to client

Hi, I'm trying to generate the audio in a Firebase function and pass it back to the client to play, but I can't get it to work. How can I turn the audio into something I can pass to the client?
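
One possible approach, sketched under assumptions: if the audio is available on the server as a Node.js Readable stream, it can be collected into a Buffer, base64-encoded, and returned inside a JSON response. The helper name streamToBase64 and the payload shape are hypothetical and not part of elevenlabs-node; the stream here is a stand-in for real MP3 data.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper (not part of elevenlabs-node): drain a readable
// stream into one Buffer and return it base64-encoded.
async function streamToBase64(readable) {
  const chunks = [];
  for await (const chunk of readable) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks).toString('base64');
}

// Stand-in for the audio stream; a real call would supply MP3 bytes.
const fakeAudio = Readable.from([Buffer.from('ID3'), Buffer.from('audio')]);

streamToBase64(fakeAudio).then((base64) => {
  // A function handler could return this object as JSON to the client.
  const payload = { mimeType: 'audio/mpeg', audioBase64: base64 };
  console.log(payload);
});
```

On the client, the string could then be played through a data URL, for example new Audio('data:audio/mpeg;base64,' + audioBase64).play(). Base64 adds roughly a third to the payload size, so for long clips uploading to storage and returning a URL may be the better fit.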

ElevenLabs V2

ElevenLabs V2 was released yesterday: https://github.com/elevenlabs/elevenlabs-python/releases/tag/v0.2.26

Thank you in advance!

Description

Add ElevenLabs API Speech Boost to TTS functions.

Define API key at instance creation

It would be nice to define the API key once, when the instance is created. I can add this if you agree.
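
A minimal sketch of what this could look like. The class name ElevenLabsClient and its options object are assumptions made for illustration, not the package's actual interface; xi-api-key is the header the ElevenLabs API uses for authentication.

```javascript
// Hypothetical client class; the name and options shape are assumptions.
class ElevenLabsClient {
  constructor({ apiKey, voiceId } = {}) {
    if (!apiKey) {
      throw new Error('Missing API key');
    }
    this.apiKey = apiKey;
    this.voiceId = voiceId; // optional default voice
  }

  // Headers every request would share, built from the stored key.
  headers() {
    return { 'xi-api-key': this.apiKey, 'Content-Type': 'application/json' };
  }
}

const client = new ElevenLabsClient({ apiKey: 'YOUR_API_KEY' });
console.log(client.headers()['xi-api-key']); // prints "YOUR_API_KEY"
```

Storing the key on the instance means individual calls no longer need it as their first argument, and a missing key fails at construction time rather than on the first request.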

Text to speech not waiting until the audio is created.

Description of the problem
In its current implementation, the textToSpeech function returns a response before the audio file has been fully written. As a result, reading the audio file immediately after calling textToSpeech can fail because the file might not exist yet.

Steps to reproduce

Call the textToSpeech function and save the response in a variable.
Immediately afterwards, try to read the audio file returned by textToSpeech.

Expected behavior
The textToSpeech function should return only after the audio file has been completely created and it can be accessed without any issues.

Possible solution
One could modify the textToSpeech function to return a promise that resolves only after the audio file has been completely created. Below is an example of how this solution could be implemented:

const textToSpeech = async (apiKey, voiceID, fileName, textInput, stability, similarityBoost, modelId) => {
  try {
    // ... existing code ...

    const writeStream = fs.createWriteStream(fileName);
    response.data.pipe(writeStream);

    // Resolve only once the file has been fully written.
    return new Promise((resolve, reject) => {
      writeStream.on('finish', () => resolve(fileName));
      writeStream.on('error', reject);
    });
  } catch (error) {
    console.log(error);
  }
};

In this code, we create a new promise that resolves with the file name only after the writing of the file has been completed. This way, the textToSpeech function resolves only after the audio file has been fully created.

I hope this suggestion is useful and can be implemented to avoid issues when trying to access the audio file immediately after calling textToSpeech. I appreciate your attention and look forward to a prompt resolution.
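
A variation on the same idea, sketched with Node's built-in pipeline from node:stream/promises, which resolves once the write has finished and rejects if either stream fails. The helper name saveAudio is hypothetical, and the readable stream stands in for response.data from the HTTP request.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { Readable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: write an audio stream to disk and resolve with
// the file name only once the write has finished. Errors from either
// stream reject the returned promise instead of being swallowed.
async function saveAudio(audioStream, fileName) {
  await pipeline(audioStream, fs.createWriteStream(fileName));
  return fileName;
}

// Stand-in for response.data from the HTTP request.
const fakeAudio = Readable.from([Buffer.from('fake mp3 bytes')]);
const target = path.join(os.tmpdir(), 'tts-example.mp3');

saveAudio(fakeAudio, target).then((file) => {
  // Safe to read here: the file is complete.
  console.log(fs.readFileSync(file).length);
});
```

Unlike a catch block that only logs, this lets the caller decide how to handle a failed write.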

Alternative Audio Storage Option

Is your feature request related to a problem? Please describe.
I would like to generate speech using the text-to-speech API and store the MP3 on a server (or in an Amazon bucket or Firebase Storage bucket) rather than locally on my machine. The way the repo is currently structured, this does not seem to be supported.

Describe the solution you'd like
Ideally, an alternative return value in the form of an audio blob should be provided (instead of writing to the filesystem).

Describe alternatives you've considered
Using elevenlabs API directly (which I'm doing currently).

Add Response Type Variable to textToSpeechStream

Describe the solution you'd like
Add customizable response types to the textToSpeechStream function, e.g. arrayBuffer, stream, etc.
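
A rough sketch of how such an option might behave, using the helpers in Node's built-in node:stream/consumers module. The function name toResponseType and the accepted values ('stream', 'buffer', 'arraybuffer') are assumptions for illustration only.

```javascript
const { Readable } = require('node:stream');
const consumers = require('node:stream/consumers');

// Hypothetical helper: return the audio in the shape the caller asks for.
async function toResponseType(audioStream, responseType = 'stream') {
  switch (responseType) {
    case 'stream':
      return audioStream; // hand the stream back untouched
    case 'buffer':
      return consumers.buffer(audioStream); // Node.js Buffer
    case 'arraybuffer':
      return consumers.arrayBuffer(audioStream); // plain ArrayBuffer
    default:
      throw new Error(`Unsupported response type: ${responseType}`);
  }
}

// Stand-in for the audio stream returned by the API request.
const fakeAudio = Readable.from([Buffer.from('fake mp3 bytes')]);

toResponseType(fakeAudio, 'buffer').then((audio) => {
  console.log(Buffer.isBuffer(audio), audio.length);
});
```

A Buffer or ArrayBuffer result can be uploaded to a storage bucket directly, which would also cover the case where nothing should be written to the local filesystem.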

File is not created and data is not stored

I pass the file name, but the file is not created.

/**
 * text to speech
 */
const textToSpeech = async () => {
	try {
		voice
			.textToSpeech(
				xiKey,
				'wnDO7pToxAu1uMylONP0',
				'1.mp3',
				'Welcome to callanswering!, My name is Sagar Davara and how may I help you ?'
			)
			.then((res) => {
				console.log(res)
			})
	} catch (error) {
		console.log(error)
	}
}
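
One thing that may be worth checking in the snippet above: a try/catch block cannot catch a rejected promise unless the call is awaited, so a failed request can go unnoticed. The sketch below shows the difference with a stand-in function; failingTextToSpeech and run are hypothetical names, and whether this explains the missing file is not known.

```javascript
// Stand-in for voice.textToSpeech: a promise-returning call that fails.
const failingTextToSpeech = () => Promise.reject(new Error('request failed'));

async function run(textToSpeech) {
  try {
    // await makes a rejection surface in the catch block below.
    const res = await textToSpeech();
    return { ok: true, res };
  } catch (error) {
    return { ok: false, message: error.message };
  }
}

run(failingTextToSpeech).then((result) => console.log(result));
// prints { ok: false, message: 'request failed' }
```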

Add More Audio Streaming Options

Description

Add more options to the audio streaming feature

Features

optimize_streaming_latency
output_format

Reference

https://elevenlabs.io/docs/api-reference/streaming
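
A minimal sketch of how the two options could be wired in, assuming both are sent as query parameters on the streaming endpoint (the linked reference should be checked to confirm). The function name buildStreamUrl, the option names, the base URL constant, and the example values are illustrative assumptions.

```javascript
// Assumed base URL and path for the streaming endpoint.
const API_BASE = 'https://api.elevenlabs.io/v1';

// Hypothetical helper: build the request URL, adding only the options set.
function buildStreamUrl(voiceId, { optimizeStreamingLatency, outputFormat } = {}) {
  const url = new URL(`${API_BASE}/text-to-speech/${encodeURIComponent(voiceId)}/stream`);
  if (optimizeStreamingLatency !== undefined) {
    url.searchParams.set('optimize_streaming_latency', String(optimizeStreamingLatency));
  }
  if (outputFormat !== undefined) {
    url.searchParams.set('output_format', outputFormat);
  }
  return url.toString();
}

console.log(buildStreamUrl('VOICE_ID', { optimizeStreamingLatency: 2, outputFormat: 'mp3_44100_128' }));
// prints https://api.elevenlabs.io/v1/text-to-speech/VOICE_ID/stream?optimize_streaming_latency=2&output_format=mp3_44100_128
```

Leaving both options undefined produces the plain endpoint URL, so existing callers would not be affected.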

Add package publish GitHub action

Create a GitHub workflow that builds, tests and publishes the package on NPM.
Configure it to run on the test branch.

No types 😕

😕

Allow changing models (multilingual support)

reduce API latency option

Is your feature request related to a problem? Please describe.
https://help.elevenlabs.io/hc/en-us/articles/15726761640721-Can-I-reduce-API-latency

Describe alternatives you've considered
In which place should I put it?

Create Voice

There should be the option to create a voice, since the API supports it. It would function almost identically to the "edit voice" function.

ElevenLabs MultiLingual v2

I am stupid. Thank you.

Language

How can I switch the language?

I don't know if it is possible, because I also couldn't find that option in the official API documentation.

Speech-to-speech support

Is your feature request related to a problem? Please describe.
11labs has a new API endpoint "Speech-to-speech" https://elevenlabs.io/docs/api-reference/speech-to-speech

Describe the solution you'd like
Native support in your API

Add tests

Add Jest unit tests for functions.

Please add a function for GET /v1/models

A simple feature request: add a function for the GET /v1/models request.
https://docs.elevenlabs.io/api-reference/models-get
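
A minimal sketch of such a function, assuming Node 18+ (global fetch) and the usual xi-api-key authentication header. The name getModels and the injectable fetchImpl parameter are hypothetical; the second argument exists only so the request can be stubbed in tests.

```javascript
// Hypothetical wrapper; fetchImpl is injectable so it can be stubbed.
async function getModels(apiKey, fetchImpl = fetch) {
  const response = await fetchImpl('https://api.elevenlabs.io/v1/models', {
    method: 'GET',
    headers: { 'xi-api-key': apiKey },
  });
  if (!response.ok) {
    throw new Error(`GET /v1/models failed with status ${response.status}`);
  }
  return response.json();
}

// Usage with a stubbed fetch, so no network call or real key is needed.
const stubFetch = async () => ({
  ok: true,
  status: 200,
  json: async () => [{ model_id: 'example_model' }],
});

getModels('YOUR_API_KEY', stubFetch).then((models) => console.log(models));
```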

Add Websocket support

Is your feature request related to a problem? Please describe.
Websockets don't seem to be supported by this library, would be good to add.

Describe the solution you'd like
A wrapper function, similar to the current ones, that lets you set up WebSockets.

Describe alternatives you've considered
Simply using the raw Websocket endpoint

Not quite sure if this is an issue or not, but the package works fine with small amounts of text and errors with larger samples.
To be fair, I got the exact same error when using the API directly, and I'm not sure why it happens. According to the API documentation, the maximum should be 5000 characters, but both your package and my direct use of the API fail after around 500 characters.
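
The cause of the limit is not established here, but one workaround is to split long input into smaller pieces at sentence boundaries and send each piece as its own request. A minimal sketch, where splitText is a hypothetical helper and the default of 500 simply mirrors the figure reported above:

```javascript
// Hypothetical helper: group whole sentences into chunks of at most
// maxLength characters. A single sentence longer than maxLength is
// kept as one oversized chunk rather than being cut mid-sentence.
function splitText(text, maxLength = 500) {
  // Split after sentence-ending punctuation, keeping the punctuation.
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) || [];
  const chunks = [];
  let current = '';
  for (const sentence of sentences) {
    if (current && current.length + sentence.length > maxLength) {
      chunks.push(current.trim());
      current = '';
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

console.log(splitText('One. Two. Three.', 10));
// prints [ 'One. Two.', 'Three.' ]
```

Each chunk would produce a separate audio segment, so the segments need to be played or concatenated in order.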

Speaking the other languages

Describe the bug
I read in the ElevenLabs docs that Portuguese is one of the supported languages, and I'm trying to make it speak Brazilian Portuguese, but I can't.

In fact, I couldn't find where to specify language:pt-BR or something like that. I just sent text written in Brazilian Portuguese.

But the result sounds like an American trying to read Portuguese, which is annoying. 😅

Expected behavior
To be able to call voice.textToSpeechStream with the desired language as a parameter.

. . .

Could you guys help me make it work?
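
Other issues in this list tie multilingual support to the choice of model rather than to a language parameter, so one thing to try is requesting a multilingual model. The sketch below only builds a request body; buildTtsBody is a hypothetical helper, the model ID eleven_multilingual_v2 and the default voice settings are assumptions to verify against the API documentation, and whether the package forwards a model ID depends on its version.

```javascript
// Hypothetical helper: build a text-to-speech request body that names
// the model explicitly. Default values are illustrative assumptions.
function buildTtsBody(text, { modelId = 'eleven_multilingual_v2', stability = 0.5, similarityBoost = 0.75 } = {}) {
  return {
    text,
    model_id: modelId,
    voice_settings: { stability, similarity_boost: similarityBoost },
  };
}

console.log(JSON.stringify(buildTtsBody('Olá, tudo bem?')));
```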

Add CONTRIBUTING.md to project

Add a CONTRIBUTING.md file to the root folder to specify the open source contribution guidelines.

Additionally, link it in the README and GitHub repo sidebar.
