To do:
- Rework generated text to local TTS: added, but really janky.
  - Too slow
  - Sound quality is garbage
  - Length is an issue
- Give Bark Metal and CUDA capabilities
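Rough sketch for the device and length items above, not the actual implementation. It assumes the length problem comes from Bark only producing a short clip per call, so longer text has to be split into sentence-sized chunks and generated piece by piece. `pick_device` and `split_for_tts` are hypothetical helper names, and the 200-character limit is a guess to tune. Bark picks its own device internally, so `pick_device` only reports what PyTorch can see on this machine. The Bark README also mentions a `SUNO_ENABLE_MPS` environment variable for Apple GPUs; check that against the installed version.

```python
import re


def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed, so nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no torch.backends.mps, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def split_for_tts(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    max_chars=200 is an assumed starting point, not a Bark constant.
    """
    # Split after ., ! or ? followed by whitespace; drop empty pieces.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then go through Bark's `generate_audio` one at a time and the resulting arrays be concatenated, which should also make it easier to see whether slowness is per call or per character.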
This can eventually be incorporated into Amadeus, which is why I want to work on it.
Current problems:
- Text generation is slow
- Voice generation is slow
Proposed improvement:
- Fine-tuning etc. with OpenAI's GPT assistants setting
- Local RVC
- Local TTV
- Try a local LLM, though it would probably be slower depending on the machine. Make it an option (compare local specs against OpenAI benchmarks, then pick the better one)
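Possible sketch for the "pick the better one" idea. Instead of comparing hardware specs against published numbers, this swaps in a simpler approach: time each backend on the same short prompt and keep the faster one. `time_backend` and `pick_faster_backend` are hypothetical names, and each backend is assumed to be a plain function that takes a prompt and returns text (one wrapping the OpenAI call, one wrapping the local model).

```python
import time
from typing import Callable, Dict


def time_backend(generate: Callable[[str], str], prompt: str, runs: int = 3) -> float:
    """Return the average wall-clock seconds per call over `runs` calls."""
    start = time.perf_counter()
    for _ in range(runs):
        generate(prompt)
    return (time.perf_counter() - start) / runs


def pick_faster_backend(
    backends: Dict[str, Callable[[str], str]],
    prompt: str = "Say hello in one sentence.",
    runs: int = 3,
) -> str:
    """Time every backend on the same prompt and return the fastest one's name."""
    timings = {name: time_backend(fn, prompt, runs) for name, fn in backends.items()}
    return min(timings, key=timings.get)
```

This only measures speed, not answer quality, and the first call to a local model may include load time, so a warm-up call before timing is probably worth adding.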
I'm gonna work on:
- Frontend
- Improving the bot to be cooler
- Local RVC and TTS
This YouTube channel might be useful: https://www.youtube.com/@Jarods_Journey
Used this repo for TTS: https://github.com/suno-ai/bark