rvg_tts's Introduction

RVG TTS

RVG TTS (retrieval-based voice generation text-to-speech) is a Python text-to-speech system built on two core parts. It uses Forward Tacotron to convert text to speech, then applies RVC voice conversion to make the output sound like any character, without needing an input audio file.

Requirements

This TTS has been tested on Python 3.10, although it might work on other versions.

You need the latest 64-bit eSpeak NG release.

To build the fairseq dependency, you need Visual Studio with the "Desktop development with C++" workload installed.

Usage

To use it, install Poetry, install the requirements with poetry install --no-root, and then download the HuBERT model, the Forward Tacotron model, and any RVC model. Place them in the model folder under the following names (downloaded name -> required name):

  • hubert_base.pt -> hubert.pt
  • forward_steps90k.pt -> forward.pt
  • (rvc .pth model name) -> rvc_model.pth
  • (rvc .index model name) -> rvc_index.index (optional)
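
The renaming step above can be scripted. The following is a minimal sketch, not part of the project: the helper names place_models and missing_models are hypothetical, and it assumes the model folder sits in the current working directory. The target file names are the ones listed above.

```python
import shutil
from pathlib import Path

# File names the README asks for inside the model folder.
REQUIRED_NAMES = ("hubert.pt", "forward.pt", "rvc_model.pth")
OPTIONAL_NAMES = ("rvc_index.index",)  # the RVC index is optional


def place_models(downloads: dict[str, Path], model_dir: Path = Path("model")) -> list[Path]:
    """Copy each downloaded file into model_dir under its expected name.

    downloads maps the expected name (e.g. "hubert.pt") to the path of the
    file as it was downloaded (e.g. Path("hubert_base.pt")).
    """
    model_dir.mkdir(parents=True, exist_ok=True)
    placed = []
    for target_name, source in downloads.items():
        if target_name not in REQUIRED_NAMES + OPTIONAL_NAMES:
            raise ValueError(f"unexpected model name: {target_name}")
        if not source.is_file():
            raise FileNotFoundError(f"missing download: {source}")
        destination = model_dir / target_name
        shutil.copy2(source, destination)  # copy, so the original download is kept
        placed.append(destination)
    return placed


def missing_models(model_dir: Path = Path("model")) -> list[str]:
    """Return the required file names that are not yet present in model_dir."""
    return [name for name in REQUIRED_NAMES if not (model_dir / name).is_file()]
```

Calling missing_models() before a run gives a quick check that the three required files are in place; the index file is left out of that check because it is optional.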

Once you have all of these, there are three ways to use it: run RVG.py with your desired arguments from the CLI, run it without any arguments to launch the Gradio WebUI, or import the rvg_tts function from RVG.py into your own project.
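
As a rough illustration of the first two options, the sketch below builds the command line from Python. It is an assumption-laden example, not the project's API: build_rvg_command and run_rvg are hypothetical helpers, the script is assumed to be run from the repository root, and the only flag used is --input_text, which appears in the CLI example quoted in the issues section of this page. The signature of rvg_tts is not documented here, so the import route is not shown.

```python
import subprocess
import sys


def build_rvg_command(input_text: str = "", script: str = "RVG.py") -> list[str]:
    """Build the argument list for one run of RVG.py.

    With an empty input_text no arguments are passed, which per the README
    launches the Gradio WebUI instead of a one-off synthesis.
    """
    command = [sys.executable, script]
    if input_text:
        command += ["--input_text", input_text]
    return command


def run_rvg(input_text: str = "") -> int:
    """Run RVG.py in the current directory and return its exit code."""
    return subprocess.run(build_rvg_command(input_text), check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems when the text contains apostrophes or exclamation marks.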

Current feature set

  • RVC v1 and v2 model support
  • RVC Index support
  • Fast inference (~10 seconds on the first run and ~5 seconds on consecutive runs when imported with persistent mode on)
  • Easy to use CLI

Todo

  • Support both RVC model versions
  • Create a proper importable package
  • Support calling from CLI
  • Further code condensing
  • Gradio WebUI
  • Multi-lang support

Other languages

To use a different language, a new Forward Tacotron model must be trained. I cannot do this without a dataset, so I am asking the community for help: if you can provide a dataset, please do.

Credits

Forward Tacotron is licensed under the MIT License

RVC WebUI is licensed under the MIT License

License

Copyright 2023 Foxify52

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


rvg_tts's Issues

What if you use a language other than English?

Thank you for the information about using RVC training results with Tacotron2 TTS, @Foxify52.
What if I want to use a language other than English in Tacotron2 TTS? Do you have any suggestions or references for that?

CUDA only?

I am on Windows and have an AMD GPU, so I can't use CUDA. Is there a way to make it work with DirectML or on the CPU?

Docker would be nice

Is your feature request related to a problem? Please describe.
Running on a cloud server would take a bit of setup.

Describe the solution you'd like
A Docker image that can be pulled instantly

Describe alternatives you've considered
Creating a script in Jupyter to run on the cloud

Additional context
Things like w-okada have a Docker image. It would be great if this had one too.

Add installation guide for CUDA problem

hi
(suggested environment)
python -m venv hi
(install all requirements except torch (CPU version))
pip install -r requirements
(install the CUDA build of torch)
python -m pip install torch==2.0.1+cu118 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
(inference example)
python RVG.py --input_text "Hello i'm Senko, your friendly fox girl!"
(more CLI options are available, e.g. harvest; the default is crepe)
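
After following steps like these, it can help to confirm which torch build actually ended up in the environment. The sketch below is an illustration only: describe_torch_install is a hypothetical helper, not part of RVG TTS, and it relies only on torch.version.cuda (None on CPU-only builds) and torch.cuda.is_available().

```python
import importlib.util


def describe_torch_install() -> str:
    """Report whether torch is installed and whether its CUDA build is usable."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch  # imported lazily so the check also works without torch

    if torch.version.cuda is None:
        return f"torch {torch.__version__}: CPU-only build"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA {torch.version.cuda} build, but no usable GPU"
    return f"torch {torch.__version__}: CUDA {torch.version.cuda} build, GPU available"


if __name__ == "__main__":
    print(describe_torch_install())
```

If the report says "CPU-only build" after installing the requirements, reinstalling torch from the CUDA index as in the steps above is the likely fix.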
