Click here for the Chinese documentation

VITS Fast Fine-tuning

This repo guides you through adding your own character voices, or even your own voice, to an existing VITS TTS model so that, in less than 1 hour, it can perform the following tasks:

  1. Any-to-any voice conversion between you & any characters you added & preset characters
  2. English, Japanese & Chinese Text-to-Speech synthesis with the characters you added & preset characters

Feel free to play around with the base model, a Trilingual Anime VITS, on Hugging Face Spaces!

Currently Supported Tasks:

  • Convert user's voice to characters listed here
  • Chinese, English, Japanese TTS with user's voice
  • Chinese, English, Japanese TTS with custom characters!

Currently Supported Characters for TTS & VC:

  • Umamusume Pretty Derby (Used as base model pretraining)
  • Sanoba Witch (Used as base model pretraining)
  • Genshin Impact (Used as base model pretraining)
  • Any character you wish as long as you have their voices!

Fine-tuning

It's recommended to perform fine-tuning on Google Colab because the original VITS has some dependencies that are difficult to configure.

How long does it take?

  1. Install dependencies (2 min)
  2. Record at least 20 samples of your own voice. The sentences to read are shown in the UI, each less than 20 words long. (5~10 min)
  3. Upload your character voices as a .zip file. Its file structure should look like this:
Your-zip-file.zip
├───Character_name_1
│   ├───xxx.wav
│   ├───...
│   ├───yyy.mp3
│   └───zzz.wav
├───Character_name_2
│   ├───xxx.wav
│   ├───...
│   ├───yyy.mp3
│   └───zzz.wav
├───...
│
└───Character_name_n
    ├───xxx.wav
    ├───...
    ├───yyy.mp3
    └───zzz.wav

Note that the format & names of the audio files do not matter as long as they are audio files.
Audio quality requirements: each clip should be >=2s and <=20s long, with as little background noise as possible. Audio quantity requirements: at least 10 clips per character, preferably 20+ per character.
You can perform step 2, step 3, or both, depending on your needs.
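
Before uploading, the archive layout and the per-character clip count described above can be checked locally. The following is only a minimal sketch and not part of this repo: the function names `count_clips_per_character` and `characters_below_minimum` and the list of audio extensions are assumptions made for illustration, and clip duration (2s to 20s) is not checked.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumption: extensions treated as audio in this sketch (not taken from the repo).
AUDIO_EXTS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}
MIN_CLIPS = 10  # minimum stated above; 20+ per character is recommended


def count_clips_per_character(zip_path):
    """Return {character_folder: number_of_audio_files} for a zip archive."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            path = PurePosixPath(info.filename)
            # Expect Character_name/clip.ext; skip files at the archive root.
            if len(path.parts) < 2:
                continue
            if path.suffix.lower() in AUDIO_EXTS:
                counts[path.parts[0]] += 1
    return dict(counts)


def characters_below_minimum(counts, minimum=MIN_CLIPS):
    """Return the sorted names of characters with fewer than `minimum` clips."""
    return sorted(name for name, n in counts.items() if n < minimum)
```

For example, calling `count_clips_per_character("Your-zip-file.zip")` and passing the result to `characters_below_minimum` lists any character folder that still needs more recordings.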

  4. Fine-tune (30 min)
    After everything is done, download the fine-tuned model & model config.

Inference or Usage (currently supports Windows only)

  1. Remember to download your fine-tuned model!
  2. Download the latest release
  3. Put your model & config file into the inference folder, and make sure to rename the model to G_latest.pth and the config file to finetune_speaker.json
  4. The file structure should be as follows:
inference
├───inference.exe
├───...
├───finetune_speaker.json
└───G_latest.pth
  5. Run inference.exe; the browser should open automatically.
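
The copy-and-rename in step 3 can also be scripted. This is a hedged sketch rather than anything shipped with the release: the helper name `install_model` and its arguments are hypothetical, and only the two target file names come from the steps above.

```python
import shutil
from pathlib import Path

# File names expected by the inference program, as stated in step 3.
MODEL_NAME = "G_latest.pth"
CONFIG_NAME = "finetune_speaker.json"


def install_model(model_src, config_src, inference_dir):
    """Copy the downloaded model and config into inference_dir under the expected names."""
    inference_dir = Path(inference_dir)
    if not inference_dir.is_dir():
        raise FileNotFoundError(f"inference folder not found: {inference_dir}")
    model_dst = inference_dir / MODEL_NAME
    config_dst = inference_dir / CONFIG_NAME
    # copyfile overwrites an existing target, so re-running replaces an older model.
    shutil.copyfile(model_src, model_dst)
    shutil.copyfile(config_src, config_dst)
    return model_dst, config_dst
```

Pass the paths of the two files downloaded after fine-tuning and the path of the extracted inference folder; the function returns the two destination paths so they can be printed or checked.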

vits-fast-fine-tuning's People

Contributors

plachtaa
