blaisewf / rvc_cli
RVC + UVR = A perfect set of tools for voice cloning, easily and free!
Home Page: https://rvc-cli.pages.dev/
License: Other
env/python won't work on Windows after the install script. It may be confusing for some people.
Describe the bug
I get the following error regardless of whether overtraining_detector is set to True or False, and whether or not overtraining_threshold is set to an integer value:
train.py: error: argument -ot/--overtraining_threshold: invalid int value: 'False'
To Reproduce
Steps to reproduce the behavior:
Run with the following options:
{python_path} main.py train --model_name {model_name} --sampling_rate 40000 --pitch_guidance True --gpu 1 --save_every_epoch 50 --save_only_latest True --overtraining_detector False
Expected behavior
Training runs and completes
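The error message suggests that the string 'False' ended up where train.py expects the integer threshold. Below is a minimal sketch of one way a wrapper could avoid this, assuming boolean flags arrive as the strings "True"/"False": parse them explicitly and forward the threshold only when the detector is enabled. The helper names, the "train.py" command layout, and the default threshold of 50 are illustrative assumptions, not the project's actual code.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert 'True'/'False' style strings to a real bool for argparse."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="train-sketch")
    parser.add_argument("--overtraining_detector", type=str_to_bool, default=False)
    # Default of 50 is an assumption made for this sketch only.
    parser.add_argument("--overtraining_threshold", type=int, default=50)
    return parser


def build_train_command(args: argparse.Namespace) -> list:
    """Forward the threshold only when the detector is switched on."""
    command = ["train.py", "--overtraining_detector", str(args.overtraining_detector)]
    if args.overtraining_detector:
        command += ["--overtraining_threshold", str(args.overtraining_threshold)]
    return command
```

With this shape, a disabled detector can never leak a boolean string into the integer argument.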
Issue with Model Output Generating Noise
The model is training successfully, but when attempting to process a file, the output consists only of squeaks and noise. This issue persists even when using alternative models; the resulting audio remains distorted. Interestingly, utilizing a model trained in a different version of RVC yields normal functioning.
Another instance of RVC works fine in this machine.
What could be the underlying cause? Various output file formats have been experimented with to no avail.
Using VDS
I was trying to train a model using the OV2 pretrained model and came across something strange: for every epoch, only 1 step was generated.
Example:
model_10_epoch_10_steps
model_100_epoch_100_steps
I am using the code on Kaggle, which uses conda with python 3.10
Sorry for the bad English, it's not my first language
I noticed a potential conflict with the file "rvc.py" and the folder named "rvc". To avoid confusion or issues, could we please rename either the file or the folder?
Describe the bug
Inference from a model I trained isn't working and just printing out "Error: 'config'" instead
To Reproduce
I used the colab in the repo, with the change that I made the main directory on my Google Drive so I don't have to pull every time and so the models I make are automatically saved.
I trained a model, and maybe this is the problem, with v2 and 40k sample rate. I later saw in the configs that there is no v2-40k config.
I then tried the inference there, and it didn't work out, it just spat out "Error: 'config'"
I traced it to the vc pipeline where it loads the checkpoint and tries to access ckpt['config'].
Then outside of the code, I loaded the checkpoints saved from my training and they do not have a 'config' key.
So I looked at the code that saves the checkpoints, and there is no 'config' key there either.
I think I'm doing something wrong, but I'm not sure what.
Expected behavior
Inference works
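A small diagnostic sketch for the situation described above, assuming the checkpoint has already been loaded into a plain dict (for example with torch.load). It only reports which expected keys are missing. The function names are hypothetical, and the idea that the file is a training checkpoint rather than an exported inference model is an assumption, not something confirmed in this report.

```python
def missing_keys(checkpoint, required=("config",)):
    """Return the required keys that are absent from a loaded checkpoint dict."""
    return [key for key in required if key not in checkpoint]


def explain(checkpoint):
    """Produce a readable hint instead of the bare "Error: 'config'"."""
    missing = missing_keys(checkpoint)
    if missing:
        return (
            "Checkpoint lacks " + ", ".join(missing)
            + "; it may be a training checkpoint rather than an exported model."
        )
    return "Checkpoint has the keys inference expects."
```

Printing `sorted(checkpoint.keys())` next to this message would show which kind of file was actually loaded.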
Describe the bug
It looks like the value of args.protect isn't being converted properly to a float. I tried converting the argument to a float directly, but that produces errors related to the tensor size in torch.from_numpy(npy).unsqueeze(0).to(self.device) * index_rate + (1 - index_rate) * feats. I fiddled with it a bit but couldn't get anything worthwhile to come out.
.......................
To Reproduce
Steps to reproduce the behavior:
$ python3 main.py batch_infer --f0up_key 5 --filter_radius 4 --index_rate 0.9 --hop_length 128 --rms_mix_rate 1.0 --protect 0.4 --f0autotune True --f0method rmvpe --input_folder "/home/mb/Desktop/" --output_folder "/home/......................." --pth_path "/home/........................pth" --index_path "/home........................index" --export_format WAV
Changing the value of protect doesn't seem to change the error.
Expected behavior
For the inference to happen correctly
I was running using this command in kaggle after installing and downloading required models
!python rvc_cli.py infer --input_path ./input/audioenhanced.wav --output_path ./output --pth_path ./assets --index_path ./logs
then it stops with
An error occurred during audio conversion: 'NoneType' object has no attribute 'pipeline'
Traceback (most recent call last):
File "/kaggle/working/rvc-cli/rvc/infer/infer.py", line 256, in convert_audio
audio_opt = self.vc.pipeline(
AttributeError: 'NoneType' object has no attribute 'pipeline'
Am I using the CLI wrong?
Thanks
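In the command above, --pth_path and --index_path point at ./assets and ./logs, which look like directories rather than a .pth and an .index file. Whether that is what leaves the pipeline object as None is an assumption, but a pre-flight check along these lines would turn the failure into a clearer message. The function name is hypothetical.

```python
from pathlib import Path


def check_infer_paths(pth_path, index_path):
    """Return a list of human-readable problems with the model/index paths."""
    problems = []
    checks = (("pth_path", pth_path, ".pth"), ("index_path", index_path, ".index"))
    for label, raw, suffix in checks:
        path = Path(raw)
        if path.is_dir():
            problems.append(f"{label} is a directory, expected a {suffix} file: {raw}")
        elif not path.is_file():
            problems.append(f"{label} does not exist: {raw}")
        elif path.suffix != suffix:
            problems.append(f"{label} should end in {suffix}: {raw}")
    return problems
```

An empty list means both paths at least look usable; anything else can be printed before inference starts.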
Describe the bug
When I try to use 2 GPUs on Kaggle, this error occurs:
[W socket.cpp:663] [c10d] The client socket has failed to connect to [localhost]:55292 (errno: 99 - Cannot assign requested address).
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Be able to use two GPUs for training.
Desktop (please complete the following information):
Additional context
Yes, I was using the 2x T4, and was using the last commit ("fix multi gpu")
Static (robotic) noise in the generated output. I even tried up to 3500 epochs, but with no success using the command line.
However, when I use the GUI, it works. I am not sure what the issue is.
#!/bin/bash
# Define variables
MODEL_NAME="MyVoiceModel"
VOICE_DATA_PATH="voice/myvoice.wav"
SAMPLE_RATE=48000
RVC_VERSION="v2"
HOP_LENGTH=256
F0METHOD="rmvpe"
TOTAL_EPOCHS=1000
BATCH_SIZE=16
GPU=0 # Adjust if you have multiple GPUs
SAVE_EVERY_EPOCH=10
TEXT_TO_SYNTHESIZE="This is a sample text for voice conversion."
# Step 1: Preprocess Dataset
echo "Preprocessing Dataset..."
python main.py preprocess "$MODEL_NAME" "$VOICE_DATA_PATH" $SAMPLE_RATE
# Step 2: Extract Features
echo "Extracting Features..."
python main.py extract "$MODEL_NAME" $RVC_VERSION $F0METHOD $HOP_LENGTH $SAMPLE_RATE
# Step 3: Train the Model
echo "Training the Model..."
python main.py train "$MODEL_NAME" $RVC_VERSION $SAVE_EVERY_EPOCH False True $TOTAL_EPOCHS $SAMPLE_RATE $BATCH_SIZE $GPU True False False
# Step 4: Generate Index File
echo "Generating Index File..."
python main.py index "$MODEL_NAME" $RVC_VERSION
# Step 5: Voice Conversion Inference (Modify paths to the model and index files as needed)
echo "Performing Voice Conversion Inference..."
python main.py infer "$TEXT_TO_SYNTHESIZE" "$MODEL_NAME" 0 5 0.5 $HOP_LENGTH "$F0METHOD" "output_tts.wav" "output_rvc.wav" "path_to_trained_model/$MODEL_NAME.pth" "path_to_index_file/$MODEL_NAME.index"
echo "Voice Conversion Process Completed."
Can anyone help please:
I'm trying to generate a new pth and got the following error:
"FileNotFoundError: [Errno 2] No such file or directory: '/home/user/RVC_CLI/logs/mute/sliced_audios/mute40000.wav'"
Steps to reproduce the error:
python3 rvc.py preprocess --model_name "johnnyc" --dataset_path "../models/sample/" --sampling_rate "40000"
python3 rvc.py extract --model_name "johnnyc" --rvc_version "v2" --sampling_rate "40000"
python3 rvc.py train --model_name "johnnyc" --rvc_version "v2" --save_every_epoch "3" --sampling_rate "40000"
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/user/RVC_CLI/rvc/train/train.py", line 215, in run
train_dataset = TextAudioLoaderMultiNSFsid(hps.data)
File "/home/user/RVC_CLI/rvc/train/data_utils.py", line 21, in __init__
self._filter()
File "/home/user/RVC_CLI/rvc/train/data_utils.py", line 29, in _filter
lengths.append(os.path.getsize(audiopath) // (3 * self.hop_length))
File "/usr/lib/python3.10/genericpath.py", line 50, in getsize
return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/RVC_CLI/logs/mute/sliced_audios/mute40000.wav'
Saved index file '/home/user/RVC_CLI/logs/johnnyc/added_IVF94_Flat_nprobe_1_v2.index'
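The traceback shows training looking for a mute file under logs/mute/sliced_audios, named after the sampling rate. A minimal sketch of checking for that file before starting a run is given below. The path layout is taken from the error message; the function names are hypothetical, and how the file is supposed to get there is not stated in this report.

```python
from pathlib import Path


def expected_mute_file(project_root, sampling_rate):
    """Build the mute file path in the layout shown in the traceback."""
    return Path(project_root) / "logs" / "mute" / "sliced_audios" / f"mute{sampling_rate}.wav"


def mute_file_status(project_root, sampling_rate):
    """Report whether the mute file for this sampling rate is present."""
    path = expected_mute_file(project_root, sampling_rate)
    if path.is_file():
        return f"found {path}"
    return f"missing {path}; training will fail until this file is present"
```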
Normal inference works, but when I try API, with or without host/port arguments, I get an error:
\RVC_CLI> ./env/python.exe main.py api
Error: [WinError 2] The system cannot find the file specified
First of all, thank you, this is the first cli for rvc that actually works!! I've been trying all kinds of solutions. Below is a minor enhancement you could make.
The following error is experienced when inferencing on apple silicon:
The operator 'aten::_fft_r2c' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on pytorch/pytorch#77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1
to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
Voice conversion failed: cannot unpack non-iterable NoneType object
Setting the mps fallback as mentioned works but could be handled in your code.
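A minimal sketch of handling this in code, assuming the variable only needs to be present in the process environment: set PYTORCH_ENABLE_MPS_FALLBACK unless the user has already chosen a value. The function name is hypothetical.

```python
import os


def enable_mps_fallback(environ=os.environ):
    """Turn on the CPU fallback for unsupported MPS ops unless already configured."""
    environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")
    return environ["PYTORCH_ENABLE_MPS_FALLBACK"]
```

The call typically needs to happen at the very top of the entry script, before torch is imported, so that the setting is in place when the library loads.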
script works fine
feature requests
a) I use RVC with a GPU for inference; can you enable that in the CLI as well?
b) Can the temporary files be kept inside a folder, say temp, in the project? That would make housekeeping easier.
thanks
Senthil
Hi,
I am generating songs using the inference code. The problem is that the output just contains the vocals and I would like to join the instrumentals too.
Could you give me a hand on this?
Thanks!
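One way to approach this outside the tool is to mix the converted vocals back onto the instrumental track. The sketch below shows the idea on plain lists of samples, assuming both tracks are mono, share the same sample rate, and hold floats in the range -1.0 to 1.0. Reading and writing the audio files is left out, and the function name and gain parameters are illustrative.

```python
def mix_tracks(vocals, instrumental, vocal_gain=1.0, instrumental_gain=1.0):
    """Sum two sample sequences, padding the shorter one with silence."""
    length = max(len(vocals), len(instrumental))
    mixed = []
    for i in range(length):
        v = vocals[i] if i < len(vocals) else 0.0
        m = instrumental[i] if i < len(instrumental) else 0.0
        sample = vocal_gain * v + instrumental_gain * m
        # Hard-clip to the valid range so the sum cannot overflow on export.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed
```

Lowering one of the gains is usually preferable to relying on the clipping step, which distorts loud passages.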
Hi,
I am trying to make this work as a docker container too, but can't really get it to work...
Maybe there is already a Dockerfile out there?
If not, this is my current Dockerfile (currently I wanted to test inference first, so I copied my models in):
FROM nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04
# Create a working directory
WORKDIR /app
# Install dependencies to add PPAs and git
RUN apt-get update && \
apt-get install -y -qq ffmpeg aria2 && apt clean && \
apt-get install -y software-properties-common && \
apt-get install -y git && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Add the deadsnakes PPA to get Python 3.9
RUN add-apt-repository ppa:deadsnakes/ppa
# Clone the repository
RUN git clone https://github.com/blaise-tk/RVC_CLI.git
# Set the working directory to the cloned repo
WORKDIR /app/RVC_CLI
# Install Python 3.9 and pip
RUN apt-get update && \
apt-get install -y build-essential python-dev python3-dev python3.9-distutils python3.9-dev python3.9 curl && \
apt-get clean && \
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1 && \
curl https://bootstrap.pypa.io/get-pip.py | python3.9
# Set Python 3.9 as the default
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.9 1
# Install Python dependencies
RUN chmod +x install.sh
RUN ./install.sh
# create init.py file in app/RVC_CLI/rvc folder
RUN touch /app/RVC_CLI/rvc/__init__.py
# Download prerequisites
RUN python rvc.py prerequisites --pretraineds_v1 True --pretraineds_v2 True --models True --exe True
# Copy the audio file into the container
COPY audio.mp3 /app/RVC_CLI/audio.mp3
COPY Jari.pth /app/RVC_CLI/Jari.pth
COPY Jari.index /app/RVC_CLI/Jari.index
# Set the entrypoint to keep the container running
CMD ["tail", "-f", "/dev/null"]
It builds, but when I run inference, e.g. with this:
python rvc.py infer --f0up_key 0 --filter_radius 3 --index_rate 0.3 --hop_length 128 --rms_mix_rate 1.0 --protect 0.3 --f0autotune False --f0method rmvpe --input_path /app/RVC_CLI/audio.mp3 --output_path /output/audio_out.mp3 --pth_path /app/RVC_CLI/Jari.pth --index_path /app/RVC_CLI/Jari.index
I don't get an output file.
I am pretty new to Docker so maybe someone more experienced could figure a good setup out quite quickly :)
And I think this would make the repo even easier to use for the average user.
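The command above writes to /output/audio_out.mp3, and the Dockerfile never creates an /output directory. Whether that is the reason no file appears is an assumption, but a sketch of guarding against it could look like this: create the parent directory before converting, then verify afterwards that a non-empty file exists. Both function names are hypothetical.

```python
from pathlib import Path


def prepare_output_path(output_path):
    """Make sure the directory that will hold the output file exists."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


def output_was_written(output_path):
    """Return True only if the output file exists and is not empty."""
    path = Path(output_path)
    return path.is_file() and path.stat().st_size > 0
```

Checking the result this way also catches the case where the tool prints a success message without having written anything.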
The voice gets distorted when the audio is split into chunks, resulting in a robotic voice in the output.
Setting to True or False always results in the same output.wav. How can I get the voice-swapped output combined with the original instrumental? Thanks. This has been the easiest tool to use so far.
Describe the bug
During infer an error/warning is reported, " '<' not supported between instances of 'str' and 'float' "
and no output file is written despite the output saying it is.
To Reproduce
Steps to reproduce the behavior:
Run the command
PS M:\LLMs\tts\RVC_CLI> .\env\python.exe main.py infer `
>> --index_path '.\rvcs\test0.index' `
>> --pth_path '.\rvcs\test0.pth' `
>> --input_path '.\output.wav' `
>> --output_path 'M:\LLMs\tts\RVC_CLI\output-rvc.wav'
<All keys matched successfully>
'<' not supported between instances of 'str' and 'float'
Conversion completed. Output file: 'M:\LLMs\tts\RVC_CLI\output-rvc.wav' in 2.22 seconds.
PS M:\LLMs\tts\RVC_CLI>
Expected behavior
An output file processed with the supplied pth/index
Desktop (please complete the following information):
Windows 11
Additional context
If I checkout tag 1.1.2 it all works
No matter what I try for the hop length value, with or without double or single quotes, it won't work. It has become very frustrating.
python main.py infer --f0up_key "0" --filter_radius "5" --index_rate "0.5" --hop_length "256" --f0method "dio" --input_path "input.wav" --output_path "output.wav" --pth_file "model.pth" --index_path "model.index" --split_audio "False" --f0autotune "False"
It was better before you changed the arguments. At least it worked, even though it sounded robotic. Now I am totally unable to use it.
I'm having some issues with the API call (internal server error). I'm assuming it's the syntax of the JSON at this point; I've messed around with it a bit, but it keeps returning an error. Here is how the JSON is currently written:
{
"f0up_key": 0,
"filter_radius": 5,
"index_rate": 0.5,
"hop_length": 256,
"f0method": "rmvpe",
"input_path": "D:\Projects\VoiceChangerAI\TestFile\testa.wav",
"output_path": "D:\Projects\VoiceChangerAI\TestFile\output.wav",
"pth_file": "LB.pth",
"index_path": "LB.index",
"split_audio": false,
}
I currently have "LB.pth" and the index in the "RVC_CLI\models" folder.
Thanks for any help - I'm total narb with this stuff >_<
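Two details of the payload above would make a strict JSON parser reject it: the backslashes in the Windows paths are not escaped, and there is a trailing comma after the last member. Whether that is the only cause of the server error is not certain, but generating the body with a JSON library avoids both. The sketch below reuses the keys and values from the report; the function name is hypothetical.

```python
import json


def build_infer_body():
    """Serialize the request parameters; json.dumps handles escaping and commas."""
    params = {
        "f0up_key": 0,
        "filter_radius": 5,
        "index_rate": 0.5,
        "hop_length": 256,
        "f0method": "rmvpe",
        # Raw strings keep the Windows paths readable in Python source.
        "input_path": r"D:\Projects\VoiceChangerAI\TestFile\testa.wav",
        "output_path": r"D:\Projects\VoiceChangerAI\TestFile\output.wav",
        "pth_file": "LB.pth",
        "index_path": "LB.index",
        "split_audio": False,
    }
    return json.dumps(params, indent=2)
```

In the serialized text each backslash appears doubled, which is what a hand-written JSON body would also need.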
Hi,
Now it's a lot better than before. The parameters work well and the quality is better than before.
However, in some places, it makes the voice sound like an old grandmother struggling to speak.
Here's my command:
python main.py infer --f0up_key "2" --filter_radius 5 --index_rate "0.1" --hop_length "25" --f0method "dio" --input_path "input.wav" --output_path "output.wav" --pth_path "rvcfinalv4-harvest-1000epochs.pth" --index_path "rvcfinalv4-harvest-1000epochs.index" --split_audio "False" --f0autotune "False"
Let me know if we can do anything to improve the voice quality.
Once again, your work is great in the RVC commandline space. Yours is the best commandline tool for RVC, better than all the ones even released by the original RVC project. So, Thank you.
Code Invocation:
curl --location 'http://127.0.0.1:8000/infer' \
--header 'Content-Type: application/json' \
--data '{
"f0up_key": 0,
"filter_radius": 2,
"index_rate": 0.5,
"hop_length": 256,
"rms_mix_rate": 0.5,
"protect": 0.5,
"f0autotune": false,
"f0method": "rmvpe",
"input_path": "/home/.../RVC_CLI/input/018b3ee3-50a3-7b40-8b02-c99d3753a8a4.mp3",
"output_path": "/home/.../RVC_CLI/output/1.wav",
"pth_path": "/home/.../RVC_CLI/logs/Alisa/Alisa.pth",
"index_path": "/home/.../RVC_CLI/logs/Alisa/added_IVF757_Flat_nprobe_1_Alisa_v2.index",
"split_audio": false,
"clean_audio": false,
"clean_strength": 0.5,
"export_format": "WAV"
}'
Result:
{
"output": "<All keys matched successfully>\nConversion completed. Output file: '/home/.../RVC_CLI/output/1.wav' in 4.14 seconds.\n",
"error": "/home/.../anaconda3/envs/rvc_cli/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.\n warnings.warn(\"torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.\")\n"
}
However, the file does not appear in the output folder.
Moreover, to make it work, I changed:
@app.post("/infer")
async def infer(request: Request):
    command = ["python", "main.py", "infer"]
    json_data = await request.json()
    command += [f"--{key}={value}" for key, value in json_data.items()]
    return execute_command(command)
Also, it would be good to format the output as follows:
When the status code is 200:
{"audio_content": "path/name.wav", "message": "..."}
For other status codes:
{"message": "...", "error": "..."}
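A minimal sketch of the response shape proposed above, assuming the endpoint knows the subprocess return code, the requested output path, and the captured stdout/stderr. The function name and the choice of 500 for failures are assumptions. It also treats a missing output file as a failure, which addresses the case where the tool reports success but nothing is written.

```python
from pathlib import Path


def format_infer_response(returncode, output_path, stdout, stderr):
    """Return (status_code, body) in the format suggested in this report."""
    message = stdout.strip()
    if returncode != 0:
        error = stderr.strip() or f"command exited with code {returncode}"
        return 500, {"message": message, "error": error}
    if not Path(output_path).is_file():
        return 500, {"message": message, "error": "output file was not created"}
    return 200, {"audio_content": str(output_path), "message": message}
```

Warnings printed to stderr, such as the weight_norm deprecation notice shown earlier, do not turn a successful run into an error with this approach.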
Bug Description
I ran install.bat, and now I am trying to run env/python.exe rvc.py or python rvc.py prerequisites. One other thing: when I executed the bat file, all the dependencies were installed system-wide on my local machine.
File "C:\Users\ESHAN\Desktop\rvctest\rvc.py", line 10, in <module>
from rvc.configs.config import Config
File "C:\Users\ESHAN\Desktop\rvctest\rvc.py", line 10, in <module>
from rvc.configs.config import Config
ModuleNotFoundError: No module named 'rvc.configs'; 'rvc' is not a package
Desktop Details:
Windows 11, NVIDIA GTX 1650
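The message "'rvc' is not a package" is consistent with Python resolving the name rvc to the script rvc.py instead of the rvc folder. A small sketch for spotting such name clashes in a project directory follows; the function name is hypothetical, and it only detects the clash rather than fixing it.

```python
from pathlib import Path


def find_shadowing(directory):
    """List module names that exist both as name.py and as a name/ folder."""
    root = Path(directory)
    return sorted(
        script.stem for script in root.glob("*.py") if (root / script.stem).is_dir()
    )
```

Renaming either the file or the folder removes the ambiguity.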
Hey guys,
I was able to get the server started and configured to work. TTS is working too after changing the locale to the short name.
So I'm getting TTS output, but the RVC output is just interference, like a continuous beep. Inference works well on the original and on the fork with the same model, so do I need a special type of model here?
I tried both the API and the CLI.
I've got Applio running on my M2 Max Mac Studio but Batch Conversion is not working. To get further information I cloned this git here and tried the CLI batch conversion, which also does not work. Single conversion works fine with CLI and Applio.
This is my single conversion cmd, which results in a working file:
python main.py infer --f0up_key "0" --filter_radius "3" --index_rate "0.8" --hop_length "64" --split_audio "True" --f0autotune "False" --f0method "rmvpe" --input_path "/Users/liam/Music/RVC/city_of_angels/hmmmh.wav" --output_path "/Users/liam/Downloads/test/test.wav" --pth_path "/Applications/RVC_Applio/logs/40k_natedogg_super/40k_natedogg_super.pth" --index_path "/Applications/RVC_Applio/logs/40k_natedogg_super/40k_natedogg_super_clean.index"
This is batch-conversion, which results in an error, no matter if rms_mix_rate and other parameters are included in the cmd or not:
python main.py batch_infer --f0up_key "0" --filter_radius "3" --index_rate "0.8" --hop_length "64" --split_audio "True" --f0autotune "False" --f0method "rmvpe" --input_folder "/Users/liam/Music/RVC/love_me_down/ValYoung" --output_folder "/Users/liam/Downloads/test" --pth_path "/Applications/RVC_Applio/logs/40k_natedogg_super/40k_natedogg_super.pth" --index_path "/Applications/RVC_Applio/logs/40k_natedogg_super/40k_natedogg_super_clean.index" --rms_mix_rate "0.0"
The conversion fails with the following error:
Inferring /Users/liam/Music/RVC/love_me_down/ValYoung/Ladada_1.wav.wav...
No supported Nvidia GPU found
Traceback (most recent call last):
File "/Users/liam/Downloads/RVC_CLI/rvc/infer/infer.py", line 229, in <module>
rms_mix_rate = float(sys.argv[12])
ValueError: could not convert string to float: 'True'
Seems like rms_mix_rate=True is sneaking in somewhere, and resulting in an error when converted to float. But where is it coming from? I removed all arguments that use True/False from the cmd, but it still ends up with this error.
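The traceback shows the value being read by position (sys.argv[12]), so any change in the order or number of forwarded arguments can put 'True' where a float is expected. A minimal sketch of reading values by name instead, with an error that names the offending option, is shown below. Both helpers are hypothetical and are not the project's code.

```python
def parse_named(argv):
    """Turn ['--name', 'value', ...] into a dict so order no longer matters."""
    if len(argv) % 2:
        raise ValueError("expected --name value pairs")
    parsed = {}
    for flag, value in zip(argv[0::2], argv[1::2]):
        if not flag.startswith("--"):
            raise ValueError(f"expected a --flag, got {flag!r}")
        parsed[flag[2:]] = value
    return parsed


def get_float(parsed, name, default):
    """Fetch a float option by name and report which option was malformed."""
    raw = parsed.get(name, default)
    try:
        return float(raw)
    except ValueError:
        raise ValueError(f"--{name} must be a number, got {raw!r}") from None
```

With name-based lookup, a shifted argument produces a message that points at the wrong option rather than a bare float conversion error.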
I'm still having issues with the API. The main infer command worked with the same input as the JSON below. I tried messing around with the format a bit, but no luck.
JSON:
{
"f0up_key": "2",
"filter_radius": "5",
"index_rate": "0.1",
"hop_length": "25",
"f0method": "dio",
"input_path": "D:\Projects\VoiceChangerAI\TestFile\testa.wav",
"output_path": "D:\Projects\VoiceChangerAI\TestFile\output_API.wav",
"pth_path": "C:\Users\KCLEE\Documents\GitHub\models\LenvalBrown.pth",
"index_path": "C:\Users\KCLEE\Documents\GitHub\models\LenvalBrown.index",
"split_audio": "false",
"f0autotune": "false"
}
the console spits out this:
Traceback (most recent call last):
File "C:\Users\KCLEE\Documents\GitHub\RVC_CLI\main.py", line 953, in <module>
main()
File "C:\Users\KCLEE\Documents\GitHub\RVC_CLI\main.py", line 947, in main
run_api_script()
File "C:\Users\KCLEE\Documents\GitHub\RVC_CLI\main.py", line 385, in run_api_script
subprocess.run(command)
File "C:\Users\KCLEE\Documents\GitHub\RVC_CLI\env\lib\subprocess.py", line 507, in run
stdout, stderr = process.communicate(input, timeout=timeout)
File "C:\Users\KCLEE\Documents\GitHub\RVC_CLI\env\lib\subprocess.py", line 1126, in communicate
self.wait()
File "C:\Users\KCLEE\Documents\GitHub\RVC_CLI\env\lib\subprocess.py", line 1189, in wait
return self._wait(timeout=timeout)
File "C:\Users\KCLEE\Documents\GitHub\RVC_CLI\env\lib\subprocess.py", line 1486, in _wait
result = _winapi.WaitForSingleObject(self._handle,
Client side I get a "error 400 - bad request"
Hello, I didn't find any other contact details, so I'll describe the problem here.
I'm trying to write an application in C# (WinForms) using your solution. I'm not a strong programmer beyond simple C# applications, but I'm interested in the functionality of the RVC library. I want to set up a server running RVC and send requests to it from clients on the global network. The problem is that the API server runs on a machine that has a suitable graphics card, while the machine on which I develop the application has none. To connect to my "server", where your solution will be deployed, I tried to change the server launch parameters to a local IP (192.168.x.x) and a different port, but failed.
Can I somehow use launch parameters (for example the main.py api [-ip or -port] file) to change the parameters of the API server (uvicorn server)?
If not, is it possible to add such functionality to the "main.py api" command?
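A sketch of what such launch parameters could look like, assuming the API is served by uvicorn as mentioned above. The --host and --port options are real uvicorn command-line flags; the "api:app" module path, the function name, and the defaults (taken from the 127.0.0.1:8000 address used elsewhere in these reports) are assumptions.

```python
import argparse


def build_api_command(argv):
    """Translate optional --host/--port arguments into a uvicorn command line."""
    parser = argparse.ArgumentParser(prog="main.py api")
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8000)
    args = parser.parse_args(argv)
    # "api:app" is a placeholder for wherever the application object lives.
    return ["uvicorn", "api:app", "--host", args.host, "--port", str(args.port)]
```

Binding to 0.0.0.0 or to the machine's LAN address, rather than 127.0.0.1, is what makes the server reachable from other computers on the network.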
So first things first, infer section looks alright. moving on to training u guys can def improve some stuff on there: