parrots's Introduction

Parrots: ASR and TTS toolkit

Introduction

Parrots is an Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) toolkit that supports Chinese, English, Japanese, and other languages.

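parrots provides one-line, out-of-the-box access to speech recognition and speech synthesis models, with support for Chinese and English.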

Features

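  1. ASR: a Chinese speech recognition (ASR) model built on distilwhisper, supporting Chinese, English, and other languages
  2. TTS: a speech synthesis (TTS) model trained with GPT-SoVITS, supporting Chinese, English, Japanese, and other languages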

Install

pip install torch # or conda install pytorch
pip install -r requirements.txt
pip install parrots

or

pip install torch # or conda install pytorch
git clone https://github.com/shibing624/parrots.git
cd parrots
python setup.py install

Demo

Run examples/tts_gradio_demo.py to see the demo:

python examples/tts_gradio_demo.py

Usage

ASR(Speech Recognition)

example: examples/demo_asr.py

import os
import sys

sys.path.append('..')
from parrots import SpeechRecognition

pwd_path = os.path.abspath(os.path.dirname(__file__))

if __name__ == '__main__':
    m = SpeechRecognition()
    r = m.recognize_speech_from_file(os.path.join(pwd_path, 'tushuguan.wav'))
    print('[Info] Speech recognition result:', r)

output:

{'text': '北京图书馆'}

TTS(Speech Synthesis)

example: examples/demo_tts.py

import sys
sys.path.append('..')
import parrots
from parrots import TextToSpeech
parrots_path = parrots.__path__[0]
sys.path.append(parrots_path)

m = TextToSpeech(
    speaker_model_path="shibing624/parrots-gpt-sovits-speaker-maimai",
    speaker_name="MaiMai",
)
m.predict(
    text="你好,欢迎来北京。welcome to the city.",
    text_language="auto",
    output_path="output_audio.wav"
)

output:

Save audio to output_audio.wav

Command-line mode (CLI)

ASR and TTS tasks can be run from the command line; the code is in cli.py

> parrots -h                                    

NAME
    parrots

SYNOPSIS
    parrots COMMAND

COMMANDS
    COMMAND is one of the following:

     asr
       Entry point of asr, recognize speech from file

     tts
       Entry point of tts, generate speech audio from text

run:

pip install parrots -U
# asr example
parrots asr -h
parrots asr examples/tushuguan.wav

# tts example
parrots tts -h
parrots tts "你好,欢迎来北京。welcome to the city." output_audio.wav
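  • asr and tts are subcommands: asr is speech recognition and tts is speech synthesis; the default model is the Chinese model
  • For the usage of each subcommand, see parrots asr -h
  • In the example above, examples/tushuguan.wav is the audio_file_path argument of the asr command: the input audio file (required)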

Release Models

ASR

TTS

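| Speaker name | Name (Chinese) | Character | Voice description | Language |
|---|---|---|---|---|
| KuileBlanc | 葵·勒布朗 | lady | standard American female voice | en |
| LongShouRen | 龙守仁 | gentleman | standard American male voice | en |
| MaiMai | 卖卖 | singing female anchor | singing female streamer voice | zh |
| XingTong | 星瞳 | singing ai girl | lively female voice | zh |
| XuanShen | 炫神 | game male anchor | male game streamer voice | zh |
| KusanagiNene | 草薙寧々 | loli | young schoolgirl voice | ja |

| Speaker name | Name (Chinese) | Character | Voice description | Language |
|---|---|---|---|---|
| MaiMai | 卖卖 | singing female anchor | singing female streamer voice | zh |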

Contact

  • Issues (suggestions): GitHub issues
  • Email: xuming: [email protected]
  • WeChat: add my WeChat ID xuming624 to join the Python-NLP discussion group; include the note: name-company-NLP

Citation

If you use parrots in your research, please cite it as follows:

@misc{parrots,
  title={parrots: ASR and TTS Tool},
  author={Ming Xu},
  year={2024},
  howpublished={\url{https://github.com/shibing624/parrots}},
}

License

Licensed under The Apache License 2.0 and free for commercial use. Please include a link to parrots and the license in your product documentation.

Contribute

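The code is still rough. If you improve it, contributions back to this project are welcome. Before submitting, note the following two points:

  • Add corresponding unit tests under tests
  • Run all unit tests with python -m pytest and make sure every test passes

After that you can submit a PR.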

parrots's People

Contributors

daxiongpro, nuck555, shibing624, sonictl

parrots's Issues

AttributeError: Can't get attribute 'HParams' on <module 'utils' from 'C:\\..\\Python39\\lib\\site-packages\\utils\\__init__.py'>

Following your steps, I get the error Can't get attribute 'HParams'.
I have also confirmed that examples/tts_gradio_demo.py includes sys.path.append(parrots_path).

python examples/tts_gradio_demo.py

2024-05-20 15:17:29.265 | DEBUG | parrots.tts:__init__:305 - Use device: cuda
C:\Users\88692\AppData\Local\Programs\Python\Python39\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
2024-05-20 15:17:33.210 | INFO | parrots.tts:__init__:319 - Load pretrained parrots speaker: shibing624/parrots-gpt-sovits-speaker-maimai
Fetching 6 files: 100%|█████████████████████████████| 6/6 [00:00<?, ?it/s]
2024-05-20 15:17:33.465 | DEBUG | parrots.tts:__init__:332 - Reference speaker config: {'reference_wav': 'ref.wav', 'speaker': 'MaiMai', 'character': 'singing female anchor', 'reference_language': 'zh', 'reference_prompt': '那我们,唠也唠了这么久了唠了有十几分钟了我们要不来唱唱,唱唱歌,想听什么 ,今天想听什么。'}, loaded from C:\Users\88692\.cache\huggingface\hub\models--shibing624--parrots-gpt-sovits-speaker-maimai\snapshots\369f6de40db8590be8eb1627d7f55fbbdb4fa63b\MaiMai\config.json
Traceback (most recent call last):
  File "C:\Users\88692\Desktop\code\myself\parrots\test.py", line 8, in <module>
    m = TextToSpeech(
  File "C:\Users\88692\Desktop\code\myself\parrots\parrots\tts.py", line 342, in __init__
    sovits_dict = torch.load(sovits_model_path, map_location="cpu")
  File "C:\Users\88692\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\serialization.py", line 1026, in load
    return _load(opened_zipfile,
  File "C:\Users\88692\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\serialization.py", line 1438, in _load
    result = unpickler.load()
  File "C:\Users\88692\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\serialization.py", line 1431, in find_class
    return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'HParams' on <module 'utils' from 'C:\Users\88692\AppData\Local\Programs\Python\Python39\lib\site-packages\utils\__init__.py'>
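
One plausible reading of this traceback is that the checkpoint was pickled with a reference to utils.HParams, and at load time the name utils resolved to an unrelated package in site-packages. The sketch below reproduces that mechanism with the standard pickle module only; it does not use torch or parrots, and the HParams class and both utils modules here are hypothetical stand-ins.

```python
import pickle
import sys
import types


class HParams:
    """Minimal stand-in for a config class stored inside a checkpoint."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


# Pretend HParams lives in a top-level module called `utils`.
HParams.__module__ = "utils"
HParams.__qualname__ = "HParams"
project_utils = types.ModuleType("utils")
project_utils.HParams = HParams

# An unrelated module that shares the name but has no HParams attribute.
unrelated_utils = types.ModuleType("utils")

previous = sys.modules.get("utils")
try:
    sys.modules["utils"] = project_utils
    payload = pickle.dumps(HParams(sampling_rate=32000))

    # Pickle stores only the names "utils" and "HParams", so the lookup
    # now lands on the wrong module and fails.
    sys.modules["utils"] = unrelated_utils
    try:
        pickle.loads(payload)
        shadow_error = None
    except AttributeError as exc:
        shadow_error = str(exc)

    # With the intended module back in place, the same bytes load fine.
    sys.modules["utils"] = project_utils
    restored = pickle.loads(payload)
finally:
    # Leave the interpreter's module table as we found it.
    if previous is None:
        sys.modules.pop("utils", None)
    else:
        sys.modules["utils"] = previous

print(shadow_error)
print(restored.sampling_rate)
```

If this is indeed the cause, checking which file `import utils; print(utils.__file__)` points to in the failing environment would confirm whether another package is shadowing the expected module.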

TypeError: _sanitize_parameters() got an unexpected keyword argument 'low_cpu_mem_usage'

Describe the Question

Please provide a clear and concise description of what the question is.
---------------------------------------------------------------------------

I hit this error when running the example. How can I fix it?

TypeError Traceback (most recent call last)
Cell In[1], line 5
1 from parrots import SpeechRecognition
4 if __name__ == '__main__':
----> 5 m = SpeechRecognition('/app/pretrained_models/Belle-distilwhisper-large-v2-zh', low_cpu_mem_usage=False)
6 r = m.recognize_speech_from_file('./output.wav')
7 print('[提示] 语音识别结果:', r)

File /app/project/ASR-TTS/parrots/asr.py:81, in SpeechRecognition.__init__(self, model_name_or_path, use_cuda, cuda_device, max_new_tokens, chunk_length_s, batch_size, torch_dtype, use_flash_attention_2, language, **kwargs)
78 self.model.to(self.device)
80 self.processor = AutoProcessor.from_pretrained(model_name_or_path)
---> 81 self.pipe = pipeline(
82 "automatic-speech-recognition",
83 model=self.model,
84 tokenizer=self.processor.tokenizer,
85 feature_extractor=self.processor.feature_extractor,
86 device=self.device,
87 torch_dtype=torch_dtype,
88 max_new_tokens=max_new_tokens,
89 batch_size=batch_size,
90 chunk_length_s=chunk_length_s,
91 **kwargs
92 )
93 if language == 'zh':
94 self.pipe.model.config.forced_decoder_ids = (
95 self.pipe.tokenizer.get_decoder_prompt_ids(
96 language=language,
97 task="transcribe"
98 )
99 )

File ~/miniconda3/envs/speech_recognition/lib/python3.9/site-packages/transformers/pipelines/__init__.py:1108, in pipeline(task, model, config, tokenizer, feature_extractor, image_processor, framework, revision, use_fast, token, device, device_map, torch_dtype, trust_remote_code, model_kwargs, pipeline_class, **kwargs)
1105 if device is not None:
1106 kwargs["device"] = device
-> 1108 return pipeline_class(model=model, framework=framework, task=task, **kwargs)

File ~/miniconda3/envs/speech_recognition/lib/python3.9/site-packages/transformers/pipelines/automatic_speech_recognition.py:220, in AutomaticSpeechRecognitionPipeline.__init__(self, model, feature_extractor, tokenizer, decoder, device, torch_dtype, **kwargs)
217 else:
218 self.type = "ctc"
--> 220 super().__init__(model, tokenizer, feature_extractor, device=device, torch_dtype=torch_dtype, **kwargs)

File ~/miniconda3/envs/speech_recognition/lib/python3.9/site-packages/transformers/pipelines/base.py:894, in Pipeline.__init__(self, model, tokenizer, feature_extractor, image_processor, modelcard, framework, task, args_parser, device, torch_dtype, binary_output, **kwargs)
892 self._batch_size = kwargs.pop("batch_size", None)
893 self._num_workers = kwargs.pop("num_workers", None)
--> 894 self._preprocess_params, self._forward_params, self._postprocess_params = self._sanitize_parameters(**kwargs)
896 # Pipelines calling generate: if the tokenizer has a pad token but the model doesn't, set it in the
897 # forward params so that generate is aware of the pad token.
898 if (
899 self.tokenizer is not None
900 and self.model.can_generate()
901 and self.tokenizer.pad_token_id is not None
902 and self.model.generation_config.pad_token_id is None
903 ):

TypeError: _sanitize_parameters() got an unexpected keyword argument 'low_cpu_mem_usage'
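
The traceback shows SpeechRecognition.__init__ forwarding its **kwargs into pipeline(...), which forwards them again until a function that does not know low_cpu_mem_usage rejects it. The sketch below reproduces that forwarding pattern with plain Python stand-ins; sanitize_parameters, build_pipeline, and split_kwargs are hypothetical names for illustration, not the transformers or parrots API.

```python
import inspect


def sanitize_parameters(max_new_tokens=None, chunk_length_s=None):
    """Stand-in for a hook that accepts only a fixed set of options."""
    return {"max_new_tokens": max_new_tokens, "chunk_length_s": chunk_length_s}


def build_pipeline(model, **kwargs):
    """Stand-in for a factory that forwards every extra keyword unchanged."""
    return {"model": model, "params": sanitize_parameters(**kwargs)}


def split_kwargs(func, kwargs):
    """Split kwargs into those named in func's signature and the rest."""
    accepted = inspect.signature(func).parameters
    known = {k: v for k, v in kwargs.items() if k in accepted}
    extra = {k: v for k, v in kwargs.items() if k not in accepted}
    return known, extra


user_kwargs = {"max_new_tokens": 128, "low_cpu_mem_usage": False}

# Forwarding everything blindly fails on the keyword the callee does not know.
try:
    build_pipeline("demo-model", **user_kwargs)
    forwarding_error = None
except TypeError as exc:
    forwarding_error = str(exc)

# Forwarding only the recognised keywords succeeds.
known, extra = split_kwargs(sanitize_parameters, user_kwargs)
pipe = build_pipeline("demo-model", **known)

print(forwarding_error)
print(extra)
```

In the reported call, the simplest thing to try is probably dropping low_cpu_mem_usage=False from the SpeechRecognition(...) constructor, since in transformers that option normally belongs to model loading rather than to the pipeline.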

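Low speech-to-text recognition accuracy

Environment:
Windows 10 Pro

Problem:
After setting up the environment, I ran the demo with the bundled example and with my own audio:

Example:
[image]

With my own audio the result is the same: only the first syllable is recognized and nothing after it.

Goal:
The current transcription accuracy is clearly a problem. Can it be improved further?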

AttributeError

import parrots
text = parrots.speech_recognition_from_file('./16k.wav')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'parrots' has no attribute 'speech_recognition_from_file'

How to solve this problem? Thanks.

Does this TTS require an online service (a network connection)?

Describe the bug

Please provide a clear and concise description of what the bug is. If applicable, add screenshots to help explain your problem, especially for visualization related problems.

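No sound output when running the official example

I have been looking for a way to convert text to speech and found this project. First, thanks to the author for the work, but I ran into a problem when using it.

Test code: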

import sys

sys.path.append('..')
from parrots import TextToSpeech

if __name__ == '__main__':
    m = TextToSpeech()
    # say text
    m.speak('北京图书馆')

Output:

2023-03-08 21:39:16.605 | DEBUG    | parrots.tts:speak:66 - ['bei3', 'jing1', 'tu2', 'shu1', 'guan3']

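But no sound is played. On the same Windows machine, other text-to-speech projects do produce sound.

Speech sounds too mechanical

From my trial, the current text-to-speech output sounds too mechanical: words are emitted at essentially fixed time intervals. Have you considered using deep learning to make the intonation more natural?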

Keras version?

How can I fix this error? Is my Keras version too old, or is it something else?
Error message:
Traceback (most recent call last):
File "paddle_asr.py", line 25, in <module>
test_parrots("/data/wav_ocr/2022103000000012/")
File "paddle_asr.py", line 22, in test_parrots
r = m.recognize_speech_from_file(input_path+wav)
File "/root/anaconda3/envs/noise_env/lib/python3.6/site-packages/parrots/asr.py", line 197, in recognize_speech_from_file
return self.recognize_speech(signal, fs)
File "/root/anaconda3/envs/noise_env/lib/python3.6/site-packages/parrots/asr.py", line 178, in recognize_speech
self.check_initialized()
File "/root/anaconda3/envs/noise_env/lib/python3.6/site-packages/parrots/asr.py", line 66, in check_initialized
self.initialize()
File "/root/anaconda3/envs/noise_env/lib/python3.6/site-packages/parrots/asr.py", line 53, in initialize
self._model.load_weights(self.model_path)
File "/root/anaconda3/envs/noise_env/lib/python3.6/site-packages/tensorflow/python/keras/engine/network.py", line 1516, in load_weights
saving.load_weights_from_hdf5_group(f, self.layers)
File "/root/anaconda3/envs/noise_env/lib/python3.6/site-packages/tensorflow/python/keras/engine/saving.py", line 772, in load_weights_from_hdf5_group
original_keras_version = f.attrs['keras_version'].decode('utf8')
AttributeError: 'str' object has no attribute 'decode'
Minimal code used for the performance test:
import os
import time

from parrots import SpeechRecognition, Pinyin2Hanzi

start_time = time.time()


def test_parrots(input_path):
    # Build both models once, then reuse them for every file.
    m = SpeechRecognition()
    n = Pinyin2Hanzi()
    for wav in os.listdir(input_path):
        if wav.endswith(".wav"):
            # Speech -> pinyin, then pinyin -> Chinese characters.
            r = m.recognize_speech_from_file(os.path.join(input_path, wav))
            text = n.pinyin_2_hanzi(r)
    print("parrots-ocr-finished")


test_parrots("/data/wav_ocr/2022103000000012/")
end_time = time.time()
print(end_time - start_time)
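
The failing line in the traceback calls .decode('utf8') on a value that is already a str. A common explanation, stated here as an assumption, is that a newer h5py release returns string attributes as str while the older Keras loader expects bytes, in which case pinning an older h5py is the usual workaround. The sketch below only illustrates the type mismatch; as_text is a hypothetical helper and "2.2.4" is a made-up version string.

```python
def as_text(value, encoding="utf8"):
    """Return value as str, decoding only when it is actually bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Calling .decode on a str reproduces the error from the traceback.
try:
    "2.2.4".decode("utf8")
    decode_error = None
except AttributeError as exc:
    decode_error = str(exc)

print(decode_error)
print(as_text(b"2.2.4"), as_text("2.2.4"))
```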

Does distil-whisper support Chinese? Is the quality usable?

Describe the bug

Please provide a clear and concise description of what the bug is. If applicable, add screenshots to help explain your problem, especially for visualization related problems.

pretrained model

Great job! Thanks for sharing! Where is the pretrained model? Also, the syllables.zip file is not available. Looking forward to your reply!

Can't get attribute 'HParams' on <module 'utils'

args: Namespace(speaker_model='shibing624/parrots-gpt-sovits-speaker-maimai', speaker_name='MaiMai', device='cpu', half=False, text='你好,欢迎来北京。welcome to the city.', lang='auto', output_path='output_audio.wav')
2024-03-17 01:11:46.818 | DEBUG | parrots.tts:__init__:302 - Use device: cpu
2024-03-17 01:11:49.862 | INFO | parrots.tts:__init__:316 - Load pretrained parrots speaker: shibing624/parrots-gpt-sovits-speaker-maimai
Fetching 6 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 62601.55it/s]
2024-03-17 01:11:50.537 | DEBUG | parrots.tts:__init__:329 - Reference speaker config: {'reference_wav': 'ref.wav', 'speaker': 'MaiMai', 'character': 'singing female anchor', 'reference_language': 'zh', 'reference_prompt': '那我们,唠也唠了这么久了唠了有十几分钟了我们要不来唱唱,唱唱歌,想听什么,今天想听什么。'}, loaded from /Users/kevinlinpr/.cache/huggingface/hub/models--shibing624--parrots-gpt-sovits-speaker-maimai/snapshots/369f6de40db8590be8eb1627d7f55fbbdb4fa63b/MaiMai/config.json
Traceback (most recent call last):
  File "/Users/kevinlinpr/AI-Waifu-Vtuber/parrots_test.py", line 23, in <module>
    m = TextToSpeech(
        ^^^^^^^^^^^^^
  File "/opt/homebrew/anaconda3/lib/python3.11/site-packages/parrots/tts.py", line 339, in __init__
    sovits_dict = torch.load(sovits_model_path, map_location="cpu")
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/anaconda3/lib/python3.11/site-packages/torch/serialization.py", line 1026, in load
    return _load(opened_zipfile,
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/anaconda3/lib/python3.11/site-packages/torch/serialization.py", line 1438, in _load
    result = unpickler.load()
             ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/anaconda3/lib/python3.11/site-packages/torch/serialization.py", line 1431, in find_class
    return super().find_class(mod_name, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: Can't get attribute 'HParams' on <module 'utils' (<_frozen_importlib_external.NamespaceLoader object at 0x2d761c4d0>)>

Question about output quality when using the default model directly

Describe the Question

Please provide a clear and concise description of what the question is.

Hello, when I plug in a model that I trained separately, the result is basically the same as in GPT-SoVITS and the quality is very good.
One problem, though: when I test the GPT-SoVITS default model inside GPT-SoVITS itself, it sounds good,
but when the same default model is loaded through parrots the result is different. Does the code still need work somewhere?
Both sovits_model_path and gpt_model_path point to the default models.
m = TextToSpeech(
    bert_model_path=pwd_path + "/models/gpts_pretrained_models/chinese-roberta-wwm-ext-large",
    hubert_model_path=pwd_path + "/models/gpts_pretrained_models/chinese-hubert-base",
    sovits_model_path=sovits_model_path,
    gpt_model_path=gpt_model_path,
    speaker_model_path="usermodels",
    speaker_name="{username}".format(username=username),
    device='cuda',
    half=True,
)

Download problem

Trying to resume download...
pytorch_model.bin: 19%|██████████ | 126M/651M [01:30<05:12, 1.68MB/s]
pytorch_model.bin: 26%|█████████████▋ | 168M/651M [01:33<22:34, 357kB/s]
Can this be downloaded in advance? If so, how?

module 'tensorflow' has no attribute 'get_default_graph'

"C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\python.exe" "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.1\plugins\python-ce\helpers\pydev\pydevd.py" --multiproc --qt-support=auto --client 127.0.0.1 --port 57254 --file C:/Users/16413/Documents/GitHub/LostXmas/seq2seq/data/mining/SpeechRec/sr.py
pydev debugger: process 64336 is connecting

Connected to pydev debugger (build 201.7846.77)
2020-07-10 18:18:36.025768: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Using TensorFlow backend.
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
2020-07-10 18:18:42,676 - C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\pinyin2hanzi.py - DEBUG - Loaded: C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\data\pinyin2hanzi\pinyin_hanzi_dict.txt, size: 1421
2020-07-10 18:18:42,676 - C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\pinyin2hanzi.py - DEBUG - Loaded: C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\data\pinyin2hanzi\char_idx.txt, size: 5832
2020-07-10 18:18:43,380 - C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\pinyin2hanzi.py - DEBUG - Loaded: C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\data\pinyin2hanzi\word_idx.txt, size: 568646
2020-07-10 18:18:43,630 - C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\pinyin2hanzi.py - DEBUG - Loaded: C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\data\pinyin2hanzi\dic_pinyin.txt, size: 96117
Backend TkAgg is interactive backend. Turning interactive mode on.
2020-07-10 18:18:46.081700: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-10 18:18:47.311791: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.335GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2020-07-10 18:18:47.312512: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-10 18:18:47.357397: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-10 18:18:47.396445: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-10 18:18:47.404674: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-10 18:18:47.411615: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-10 18:18:47.458602: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-10 18:18:47.689152: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-10 18:18:47.690095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-10 18:18:47.691017: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-10 18:18:47.701188: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x20fce644d50 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-10 18:18:47.701708: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-10 18:18:47.702573: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.335GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2020-07-10 18:18:47.703142: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-10 18:18:47.703421: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-10 18:18:47.703701: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-10 18:18:47.703979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-10 18:18:47.704255: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-10 18:18:47.704539: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-10 18:18:47.704827: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-10 18:18:47.705628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-10 18:18:48.633762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-10 18:18:48.634075: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2020-07-10 18:18:48.634244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2020-07-10 18:18:48.635184: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4602 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-07-10 18:18:48.639308: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x20f87a4a760 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-10 18:18:48.639690: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2060, Compute Capability 7.5
2020-07-10 18:18:49,452 - C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\speech_recognition.py - DEBUG - Loading pinyin dict cost 0.016 seconds.
2020-07-10 18:18:49,514 - C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\speech_recognition.py - DEBUG - Loading model cost 0.063 seconds.
2020-07-10 18:18:49,514 - C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\speech_recognition.py - DEBUG - Speech recognition model has been built ok.
Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.1\plugins\python-ce\helpers\pydev\pydevd.py", line 1438, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.1\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/16413/Documents/GitHub/LostXmas/seq2seq/data/mining/SpeechRec/sr.py", line 4, in <module>
    text = parrots.recognize_speech_from_file('voice.wav')
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\speech_recognition.py", line 203, in recognize_speech_from_file
    return self.recognize_speech(signal, fs)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\speech_recognition.py", line 184, in recognize_speech
    self.check_initialized()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\speech_recognition.py", line 69, in check_initialized
    self.initialize()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\site-packages\parrots\speech_recognition.py", line 64, in initialize
    self.graph = tf.get_default_graph()
AttributeError: module 'tensorflow' has no attribute 'get_default_graph'

Process finished with exit code 1
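
The log shows a TensorFlow build with CUDA 10.1 libraries, which suggests TensorFlow 2.x, where get_default_graph is no longer a top-level attribute and is exposed as tf.compat.v1.get_default_graph instead. The sketch below shows one way calling code could tolerate both layouts; it uses SimpleNamespace stand-ins rather than a real TensorFlow import, and get_default_graph_fn is a hypothetical helper, not part of parrots.

```python
from types import SimpleNamespace


def get_default_graph_fn(tf_module):
    """Return a get_default_graph callable from either API layout."""
    fn = getattr(tf_module, "get_default_graph", None)
    if fn is None:
        # TF 2.x layout: the TF 1.x function lives under compat.v1.
        fn = tf_module.compat.v1.get_default_graph
    return fn


# Stand-ins that mimic only the attribute layout of each major version.
tf1_like = SimpleNamespace(get_default_graph=lambda: "graph-from-top-level")
tf2_like = SimpleNamespace(
    compat=SimpleNamespace(
        v1=SimpleNamespace(get_default_graph=lambda: "graph-from-compat-v1")
    )
)

print(get_default_graph_fn(tf1_like)())
print(get_default_graph_fn(tf2_like)())
```

Installing a TensorFlow 1.x release that matches the version this parrots release was written against would be the other obvious route.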

Error when calling m = TextToSpeech(speaker_model_path='shibing624/parrots-gpt-sovits-speaker-maimai', speaker_name='MaiMai')

The very first call fails. My PyTorch version is 2.2.1+cu121; is that too new?

Cell In[3], line 1
----> 1 m = TextToSpeech(speaker_model_path='shibing624/parrots-gpt-sovits-speaker-maimai', speaker_name='MaiMai')

File e:\bomb\proj\python\BarkVoice\parrots\tts.py:342, in TextToSpeech.__init__(self, bert_model_path, hubert_model_path, sovits_model_path, gpt_model_path, speaker_model_path, speaker_name, device, half)
339 raise ValueError("sovits_model_path, gpt_model_path or speaker_model_path must be provided")
341 # SoVITS
--> 342 sovits_dict = torch.load(sovits_model_path, map_location="cpu")
343 hps = DictToAttrRecursive(sovits_dict["config"])
344 logger.debug(f"SoVITS config: {hps}")

File d:\CondaEnv\envs\normal\lib\site-packages\torch\serialization.py:1026, in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
1024 except RuntimeError as e:
1025 raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
-> 1026 return _load(opened_zipfile,
1027 map_location,
1028 pickle_module,
1029 overall_storage=overall_storage,
1030 **pickle_load_args)
1031 if mmap:
1032 raise RuntimeError("mmap can only be used with files saved with "
1033 "`torch.save(_use_new_zipfile_serialization=True), "
1034 "please torch.save your checkpoint with this option in order to use mmap.")

File d:\CondaEnv\envs\normal\lib\site-packages\torch\serialization.py:1438, in _load(zip_file, map_location, pickle_module, pickle_file, overall_storage, **pickle_load_args)
1436 unpickler = UnpicklerWrapper(data_file, **pickle_load_args)
1437 unpickler.persistent_load = persistent_load
-> 1438 result = unpickler.load()
1440 torch._utils._validate_loaded_sparse_tensors()
1441 torch._C._log_api_usage_metadata(
1442 "torch.load.metadata", {"serialization_id": zip_file.serialization_id()}
1443 )

File d:\CondaEnv\envs\normal\lib\site-packages\torch\serialization.py:1431, in _load.<locals>.UnpicklerWrapper.find_class(self, mod_name, name)
1429 pass
1430 mod_name = load_module_mapping.get(mod_name, mod_name)
-> 1431 return super().find_class(mod_name, name)

ModuleNotFoundError: No module named 'utils'
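
Here the unpickler cannot import any module named utils at all, which is consistent with the directory that holds the expected module not being on sys.path (the README's TTS example appends parrots.__path__[0] to sys.path before constructing TextToSpeech). The sketch below reproduces that with the standard library only; demo_hparams_utils is a hypothetical module name chosen to avoid clashing with anything installed.

```python
import importlib
import os
import pickle
import sys
import tempfile

MODULE_NAME = "demo_hparams_utils"  # hypothetical stand-in for `utils`
SOURCE = (
    "class HParams:\n"
    "    def __init__(self, **kwargs):\n"
    "        self.__dict__.update(kwargs)\n"
)

with tempfile.TemporaryDirectory() as pkg_dir:
    with open(os.path.join(pkg_dir, MODULE_NAME + ".py"), "w") as f:
        f.write(SOURCE)

    # Pickle an instance while the module is importable.
    sys.path.append(pkg_dir)
    importlib.invalidate_caches()
    module = importlib.import_module(MODULE_NAME)
    payload = pickle.dumps(module.HParams(hop_length=640))

    # Simulate a fresh process where the directory is not on sys.path.
    sys.path.remove(pkg_dir)
    del sys.modules[MODULE_NAME]
    try:
        pickle.loads(payload)
        missing_error = None
    except ModuleNotFoundError as exc:
        missing_error = str(exc)

    # Putting the directory back on sys.path lets the same bytes load.
    sys.path.append(pkg_dir)
    importlib.invalidate_caches()
    restored = pickle.loads(payload)

    # Clean up the interpreter state before the directory is deleted.
    sys.path.remove(pkg_dir)
    sys.modules.pop(MODULE_NAME, None)

print(missing_error)
print(restored.hop_length)
```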
