Comments (12)
Hello, I converted the model to the faster-whisper format with this command:
ct2-transformers-converter --model BELLE-2--Belle-whisper-large-v3-zh --output_dir BELLE-2--Belle-whisper-large-v3-zh-ct2 --copy_files preprocessor_config.json --quantization float16
When loading it with model = WhisperModel(model_size, device="cuda", compute_type="float16"), I get the error: Max retries exceeded with url: /openai/whisper-tiny/resolve/main/tokenizer.json. Why does it still go to huggingface.co to download this tokenizer.json, and what is the correct way to do this? Thanks!
from belle.
+1
From the results above, the likely cause is that belle-whisper was run without VAD segmentation, so recognition always used the maximum 30-second windows, which hurts the results somewhat.
We suggest converting belle-whisper to the faster-whisper model format and running inference with the faster-whisper framework, which has a built-in VAD module. Both speed and quality are then reasonably assured.
How do I convert belle-whisper to the faster-whisper model format? Is there any documentation on this?
ct2-transformers-converter --model BELLE-2/Belle-whisper-large-v2-zh --output_dir Belle-whisper-large-v2-ct2 --copy_files preprocessor_config.json --quantization int8_float32
https://opennmt.net/CTranslate2/quantization.html#quantize-on-model-conversion
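On the tokenizer.json download error mentioned above: faster-whisper treats a model argument that is not an existing local directory as a Hugging Face Hub ID and tries to fetch it, so pointing WhisperModel at the converted output directory avoids the Hub lookup entirely. A minimal sketch of that resolution logic, assuming the converted files are on disk (the helper name is hypothetical, not part of faster-whisper):

```python
import os

def resolve_model(path_or_id: str) -> str:
    """Return an absolute local path when the converted model directory
    exists on disk; otherwise fall back to treating it as a Hub model ID.
    Illustrative helper -- WhisperModel itself accepts either form."""
    if os.path.isdir(path_or_id):
        return os.path.abspath(path_or_id)
    return path_or_id
```

Usage would then be: model = WhisperModel(resolve_model("Belle-whisper-large-v2-ct2"), device="cuda", compute_type="float16"), which loads from the local conversion output when present instead of reaching for huggingface.co.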
But doesn't whisper have VAD by default? Do you mean VAD was removed in belle-whisper?
You probably mean timestamps. The belle-whisper fine-tuning did not further optimize timestamps; if you need them, you have to enable them explicitly at inference time. The faster-whisper framework has VAD and segments audio better, so we recommend calling belle-whisper through faster-whisper.
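Enabling VAD and timestamps at inference time with faster-whisper looks roughly like the following. The vad_filter, vad_parameters, and word_timestamps arguments are real transcribe() options; the helper function itself is just an illustrative sketch of the settings suggested in this thread.

```python
# Illustrative helper collecting the transcribe() options suggested above:
# faster-whisper's built-in Silero VAD plus word-level timestamps.
def transcribe_kwargs(language="zh", min_silence_ms=500):
    return dict(
        language=language,
        vad_filter=True,  # enable the built-in Silero VAD
        vad_parameters=dict(min_silence_duration_ms=min_silence_ms),
        word_timestamps=True,  # request per-word timestamps
    )

# Usage (model loading omitted):
# segments, info = model.transcribe(audio_file, **transcribe_kwargs())
```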
Thanks a lot, I'll give it a try.
On my side, after converting v2 and v3 to faster-whisper models, VAD doesn't seem to work either.
Name: whisperx
Version: 3.1.2
Name: faster-whisper
Version: 1.0.1
Test video: https://www.youtube.com/watch?v=we8vNy6DYMI
v2 still occasionally produces garbled output.
With v3, even with VAD enabled, the output is still one segment every 30 s.
from faster_whisper import WhisperModel

model = WhisperModel(model_size, device="cuda", compute_type="float16")
# Rebuild the mel filters with 128 bins for the large-v3 checkpoint
model.feature_extractor.mel_filters = model.feature_extractor.get_mel_filters(model.feature_extractor.sampling_rate, model.feature_extractor.n_fft, n_mels=128)
segments, info = model.transcribe(file, vad_filter=True, vad_parameters=dict(min_silence_duration_ms=500), language=language)
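The mel-filter patch in the snippet above exists because the Whisper generations use different mel-filterbank sizes: per the public model configs, large-v3 moved from 80 to 128 mel bins. Recent faster-whisper releases appear to pick this up from the preprocessor_config.json copied at conversion time, which would make the manual patch unnecessary, but that is an assumption worth verifying against your installed version. A minimal sketch of the mismatch check:

```python
# Mel-filterbank size per Whisper generation (assumption based on the
# public model configs: large-v3 uses 128 mel bins, earlier models 80).
N_MELS = {"large-v2": 80, "large-v3": 128}

def needs_mel_patch(model_version: str, current_n_mels: int = 80) -> bool:
    # True when the loaded feature extractor's mel count does not match
    # what the checkpoint expects -- the situation the patch works around.
    return N_MELS[model_version] != current_n_mels
```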
How did you do the conversion? I tried the command line myself and it didn't work.
Hi, I'm now running into the same problem: after converting to faster-whisper, setting VAD has no effect and segments are still 30 s. Did you manage to solve it?
Use whisperx; setting chunk_size lets you cap the maximum length of the VAD splits.
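The idea behind a chunk_size cap can be sketched as greedily merging adjacent VAD speech regions into transcription windows no longer than the cap. This is an illustrative sketch of the technique, not whisperx's actual implementation:

```python
def merge_vad_segments(segments, chunk_size=30.0):
    """Greedily merge adjacent VAD speech regions (start, end) in seconds
    into transcription windows no longer than `chunk_size` seconds."""
    chunks = []
    cur_start, cur_end = None, None
    for start, end in segments:
        if cur_start is None:
            cur_start, cur_end = start, end
        elif end - cur_start <= chunk_size:
            cur_end = end  # extend the current window
        else:
            chunks.append((cur_start, cur_end))
            cur_start, cur_end = start, end
        # a single speech region longer than chunk_size is split hard
        while cur_end - cur_start > chunk_size:
            chunks.append((cur_start, cur_start + chunk_size))
            cur_start += chunk_size
    if cur_start is not None:
        chunks.append((cur_start, cur_end))
    return chunks
```

With chunk_size=30, speech regions at (0, 10), (12, 25), (27, 40) merge into (0, 25) and (27, 40): the first two fit in one window, the third would push it past 30 s and starts a new one.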
Related Issues (20)
- Dataset for VL multimodal models HOT 1
- Belle-whisper-large-v2-zh fails to load HOT 1
- Why is there a special character (�) before the recognized text, and how can it be removed? HOT 1
- Can transformers-based Chinese transcription add punctuation? HOT 2
- Is there a WeChat group? HOT 2
- Hi, thanks for your great work! But where is the code to train Belle-whisper-larger-v2-zh and Belle-distilwhisper-large-v2-zh? HOT 1
- About sentence splitting in the Belle-whisper-large-v2-zh model: was it trained on timestamped, sentence-split data? HOT 3
- Is there a ggml model? HOT 1
- The mirror doesn't work
- Model inference speed is too slow (positively related to max_new_tokens length)
- Generating high-quality domain-specific datasets
- can we fine-tune on the belle-whisper model HOT 1
- belle-whisper model takes much more time even after conversion with CTranslate2 HOT 1
- Chinese performance of large-v3-zh has gotten worse HOT 2
- expect 'max_length' in BELLE-2/Belle-whisper-large-v3-zh/config.json when converting to ggml HOT 1
- After CTranslate2 conversion, large-v3-zh recognizes Chinese audio but the output is English HOT 3
- Has anyone tried combining belle-whisper-large-v3-zh with the so-vits-svc task? So far I've only seen work combining large-v3 with so-vits-svc HOT 1
- ValueError: Could not load model BELLE-2/Belle-whisper-large-v3-zh with any of the following classes:
- Issues when calling the model with whisperx