Comments (6)
Setting chunk-size=-1 and left-context-frames=-1 does not mean 'non-streaming'. It only means the model gets full context; the model is still 'streaming', i.e. the convolutions are still causal. Exporting the model with the non-streaming script removes the ability to use the cache, which is the whole point of training a streaming model. In my testing, setting chunk-size=512 and left-context-frames=512 gives the best WER (not surprising, given that context is king), if WER is what you care about, while still keeping the ability to 'stream' (just not in real time).
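To make the role of the two options concrete, here is a minimal sketch in plain Python of which frames each frame could see under chunked decoding. It is an illustration under simplifying assumptions, not icefall's implementation: the function name `visible_frames` is hypothetical, the units are idealized encoder frames, and only attention-style context is modelled (the causal convolutions mentioned above are not).

```python
from typing import List


def visible_frames(
    num_frames: int, chunk_size: int, left_context_frames: int
) -> List[List[int]]:
    """For each frame index, list the frame indices it may look at.

    chunk_size: positive chunk length, or -1 for "one chunk covering everything".
    left_context_frames: frames visible before the current chunk, or -1 for all.
    (Hypothetical helper; simplified semantics for illustration only.)
    """
    result = []
    for i in range(num_frames):
        if chunk_size < 0:
            # Full context: every frame sees the whole utterance.
            start, end = 0, num_frames
        else:
            chunk_start = (i // chunk_size) * chunk_size
            # A frame sees its own chunk up to the chunk boundary...
            end = min(chunk_start + chunk_size, num_frames)
            # ...plus a limited (or unlimited) amount of history.
            if left_context_frames < 0:
                start = 0
            else:
                start = max(0, chunk_start - left_context_frames)
        result.append(list(range(start, end)))
    return result


if __name__ == "__main__":
    for row in visible_frames(6, chunk_size=2, left_context_frames=2):
        print(row)
    print(visible_frames(6, chunk_size=-1, left_context_frames=-1)[0])
```

In this sketch, larger values of both options simply widen each row, which matches the observation that more context tends to help WER.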
from icefall.
Hi jin, thanks for your reply! More than 100k hours are used to train the model, and training will take about 20 days due to limited GPUs. As for the streaming model, the non-streaming decoding options (chunk-size=-1, --left-context-frames=-1) show a 20%~30% relative WER improvement over the streaming decoding options (chunk-size=32, --left-context-frames=128), so I decided to export the non-streaming model.
I'm looking into this.
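"I'm looking into this."
Hello, I found that the error is caused by an if/else statement in this function
When exporting to ONNX, the input is
so the if branch of that if/else is taken. The utterances I tested the exported model with were all ten-odd seconds long, which is why the dimension-mismatch error was reported. If the dummy input is changed to x = torch.zeros(1, 1000, 80, dtype=torch.float32), the else branch is taken instead, there is no error, and recognition works normally. However, I don't yet know how to merge or split this if/else so that it works for both long and short utterances.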
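The symptom described here is what one would expect when an export records only the branch that the dummy input happened to take. In PyTorch terms, torch.jit.trace records the operations executed for the example input, while torch.jit.script compiles the control flow itself. The toy below imitates that difference in plain Python without using torch; every name in it (`LIMIT`, `short_path`, `long_path`, `forward`, `toy_trace`) is hypothetical and chosen only for illustration.

```python
from typing import Callable, List

LIMIT = 4  # hypothetical threshold separating "short" from "long" inputs


def short_path(frames: List[float]) -> List[float]:
    # Stand-in for code that is only valid for short inputs.
    if len(frames) > LIMIT:
        raise ValueError(
            f"dimension mismatch: got {len(frames)} frames, limit is {LIMIT}"
        )
    return [2.0 * f for f in frames]


def long_path(frames: List[float]) -> List[float]:
    # Stand-in for code that handles inputs of any length.
    return [2.0 * f for f in frames]


def forward(frames: List[float]) -> List[float]:
    # The branch is re-evaluated on every call (analogous to scripting).
    if len(frames) <= LIMIT:
        return short_path(frames)
    else:
        return long_path(frames)


def toy_trace(dummy: List[float]) -> Callable[[List[float]], List[float]]:
    # The branch is decided once, from the dummy input, and then frozen
    # (analogous to tracing).
    return short_path if len(dummy) <= LIMIT else long_path


if __name__ == "__main__":
    traced = toy_trace([0.0] * 2)   # exported with a short dummy input
    try:
        traced([1.0] * 10)          # later called with a long input
    except ValueError as err:
        print("traced export failed:", err)
    print(forward([1.0] * 10))      # the un-frozen branch still works
```

Changing the dummy input only changes which branch gets frozen; keeping the branch in the exported graph is what lets both short and long inputs work.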
Replied in the Next-gen Kaldi WeChat group.
The fix is
diff --git a/egs/librispeech/ASR/zipformer/scaling_converter.py b/egs/librispeech/ASR/zipformer/scaling_converter.py
index 76622fa1..346db55e 100644
--- a/egs/librispeech/ASR/zipformer/scaling_converter.py
+++ b/egs/librispeech/ASR/zipformer/scaling_converter.py
@@ -36,7 +36,7 @@ from scaling import (
     SwooshROnnx,
     Whiten,
 )
-from zipformer import CompactRelPositionalEncoding
+from zipformer import CompactRelPositionalEncoding, ChunkCausalDepthwiseConv1d
 
 
 # Copied from https://pytorch.org/docs/1.9.0/_modules/torch/nn/modules/module.html#Module.get_submodule # noqa
@@ -93,6 +93,10 @@ def convert_scaled_to_non_scaled(
             # the input changes, so we have to use torch.jit.script()
             # to replace torch.jit.trace()
             d[name] = torch.jit.script(m)
+        elif is_onnx and isinstance(m, ChunkCausalDepthwiseConv1d):
+            # to export a zipformer model that is trained with --causal=1
+            # but to export it with --chunk-size=-1 and --left-chunk-size=-1
+            d[name] = torch.jit.script(m)
 
     for k, v in d.items():
         if "." in k: