Comments (1)
That's nice!
For each doubling of parameter size, I would probably:
- Increase the --feedforward-dim and --encoder-dim values by around a factor of sqrt(2), e.g. increase 512,768,1024,1536,1024,768 to 768,1024,1536,2048,1536,768. This is the thing we mostly change when we change the model size.
- You may also want to increase --query-head-dim and --value-head-dim a bit for the larger models. The --num-heads can be increased a little as well, but don't increase it too much, since more heads increase memory requirements when sequences are long. In other words, increase these by less than the encoder and feedforward dims.
- --joiner-dim and --decoder-dim could perhaps be increased slightly, probably to not much more than 768, if you are using RNN-T and want a strong joiner.
- I also recommend increasing --num-encoder-layers very slightly; it's currently 2,2,3,4,3,2. Don't increase it by as much as you increase the dimensions, though.
- You probably don't need to increase --encoder-unmasked-dim.
- You could also increase --pos-dim and --pos-head-dim slightly, on the general principle of scaling everything, although this will probably make little difference.
- If you are feeling brave and have time to experiment, you could also try adding a central, more-downsampled stack, e.g. change "1,2,4,8,4,2" to "1,2,4,8,16,8,4,2". That will require changing the other comma-separated dims according to the patterns you can see. The good thing about this is that it requires very little extra memory during training but increases the parameter count by a lot. Caution: in our 1000h Librispeech setup this made the results worse, but I suspect that was overfitting.
from icefall.