Comments (11)
@tz301 Thanks, will update the documents.
@pkufool I have reproduced the results, excellent!
One suggestion: the validation loss is missing. Because each epoch contains so little data, the validation loss computation is skipped. In my reproduction I changed the valid log interval to 1000, and the valid loss then shows up on TensorBoard below. The CER is nearly the same (4.24%) as on master (4.26%).
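The skipped validation described above can be illustrated with a small sketch. It assumes validation is triggered whenever a per-epoch batch counter is a multiple of the interval; the function name `validation_batches` and the numbers (1500 batches per epoch, an original interval of 3000) are hypothetical and not taken from the recipe.

```python
def validation_batches(batches_per_epoch, valid_interval):
    """Return the batch indices (1-based, reset every epoch) at which
    validation would run, assuming it fires when the index is a
    multiple of valid_interval. This trigger rule is an assumption."""
    return [
        b for b in range(1, batches_per_epoch + 1)
        if b % valid_interval == 0
    ]


# Hypothetical numbers: a short epoch of 1500 batches.
print(validation_batches(1500, 3000))  # interval longer than the epoch
print(validation_batches(1500, 1000))  # interval shorter than the epoch
```

Under these assumptions, an interval longer than the epoch never fires, while an interval of 1000 fires once per epoch, which would match the behaviour reported in the comment.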
from icefall.
There is an ongoing PR for the AISHELL dataset. Will be ready soon, I think. @pkufool
from icefall.
there is a PR #123
from icefall.
there is a PR #123
thanks~
from icefall.
There is an ongoing PR for the AISHELL dataset. Will be ready soon, I think. @pkufool
thanks~
from icefall.
It has been merged. You can find it at https://github.com/k2-fsa/icefall/tree/master/egs
from icefall.
That's very cool work.
When I followed the aishell document to reproduce it, I found that my att loss had an obvious mismatch (much higher) with that in the tutorial docs (the ctc loss is OK). I found that this was caused by the change to the LabelSmoothing algorithm in #109. Maybe @pkufool can help update the docs? I was a little confused at first glance, so other new users may be too.
Also, it would be great to have some scripts that deal with both Chinese and English. Maybe combine BPE for English and chars for Chinese? Chinese mixed with English (code-switching) is so common today!
from icefall.
When I followed the aishell document to reproduce it, I found that my att loss had an obvious mismatch (much higher) with that in the tutorial docs (the ctc loss is OK)
Please see #107
Before the change, the label smoothing loss had an additional term
q * log(q)
which is negative since q is in the range (0, 1). The loss before the change was therefore smaller, but this should not affect your final WER.
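To make the size of that term concrete, here is a minimal pure-Python sketch. It assumes the earlier loss had the KL-divergence form sum(q * (log(q) - log(p))) and the later one the cross-entropy form -sum(q * log(p)), and it assumes the smoothing mass is spread evenly over the non-target classes. The helper names, logits, and smoothing value are hypothetical and are not taken from the icefall code.

```python
import math


def smoothed_targets(num_classes, target, smoothing):
    """Smoothed distribution q: 1 - smoothing on the target class,
    the rest shared evenly by the other classes (an assumption)."""
    off_value = smoothing / (num_classes - 1)
    q = [off_value] * num_classes
    q[target] = 1.0 - smoothing
    return q


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def cross_entropy_loss(q, log_p):
    """-sum(q * log(p)): the form without the q * log(q) term."""
    return -sum(qi * lpi for qi, lpi in zip(q, log_p))


def kl_div_loss(q, log_p):
    """sum(q * (log(q) - log(p))): the form with the q * log(q) term."""
    return sum(qi * (math.log(qi) - lpi) for qi, lpi in zip(q, log_p))


# Hypothetical example: 4 classes, target class 0, smoothing 0.1.
log_p = log_softmax([2.0, 0.5, -1.0, 0.1])
q = smoothed_targets(4, target=0, smoothing=0.1)

ce = cross_entropy_loss(q, log_p)
kl = kl_div_loss(q, log_p)
extra = sum(qi * math.log(qi) for qi in q)  # the additional term

print(f"cross-entropy form: {ce:.4f}")
print(f"KL form:            {kl:.4f}")
print(f"sum(q * log(q)):    {extra:.4f}")
```

The extra term depends only on q, not on the model output, so the two forms differ by a constant for a fixed target distribution and have the same gradients with respect to the logits. That is consistent with the reported loss value changing while the final WER stays the same.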
from icefall.
@tz301 Thanks, will update the documents.
from icefall.
Also, it would be great to have some scripts that deal with both Chinese and English. Maybe combine BPE for English and chars for Chinese? Chinese mixed with English (code-switching) is so common today!
Thanks, that is a good idea; we will consider adding a code-switching recipe. You are also very welcome to contribute this recipe, if you have time.
from icefall.