Comments (2)
Your learning rate is very high; start with 3e-4 for fine-tuning. The model might have exploded to NaNs. Could you switch to the FastConformer architecture instead of Conformer? FastConformer is quick to train. Start with fp32, then move to precision 16 once your training setup is working and the curves look normal.
FastConformer configs: https://github.com/NVIDIA/NeMo/tree/main/examples/asr/conf/fastconformer
Thanks for replying. I actually changed the precision from 16 to 32 and it solved my problem.
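For anyone following the same advice, here is a minimal sketch of the two-stage setup (fp32 first, then precision 16) together with a check for where a loss curve first goes non-finite. The override keys `model.optim.lr` and `trainer.precision` are assumed to match your config layout, and both helper functions are hypothetical names used only for illustration, so check them against your own YAML before relying on them.

```python
import math


def finetune_overrides(lr=3e-4, precision="32"):
    """Build Hydra-style overrides for a fine-tuning run.

    The key names are an assumption about the config layout;
    adjust them if your YAML uses different paths.
    """
    return [f"model.optim.lr={lr}", f"trainer.precision={precision}"]


def first_non_finite(losses):
    """Return the index of the first NaN/inf loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Stage 1: fp32 until the setup works and the curves look normal.
stage1 = finetune_overrides(precision="32")
# Stage 2: same learning rate, reduced precision.
stage2 = finetune_overrides(precision="16")

print(" ".join(stage1))
print(" ".join(stage2))
```

The printed strings can be appended to whatever training command you already use. Passing logged loss values to `first_non_finite` gives a quick way to confirm whether a run diverged and at which step.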