Comments (4)
Another question: we also have a machine cluster, but the machines cannot set a device number.
So, should I comment out the following lines of code in decode.py?
if torch.cuda.is_available():
    device = torch.device("cuda", 0)
from icefall.
I met the following error; I changed --max-duration, but there are still errors:
There are several things you can do:
(1) Change to a GPU with larger RAM, e.g., 32 GB.
(2) Use a decoding method that does not involve an LM, i.e., use --method 1best
(3) Change
icefall/egs/librispeech/ASR/tdnn_lstm_ctc/decode.py, lines 423 to 424 (at commit adb068e)
to
    G = k2.arc_sort(G)
    G = k2.Fsa.from_fsas([G]).to(device)
I assume it will not cause OOM errors in later decoding steps.
(4) Prune your G. You can use the script from kaldi-asr/kaldi#4594 to prune it.
(Note: it is a single Python script with no dependencies on Kaldi.)
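The rationale behind change (3) can be sketched without k2: do the memory-heavy preprocessing on the CPU and move only the finished object to the GPU in one transfer, instead of moving the raw data first and processing it on the GPU. The tensor below is a hypothetical stand-in for G, not the actual 4-gram FST:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the raw G arcs; in decode.py this is the 4-gram LM FST.
x = torch.randn(1_000_000)

# Do the heavy preprocessing on the CPU (k2.arc_sort in decode.py)...
x, _ = torch.sort(x)

# ...then move the finished object to the GPU in a single transfer, so the
# GPU never has to hold intermediate copies from the preprocessing step.
x = x.to(device)
```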
from icefall.
should I comment out the following lines of code in decode.py:
if torch.cuda.is_available():
    device = torch.device("cuda", 0)
Can you use
    device = torch.device("cuda")
to select your default CUDA device?
If you use the CPU, decoding is going to be slow.
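A minimal sketch of the suggested device selection (plain PyTorch, no icefall-specific code):

```python
import torch

# torch.device("cuda") refers to the current default CUDA device instead of
# hard-coding index 0, so the actual GPU can be chosen externally, e.g. via
# the CUDA_VISIBLE_DEVICES environment variable on each cluster machine.
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

print(device.type)
```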
from icefall.
I tried method (3), but there are still errors:
2021-10-05 00:00:07,427 INFO [decode.py:387] Decoding started
2021-10-05 00:00:07,427 INFO [decode.py:388] {'exp_dir': PosixPath('tdnn_lstm_ctc/exp'), 'lang_dir': PosixPath('data/lang_phone'), 'lm_dir': PosixPath('data/lm'), 'feature_dim': 80, 'subsampling_factor': 3, 'search_beam': 20, 'output_beam': 5, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'epoch': 19, 'avg': 5, 'method': 'whole-lattice-rescoring', 'num_paths': 100, 'nbest_scale': 0.5, 'export': False, 'full_libri': True, 'feature_dir': PosixPath('data/fbank'), 'max_duration': 100, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2}
2021-10-05 00:00:07,947 INFO [lexicon.py:113] Loading pre-compiled data/lang_phone/Linv.pt
2021-10-05 00:00:08,310 INFO [decode.py:397] device: cuda
2021-10-05 00:00:46,069 INFO [decode.py:410] Loading G_4_gram.fst.txt
2021-10-05 00:00:46,070 WARNING [decode.py:411] It may take 8 minutes.
Traceback (most recent call last):
File "./tdnn_lstm_ctc/decode.py", line 497, in <module>
main()
File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
return func(*args, **kwargs)
File "./tdnn_lstm_ctc/decode.py", line 435, in main
G = k2.add_epsilon_self_loops(G)
File "/opt/conda/lib/python3.8/site-packages/k2-1.8.dev20210918+cuda11.0.torch1.7.1-py3.8-linux-x86_64.egg/k2/fsa_algo.py", line 499, in add_epsilon_self_loops
ragged_arc, arc_map = _k2.add_epsilon_self_loops(fsa.arcs,
RuntimeError: CUDA out of memory. Tried to allocate 4.73 GiB (GPU 0; 15.78 GiB total capacity; 9.21 GiB already allocated; 3.90 GiB free; 10.85 GiB reserved in total by PyTorch)
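Plugging the numbers from the error message into a quick check shows why the allocation fails and why a larger GPU (suggestion (1)) should help:

```python
# All figures in GiB, taken from the CUDA OOM message above.
requested = 4.73   # "Tried to allocate 4.73 GiB"
free      = 3.90   # "3.90 GiB free"

# The request exceeds the remaining free memory by about 0.83 GiB.
shortfall = requested - free
print(f"short by {shortfall:.2f} GiB")  # short by 0.83 GiB

# A 32 GiB card would roughly double the capacity (15.78 GiB today),
# leaving ample headroom for this allocation.
```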
I think I should try option (1).
from icefall.