Comments (4)
Another question: we also have a machine cluster, but the machines cannot set a device number.
So, should I comment out the following lines of code in decode.py?
if torch.cuda.is_available():
    device = torch.device("cuda", 0)
from icefall.
I met the following error; I changed --max-duration, but there are still errors:
There are several things you can do:
(1) Change to a GPU with larger RAM, e.g., 32 GB.
(2) Use a decoding method that does not involve an LM, i.e., use --method 1best
(3) Change
icefall/egs/librispeech/ASR/tdnn_lstm_ctc/decode.py, lines 423 to 424 (at commit adb068e)
to
    G = k2.arc_sort(G)
    G = k2.Fsa.from_fsas([G]).to(device)
I assume it will not cause OOM errors in later decoding steps.
(4) Prune your G. You can use the script from kaldi-asr/kaldi#4594 to prune it.
(Note: it is a single Python script with no dependencies on Kaldi.)
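The rationale behind change (3) can be sketched without k2: do the memory-heavy preprocessing on the CPU and move only the finished object to the GPU in one transfer, instead of moving the raw data first and processing it on the GPU. The tensor below is a hypothetical stand-in for G, not the actual 4-gram FST:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the raw G arcs; in decode.py this is the 4-gram LM FST.
x = torch.randn(1_000_000)

# Do the heavy preprocessing on the CPU (k2.arc_sort in decode.py)...
x, _ = torch.sort(x)

# ...then move the finished object to the GPU in a single transfer, so the
# GPU never has to hold intermediate copies from the preprocessing step.
x = x.to(device)
```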
from icefall.
should I comment out the following lines of code in decode.py:
if torch.cuda.is_available():
    device = torch.device("cuda", 0)
Can you use
    device = torch.device("cuda")
to select your default CUDA device?
If you use the CPU, decoding is going to be slow.
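A minimal sketch of the suggested device selection (plain PyTorch, no icefall-specific code):

```python
import torch

# torch.device("cuda") refers to the current default CUDA device instead of
# hard-coding index 0, so the actual GPU can be chosen externally, e.g. via
# the CUDA_VISIBLE_DEVICES environment variable on each cluster machine.
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

print(device.type)
```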
from icefall.
I tried method (3), but there are still errors:
2021-10-05 00:00:07,427 INFO [decode.py:387] Decoding started
2021-10-05 00:00:07,427 INFO [decode.py:388] {'exp_dir': PosixPath('tdnn_lstm_ctc/exp'), 'lang_dir': PosixPath('data/lang_phone'), 'lm_dir': PosixPath('data/lm'), 'feature_dim': 80, 'subsampling_factor': 3, 'search_beam': 20, 'output_beam': 5, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'epoch': 19, 'avg': 5, 'method': 'whole-lattice-rescoring', 'num_paths': 100, 'nbest_scale': 0.5, 'export': False, 'full_libri': True, 'feature_dir': PosixPath('data/fbank'), 'max_duration': 100, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2}
2021-10-05 00:00:07,947 INFO [lexicon.py:113] Loading pre-compiled data/lang_phone/Linv.pt
2021-10-05 00:00:08,310 INFO [decode.py:397] device: cuda
2021-10-05 00:00:46,069 INFO [decode.py:410] Loading G_4_gram.fst.txt
2021-10-05 00:00:46,070 WARNING [decode.py:411] It may take 8 minutes.
Traceback (most recent call last):
File "./tdnn_lstm_ctc/decode.py", line 497, in <module>
main()
File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
return func(*args, **kwargs)
File "./tdnn_lstm_ctc/decode.py", line 435, in main
G = k2.add_epsilon_self_loops(G)
File "/opt/conda/lib/python3.8/site-packages/k2-1.8.dev20210918+cuda11.0.torch1.7.1-py3.8-linux-x86_64.egg/k2/fsa_algo.py", line 499, in add_epsilon_self_loops
ragged_arc, arc_map = _k2.add_epsilon_self_loops(fsa.arcs,
RuntimeError: CUDA out of memory. Tried to allocate 4.73 GiB (GPU 0; 15.78 GiB total capacity; 9.21 GiB already allocated; 3.90 GiB free; 10.85 GiB reserved in total by PyTorch)
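Plugging the numbers from the error message into a quick check shows why the allocation fails and why a larger GPU (suggestion (1)) should help:

```python
# All figures in GiB, taken from the CUDA OOM message above.
requested = 4.73   # "Tried to allocate 4.73 GiB"
free      = 3.90   # "3.90 GiB free"

# The request exceeds the remaining free memory by about 0.83 GiB.
shortfall = requested - free
print(f"short by {shortfall:.2f} GiB")  # short by 0.83 GiB

# A 32 GiB card would roughly double the capacity (15.78 GiB today),
# leaving ample headroom for this allocation.
```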
I think I should try option (1).
from icefall.