
Segmentation fault when running the compute_audio_cmvn.py script (alibaba-damo-academy/funasr)

Comments (2)

hicliff commented on September 23, 2024

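I set up the environment using the image provided by funasr and the script ran successfully, so the problem may be related to my own environment. My system is CentOS Linux release 7.6.1810, whereas the system in the image is Ubuntu 20.04.4 LTS.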


bigchou commented on September 23, 2024

My system environment is:
system: Ubuntu 20.04.6 LTS
funasr.version == '1.0.16'
pytorch version = 2.2.1

I also encountered this problem, but setting nj to 0 in run.sh allowed me to run CMVN successfully.

However, I then encountered other errors in stage 4:

[2024-03-17 01:39:37,527] torch.distributed.run: [WARNING]
[2024-03-17 01:39:37,527] torch.distributed.run: [WARNING] *****************************************
[2024-03-17 01:39:37,527] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[2024-03-17 01:39:37,527] torch.distributed.run: [WARNING] *****************************************
If you want to use the speaker diarization, please pip install hdbscan
If you want to use the speaker diarization, please pip install hdbscan
{'model': 'Paraformer', 'model_conf': {'ctc_weight': 0.3, 'lsm_weight': 0.1, 'length_normalized_loss': False, 'predictor_weight': 1.0, 'sampling_ratio': 0.4, 'use_1st_decoder_loss': True}, 'encoder': 'ConformerEncoder', 'encoder_conf': {'output_size': 256, 'attention_heads': 4, 'linear_units': 2048, 'num_blocks': 12, 'dropout_rate': 0.1, 'positional_dropout_rate': 0.1, 'attention_dropout_rate': 0.0, 'input_layer': 'conv2d', 'normalize_before': True, 'pos_enc_layer_type': 'rel_pos', 'selfattention_layer_type': 'rel_selfattn', 'activation_type': 'swish', 'macaron_style': True, 'use_cnn_module': True, 'cnn_module_kernel': 15}, 'decoder': 'ParaformerSANDecoder', 'decoder_conf': {'attention_heads': 4, 'linear_units': 2048, 'num_blocks': 6, 'dropout_rate': 0.1, 'positional_dropout_rate': 0.1, 'self_attention_dropout_rate': 0.0, 'src_attention_dropout_rate': 0.0}, 'predictor': 'CifPredictor', 'predictor_conf': {'idim': 256, 'threshold': 1.0, 'l_order': 1, 'r_order': 1, 'tail_threshold': 0.45}, 'frontend': 'WavFrontend', 'frontend_conf': {'fs': 16000, 'window': 'hamming', 'n_mels': 80, 'frame_length': 25, 'frame_shift': 10, 'lfr_m': 1, 'lfr_n': 1, 'cmvn_file': '../DATA/data/train/am.mvn'}, 'specaug': 'SpecAug', 'specaug_conf': {'apply_time_warp': True, 'time_warp_window': 5, 'time_warp_mode': 'bicubic', 'apply_freq_mask': True, 'freq_mask_width_range': [0, 30], 'num_freq_mask': 2, 'apply_time_mask': True, 'time_mask_width_range': [0, 40], 'num_time_mask': 2}, 'train_conf': {'accum_grad': 1, 'grad_clip': 5, 'max_epoch': 150, 'keep_nbest_models': 10, 'avg_nbest_model': 5, 'log_interval': 50}, 'optim': 'adam', 'optim_conf': {'lr': 0.0005}, 'scheduler': 'warmuplr', 'scheduler_conf': {'warmup_steps': 30000}, 'dataset': 'AudioDataset', 'dataset_conf': {'index_ds': 'IndexDSJsonl', 'batch_sampler': 'RankFullLocalShuffleBatchSampler', 'batch_type': 'example', 'batch_size': 32, 'max_token_length': 2048, 'buffer_size': 1024, 'shuffle': True, 'num_workers': 4, 'preprocessor_speech': 
'SpeechPreprocessSpeedPerturb', 'preprocessor_speech_conf': {'speed_perturb': [0.9, 1.0, 1.1]}}, 'tokenizer': 'CharTokenizer', 'tokenizer_conf': {'unk_symbol': '', 'token_list': '../DATA/data/zh_token_list/char/tokens.txt'}, 'ctc_conf': {'dropout_rate': 0.0, 'ctc_type': 'builtin', 'reduce': True, 'ignore_nan_grad': True}, 'normalize': None, 'train_data_set_list': '../DATA/data/train/audio_datasets.jsonl', 'valid_data_set_list': '../DATA/data/dev/audio_datasets.jsonl', 'output_dir': '/alghome/timmy.wan/whisper/lab/VariousLargeWhisper/FunASR/examples/aishell/paraformer/exp/baseline_paraformer_conformer_12e_6d_2048_256_zh_char_exp1'}
{'model': 'Paraformer', 'model_conf': {'ctc_weight': 0.3, 'lsm_weight': 0.1, 'length_normalized_loss': False, 'predictor_weight': 1.0, 'sampling_ratio': 0.4, 'use_1st_decoder_loss': True}, 'encoder': 'ConformerEncoder', 'encoder_conf': {'output_size': 256, 'attention_heads': 4, 'linear_units': 2048, 'num_blocks': 12, 'dropout_rate': 0.1, 'positional_dropout_rate': 0.1, 'attention_dropout_rate': 0.0, 'input_layer': 'conv2d', 'normalize_before': True, 'pos_enc_layer_type': 'rel_pos', 'selfattention_layer_type': 'rel_selfattn', 'activation_type': 'swish', 'macaron_style': True, 'use_cnn_module': True, 'cnn_module_kernel': 15}, 'decoder': 'ParaformerSANDecoder', 'decoder_conf': {'attention_heads': 4, 'linear_units': 2048, 'num_blocks': 6, 'dropout_rate': 0.1, 'positional_dropout_rate': 0.1, 'self_attention_dropout_rate': 0.0, 'src_attention_dropout_rate': 0.0}, 'predictor': 'CifPredictor', 'predictor_conf': {'idim': 256, 'threshold': 1.0, 'l_order': 1, 'r_order': 1, 'tail_threshold': 0.45}, 'frontend': 'WavFrontend', 'frontend_conf': {'fs': 16000, 'window': 'hamming', 'n_mels': 80, 'frame_length': 25, 'frame_shift': 10, 'lfr_m': 1, 'lfr_n': 1, 'cmvn_file': '../DATA/data/train/am.mvn'}, 'specaug': 'SpecAug', 'specaug_conf': {'apply_time_warp': True, 'time_warp_window': 5, 'time_warp_mode': 'bicubic', 'apply_freq_mask': True, 'freq_mask_width_range': [0, 30], 'num_freq_mask': 2, 'apply_time_mask': True, 'time_mask_width_range': [0, 40], 'num_time_mask': 2}, 'train_conf': {'accum_grad': 1, 'grad_clip': 5, 'max_epoch': 150, 'keep_nbest_models': 10, 'avg_nbest_model': 5, 'log_interval': 50}, 'optim': 'adam', 'optim_conf': {'lr': 0.0005}, 'scheduler': 'warmuplr', 'scheduler_conf': {'warmup_steps': 30000}, 'dataset': 'AudioDataset', 'dataset_conf': {'index_ds': 'IndexDSJsonl', 'batch_sampler': 'RankFullLocalShuffleBatchSampler', 'batch_type': 'example', 'batch_size': 32, 'max_token_length': 2048, 'buffer_size': 1024, 'shuffle': True, 'num_workers': 4, 'preprocessor_speech': 
'SpeechPreprocessSpeedPerturb', 'preprocessor_speech_conf': {'speed_perturb': [0.9, 1.0, 1.1]}}, 'tokenizer': 'CharTokenizer', 'tokenizer_conf': {'unk_symbol': '', 'token_list': '../DATA/data/zh_token_list/char/tokens.txt'}, 'ctc_conf': {'dropout_rate': 0.0, 'ctc_type': 'builtin', 'reduce': True, 'ignore_nan_grad': True}, 'normalize': None, 'train_data_set_list': '../DATA/data/train/audio_datasets.jsonl', 'valid_data_set_list': '../DATA/data/dev/audio_datasets.jsonl', 'output_dir': '/alghome/timmy.wan/whisper/lab/VariousLargeWhisper/FunASR/examples/aishell/paraformer/exp/baseline_paraformer_conformer_12e_6d_2048_256_zh_char_exp1'}

tables:

[2024-03-17 01:39:43,764][root][WARNING] - Using legacy_rel_pos and it will be deprecated in the future.
[2024-03-17 01:39:43,776][root][INFO] - config.yaml is saved to: /alghome/timmy.wan/whisper/lab/VariousLargeWhisper/FunASR/examples/aishell/paraformer/exp/baseline_paraformer_conformer_12e_6d_2048_256_zh_char_exp1/config.yaml
[2024-03-17 01:39:43,780][root][WARNING] - Using legacy_rel_pos and it will be deprecated in the future.
[2024-03-17 01:39:43,800][root][WARNING] - Using legacy_rel_selfattn and it will be deprecated in the future.
[2024-03-17 01:39:43,816][root][WARNING] - Using legacy_rel_selfattn and it will be deprecated in the future.
No initialize method
No initialize method
[2024-03-17 01:39:44,941][root][INFO] - total_num of samplers across ranks: 120098
[2024-03-17 01:39:44,944][root][INFO] - total_num of samplers across ranks: 120098
[2024-03-17 01:39:44,996][root][INFO] - total_num of samplers across ranks: 14326
No checkpoint found at '/alghome/timmy.wan/whisper/lab/VariousLargeWhisper/FunASR/examples/aishell/paraformer/exp/baseline_paraformer_conformer_12e_6d_2048_256_zh_char_exp1/model.pt', does not resume status!
[2024-03-17 01:39:45,000][root][INFO] - total_num of samplers across ranks: 14326
No checkpoint found at '/alghome/timmy.wan/whisper/lab/VariousLargeWhisper/FunASR/examples/aishell/paraformer/exp/baseline_paraformer_conformer_12e_6d_2048_256_zh_char_exp1/model.pt', does not resume status!

rank: 0, Training Epoch: 1: 0%| | 0/1877 [00:00<?, ?it/s]
rank: 1, Training Epoch: 1: 0%| | 0/1877 [00:00<?, ?it/s]
ERROR: Unexpected segmentation fault encountered in worker.
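
The log above ends with a bare "segmentation fault encountered in worker" message and no Python traceback, which makes it hard to tell which call crashed. One way to narrow it down is Python's standard `faulthandler` module, which dumps the Python stack when the interpreter receives a fatal signal. The snippet below is only a minimal sketch of that idea, not something taken from FunASR: `CRASHING_SNIPPET` and `run_with_faulthandler` are hypothetical names, and the crash is forced deliberately with `ctypes.string_at(0)` as a stand-in for the real failing script.

```python
import subprocess
import sys

# Hypothetical stand-in for the crashing script: reading address 0 through
# ctypes forces a segmentation fault. Swap in the real command when debugging.
CRASHING_SNIPPET = "import ctypes; ctypes.string_at(0)"


def run_with_faulthandler(code: str) -> subprocess.CompletedProcess:
    """Run `code` in a child interpreter with faulthandler enabled.

    `-X faulthandler` makes the child print its Python stack to stderr
    when it receives a fatal signal such as SIGSEGV.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


result = run_with_faulthandler(CRASHING_SNIPPET)
print("return code:", result.returncode)  # non-zero because the child crashed
print(result.stderr)  # faulthandler's report, including the Python frames
```

For a real run, the same effect can be had by setting the environment variable `PYTHONFAULTHANDLER=1` before launching the training command. Whether the dump also covers the data-loading worker processes depends on how they are started, so treat that as something to verify rather than a given.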
