
Comments (9)

AlexandderGorodetski commented on June 22, 2024

Completed.

from icefall.

csukuangfj commented on June 22, 2024

Please tell us the exact command you are using and the duration of your test wave.
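For anyone answering the duration question, a minimal sketch using only Python's standard `wave` module is shown here. The function name `wav_duration_seconds` is hypothetical, and the sketch assumes an uncompressed PCM WAV file.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds.

    Hypothetical helper for illustration; not part of icefall.
    """
    with wave.open(path, "rb") as wf:
        # Duration = number of audio frames / frames per second.
        return wf.getnframes() / wf.getframerate()
```

Calling it with the path of the test wave reports the length to quote in the issue.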


AlexandderGorodetski commented on June 22, 2024

The test waves are 30 seconds long.
I use the following command for decoding.

export PYTHONPATH="/workspace/inputs/alexg/asr/src/models/k2_2024/icefall/egs/tedlium3/ASR/zipformer:/workspace/inputs/alexg/asr/src/models/k2_2024/icefall:$PYTHONPATH"

export CUDA_VISIBLE_DEVICES="0"

python ./zipformer/onnx_pretrained.py \
  --encoder-model-filename zipformer/exp/encoder-epoch-50-avg-1.onnx \
  --decoder-model-filename zipformer/exp/decoder-epoch-50-avg-1.onnx \
  --joiner-model-filename zipformer/exp/joiner-epoch-50-avg-1.onnx \
  --tokens data/lang_bpe_500/tokens.txt \
  /workspace/inputs/alexg/asr/src/projects/en_eval/input/test_1.wav


csukuangfj commented on June 22, 2024

Could you try a shorter wave, e.g., shorter than 10 or 20 seconds?


AlexandderGorodetski commented on June 22, 2024

Great.
Well done.

For 10 seconds, the ONNX decoder works properly. Is it possible to add support for 30 seconds, or should I update my VAD so that it does not produce segments longer than 10 seconds?
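The VAD-side option can be sketched as a post-processing step that caps segment length. The function `split_segment` below is hypothetical and not part of any VAD library; it assumes segments are given as start and end times in seconds and simply cuts an over-long segment into equal pieces.

```python
import math
from typing import List, Tuple


def split_segment(
    start: float, end: float, max_duration: float
) -> List[Tuple[float, float]]:
    """Split [start, end] into equal pieces no longer than max_duration.

    Hypothetical helper for illustration only.
    """
    if max_duration <= 0:
        raise ValueError("max_duration must be positive")
    if end <= start:
        return []
    # Smallest number of pieces that keeps each one within the limit.
    n = math.ceil((end - start) / max_duration)
    step = (end - start) / n
    pieces = []
    for i in range(n):
        piece_start = start + i * step
        # Use the exact end for the last piece to avoid float drift.
        piece_end = end if i == n - 1 else start + (i + 1) * step
        pieces.append((piece_start, piece_end))
    return pieces
```

Cutting at equal intervals can split a word in the middle, so a real pipeline would probably prefer to cut at low-energy points near those boundaries.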


csukuangfj commented on June 22, 2024

Is it possible to add support to 30 sec

Yes, absolutely.

Please change


to a larger value, re-export your model, and re-try.


AlexandderGorodetski commented on June 22, 2024

I changed the value from 1,000 to 10,000 and it did not help.

I found that the maximum duration I can work with is 20 seconds. At 21 seconds I already get an error.

Maybe I should change this value during training, or maybe it is saved somewhere in the model?


csukuangfj commented on June 22, 2024

There must be a constant somewhere in the code that sets the length of a positional encoding vector. You need to find and change it.


AlexandderGorodetski commented on June 22, 2024

You are right. max_len currently corresponds to 20 seconds of audio. It can be increased to 2,000, which raises the maximum duration to 40 seconds. It is important to make this change BEFORE exporting the ONNX model.

Thank you so much, this issue can be closed.
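The numbers reported above (a max_len of 2,000 giving 40 seconds) suggest roughly 50 positions per second of audio. The sketch below only encodes that inferred ratio. The constant `POSITIONS_PER_SECOND` and both function names are assumptions made for illustration; they are not taken from the icefall code, and the real relationship may differ.

```python
import math

# Assumption inferred from the thread: max_len = 2000 <-> 40 seconds.
POSITIONS_PER_SECOND = 50


def max_duration_seconds(max_len: int) -> float:
    """Longest audio (in seconds) a given max_len would cover."""
    return max_len / POSITIONS_PER_SECOND


def required_max_len(duration_seconds: float) -> int:
    """Smallest max_len that would cover the given duration."""
    return math.ceil(duration_seconds * POSITIONS_PER_SECOND)
```

Under this assumption, 30-second segments would need a max_len of at least 1,500, set before the model is re-exported to ONNX.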

