Comments (7)

JinZr commented on August 23, 2024

from icefall.

csukuangfj commented on August 23, 2024

Please see

if self.args.on_the_fly_feats:
    # NOTE: the PerturbSpeed transform should be added only if we
    # remove it from data prep stage.
    # Add on-the-fly speed perturbation; since originally it would
    # have increased epoch size by 3, we will apply prob 2/3 and use
    # 3x more epochs.
    # Speed perturbation probably should come first before
    # concatenation, but in principle the transforms order doesn't have
    # to be strict (e.g. could be randomized)
    # transforms = [PerturbSpeed(factors=[0.9, 1.1], p=2/3)] + transforms # noqa
    # Drop feats to be on the safe side.
    train = K2SpeechRecognitionDataset(
        cut_transforms=transforms,
        input_strategy=OnTheFlyFeatures(
            FbankConfig(sampling_rate=8000, num_mel_bins=23)
        ),
        return_cuts=self.args.return_cuts,
    )

You need to:

  1. Pass --on-the-fly-feats=true to train.py.

  2. Uncomment the following line:
    # transforms = [PerturbSpeed(factors=[0.9, 1.1], p=2/3)] + transforms # noqa
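
The `p=2/3` in that line can be sanity-checked with a small simulation. This is only a sketch: it assumes the transform applies one uniformly chosen factor with probability `p` and leaves the cut unchanged otherwise. `sample_speed` and `speed_fractions` are hypothetical stand-ins written for this illustration, not Lhotse code.

```python
import random
from collections import Counter


def sample_speed(rng, factors=(0.9, 1.1), p=2 / 3):
    """Hypothetical stand-in for one on-the-fly speed perturbation draw."""
    if rng.random() < p:
        # Perturb: pick one of the factors uniformly at random.
        return rng.choice(factors)
    # Otherwise keep the original speed.
    return 1.0


def speed_fractions(num_draws=30000, seed=0):
    """Estimate how often each speed is seen over many draws."""
    rng = random.Random(seed)
    counts = Counter(sample_speed(rng) for _ in range(num_draws))
    return {speed: n / num_draws for speed, n in sorted(counts.items())}


fractions = speed_fractions()
for speed, frac in fractions.items():
    print(f"speed {speed}: {frac:.3f}")
```

Under that assumption each of the three speeds (0.9, 1.0, 1.1) is seen roughly a third of the time, which is consistent with the code comment pairing `p=2/3` with 3x more epochs to match speed perturbation done at the data prep stage.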

littlecuoge commented on August 23, 2024

Ah, I meant reverberation with an impulse response, not speed perturbation. Thank you @JinZr, please share your doc.

littlecuoge commented on August 23, 2024

I tried to put RIR first in transforms, like this:

transforms.append(
    ReverbWithImpulseResponse(p=0.5)
)

But got an error:

-- Process 3 terminated with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/icefall/egs/easy_start/ASR/zipformer/train.py", line 1265, in run
    train_one_epoch(
  File "/icefall/egs/easy_start/ASR/zipformer/train.py", line 941, in train_one_epoch
    for batch_idx, batch in enumerate(train_dl):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 442, in __iter__
    return self._get_iterator()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 388, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1043, in __init__
    w.start()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/lib/python3.10/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'module' object

Not sure why?

pzelasko commented on August 23, 2024

Looks like a bug in Lhotse, will fix. You can probably solve this by setting env var LHOTSE_DILL_ENABLED=1 or using the cuts = cuts.reverb_rir() API.
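
For context, the `TypeError` in the traceback is raised by the standard pickler that spawn-based worker processes use to receive their arguments. The sketch below reproduces the same class of error with the standard library only. It is not Lhotse's code: `bad_state` and `good_state` are hypothetical stand-ins for a transform's attributes, and whether a stored module reference is the exact cause inside `ReverbWithImpulseResponse` is an assumption.

```python
import importlib
import math
import pickle


def try_pickle(obj):
    """Return the pickled bytes, or the error message if pickling fails."""
    try:
        return pickle.dumps(obj)
    except TypeError as exc:
        return f"TypeError: {exc}"


# State that holds a live module object, as a transform attribute might.
bad_state = {"p": 0.5, "backend": math}

# Same state, but storing only the module's name so it can be re-imported.
good_state = {"p": 0.5, "backend": "math"}

bad_result = try_pickle(bad_state)    # fails: module objects are not picklable
good_result = try_pickle(good_state)  # succeeds: only plain data is stored

# On the receiving side, the module is looked up again by name.
restored = pickle.loads(good_result)
backend = importlib.import_module(restored["backend"])

print(bad_result)
print(backend is math)
```

Storing a name instead of the module object is one common way to keep such state picklable. Switching to a more permissive pickler is another, which is presumably what the `LHOTSE_DILL_ENABLED=1` suggestion relies on.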

littlecuoge commented on August 23, 2024

@pzelasko Thanks for your reply. I tried LHOTSE_DILL_ENABLED=1 and it seems to work, but it takes about 10 minutes to train 50 batches. I added RIR and MUSAN noise at the same time, and it still takes too much time. What do you think?

pzelasko commented on August 23, 2024

Was it faster without RIR or MUSAN? What’s the number of data loading workers and max duration?
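
One way to answer the first question is to time the data loader on its own, with no forward or backward pass, once per configuration (no augmentation, RIR only, MUSAN only, both). A minimal sketch follows. `time_batches` and `fake_loader` are hypothetical helpers written for this illustration; in practice the training data loader would be passed in place of the fake one.

```python
import time


def time_batches(loader, num_batches=50):
    """Iterate over `loader` without training and report the time spent."""
    start = time.perf_counter()
    seen = 0
    for _ in loader:
        seen += 1
        if seen >= num_batches:
            break
    elapsed = time.perf_counter() - start
    return seen, elapsed


def fake_loader(num_batches, delay_s):
    """Hypothetical stand-in for a DataLoader: sleeps, then yields a batch."""
    for i in range(num_batches):
        time.sleep(delay_s)  # simulates per-batch loading and augmentation cost
        yield {"batch_idx": i}


seen, elapsed = time_batches(fake_loader(num_batches=10, delay_s=0.01), num_batches=5)
print(f"{seen} batches in {elapsed:.2f}s ({elapsed / seen:.3f}s per batch)")
```

If the per-batch time measured this way is close to the per-batch time seen during training, the slowdown is in data loading rather than in the model, and the number of workers and the augmentations are the first things to vary.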
