bmcfee / muda
A library for augmenting annotated audio data
License: ISC License
Pro:
Con:
The time stretcher should also apply to the time and duration Annotation object fields introduced in jams 0.2.1.
I'm trying to use the LinearPitchShift deformer on a tag_open JAMS annotation. Code looks something like this:
import muda

audiopath = '101415-3-0-2.wav'
jamspath = '101415-3-0-2.jams'
jorig = muda.load_jam_audio(jamspath, audiopath)
pitch = muda.deformers.LinearPitchShift(n_samples=3, lower=-1, upper=1)
jpitch = []
for j in pitch.transform(jorig):
    jpitch.append(j)
If I try to do the same with the DRC deformer it seems to work OK, but with the pitch deformer I get:
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-48-1c0aa6d965af> in <module>()
1 jpitch = []
----> 2 for j in pitch.transform(jorig):
3 jpitch.append(j)
/usr/local/lib/python2.7/site-packages/muda/base.pyc in transform(self, jam)
141
142 for state in self.states(jam):
--> 143 yield self._transform(jam, state)
144
145 @property
/usr/local/lib/python2.7/site-packages/muda/base.pyc in _transform(self, jam, state)
109
110 if hasattr(self, 'audio'):
--> 111 self.audio(jam_w.sandbox.muda, state)
112
113 if hasattr(self, 'metadata'):
/usr/local/lib/python2.7/site-packages/muda/deformers/pitch.pyc in audio(mudabox, state)
75 mudabox._audio['y'] = pyrb.pitch_shift(mudabox._audio['y'],
76 mudabox._audio['sr'],
---> 77 state['n_semitones'])
78
79 @staticmethod
/usr/local/lib/python2.7/site-packages/pyrubberband/pyrb.pyc in pitch_shift(y, sr, n_steps, rbargs)
163 rbargs.setdefault('--pitch', n_steps)
164
--> 165 return __rubberband(y, sr, **rbargs)
/usr/local/lib/python2.7/site-packages/pyrubberband/pyrb.pyc in __rubberband(y, sr, **kwargs)
64 arguments.extend([infile, outfile])
65
---> 66 subprocess.check_call(arguments)
67
68 # Load the processed audio.
/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
533 check_call(["ls", "-l"])
534 """
--> 535 retcode = call(*popenargs, **kwargs)
536 if retcode:
537 cmd = kwargs.get("args")
/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in call(*popenargs, **kwargs)
520 retcode = call(["ls", "-l"])
521 """
--> 522 return Popen(*popenargs, **kwargs).wait()
523
524
/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
708 p2cread, p2cwrite,
709 c2pread, c2pwrite,
--> 710 errread, errwrite)
711 except Exception:
712 # Preserve original exception in case os.close raises.
/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
1325 raise
1326 child_exception = pickle.loads(data)
-> 1327 raise child_exception
1328
1329
OSError: [Errno 2] No such file or directory
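An `OSError: [Errno 2]` raised from inside `subprocess.check_call` usually means that the program being launched could not be found, rather than that an audio file is missing. In this traceback the call comes from pyrubberband, which shells out to a command-line tool. A minimal sketch for checking this before running a deformer, assuming the executables are called `rubberband` and `sox` on your system (the helper name `missing_tools` is made up for illustration and is not part of muda):

```python
import shutil


def missing_tools(names=("rubberband", "sox")):
    """Return the subset of `names` that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools were found.")
```

If a tool is reported as missing, installing it through the system package manager and re-running the check is a reasonable first step.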
Our dependencies and tests are a bit out of date at this point, and we should do the following:
This is a regression introduced by #70 -- randomized deformers now have a random state field (deformer.rng) which is of type RandomState. This type is not JSON-serializable. When we go to save the output of a deformation (via JAMS), the encoder fails.
The easiest fix here is probably something like the following:
- Rename rng fields to _rng so that JAMS skips them during serialization.
- [...] rng that is serialized. Raise a warning about serialization if the input is not serializable.
Since this breaks the object model a little bit, we'll need to bump the version number.
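The underscore convention described above can be sketched roughly as follows. This is only an illustration: `Deformer`, `public_params` and `is_json_serializable` are hypothetical names, and the standard library's `random.Random` stands in for numpy's `RandomState`, which is likewise rejected by the JSON encoder.

```python
import json
import random


class Deformer:
    """Toy deformer holding one plain parameter and one random generator."""

    def __init__(self, n_samples=3):
        self.n_samples = n_samples
        # Leading underscore: treated as private and left out of the output.
        self._rng = random.Random()  # stand-in for numpy's RandomState

    def public_params(self):
        """Collect only the attributes that do not start with an underscore."""
        return {k: v for k, v in vars(self).items() if not k.startswith("_")}


def is_json_serializable(value):
    """Return True if json.dumps accepts `value`, False otherwise."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True
```

With this layout, `json.dumps(Deformer().public_params())` succeeds, whereas trying to encode the generator object itself fails.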
for specifying deformation pipelines and parameters
Analogous to pipeline, but allows fan-out of a deformation over a container of deformations.
MUDA relies heavily on external command line libraries such as rubberband and sox (lightly wrapped in pyrubberband and pysox) for core deformations such as time-stretch, pitch-shift and drc. These system library wrappers work by writing the transformed signal to disk and then reading it back from disk into memory (presumably to feed an ML algorithm).
The external system call and particularly the additional read-write step introduce a large overhead in highly distributed/multithreaded out-of-core data pipelines. Would it not make sense to either a) allow an option to do an analogous deformation using an in-memory Python library (for example librosa) or b) replace the external system call altogether with an in-memory transformation?
Hello,
I installed muda 0.2.0 successfully, but when I import it I get the following error:
>>> import muda
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/muda/__init__.py", line 5, in <module>
    from . import deformers
  File "/usr/local/lib/python2.7/dist-packages/muda/deformers/__init__.py", line 7, in <module>
    from .pitch import *
  File "/usr/local/lib/python2.7/dist-packages/muda/deformers/pitch.py", line 6, in <module>
    import librosa
  File "/usr/local/lib/python2.7/dist-packages/librosa/__init__.py", line 12, in <module>
    from . import core
  File "/usr/local/lib/python2.7/dist-packages/librosa/core/__init__.py", line 104, in <module>
    from .time_frequency import * # pylint: disable=wildcard-import
  File "/usr/local/lib/python2.7/dist-packages/librosa/core/time_frequency.py", line 10, in <module>
    from ..util.exceptions import ParameterError
  File "/usr/local/lib/python2.7/dist-packages/librosa/util/__init__.py", line 67, in <module>
    from .utils import * # pylint: disable=wildcard-import
  File "/usr/local/lib/python2.7/dist-packages/librosa/util/utils.py", line 111, in <module>
    def valid_audio(y, mono=True):
  File "/usr/local/lib/python2.7/dist-packages/librosa/cache.py", line 49, in wrapper
    if self.cachedir is not None and self.level >= level:
  File "/usr/local/lib/python2.7/dist-packages/joblib/memory.py", line 847, in cachedir
    DeprecationWarning, stacklevel=2)
TypeError: expected string or buffer
How can I fix it?
I just pip-installed muda into my virtualenv, and drc_presets.json was missing. I had a quick glimpse into setup.py, and it seems to be omitted. One option is to specify the package_data option, see http://stackoverflow.com/questions/1612733/including-non-python-files-with-setup-py
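A minimal sketch of what the package_data suggestion could look like. The package list and the location of the JSON file inside the package are assumptions made for illustration; the real layout of the repository may differ.

```python
# Keyword arguments that would be passed to setuptools.setup().
# The package names and the data path below are assumed, not taken
# from the actual setup.py.
SETUP_KWARGS = dict(
    name="muda",
    packages=["muda", "muda.deformers"],
    # Ship non-Python files that live inside the package directory.
    # Paths are relative to the package named by the dictionary key.
    package_data={"muda": ["deformers/data/*.json"]},
)

# In setup.py this would be used as:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

After reinstalling, the presence of the file in site-packages can be checked by hand.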
Following Ich's comments offline: for full reproducibility, we should include muda.version in the output jams muda sandbox.
While adding tests for #40, I noticed what seem to be a couple of errors in test_deformers.py.
In test_linear_pitchshift, shouldn't assert lower <= d_tones <= 2.0**upper be assert lower <= d_tones <= upper?
Also, in __test_effect, I think you intend for there to be an assert there, correct? i.e. ann_orig == ann_new -> assert ann_orig == ann_new
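The second point is easy to overlook, so here is a small standalone illustration (the function names are made up): a bare comparison is evaluated and thrown away, so a test containing it can never fail, while the same comparison behind `assert` raises when the values differ.

```python
def check_without_assert(ann_orig, ann_new):
    # The result of the comparison is computed and then discarded.
    ann_orig == ann_new
    return "passed"


def check_with_assert(ann_orig, ann_new):
    # Raises AssertionError when the two values are not equal.
    assert ann_orig == ann_new
    return "passed"
```

Calling `check_without_assert(1, 2)` returns normally, whereas `check_with_assert(1, 2)` raises.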
#1 lists several deformations left to implement here, and we currently rely on some ad-hoc constellation of backends to implement what we have now.
It looks like pedalboard is on track to cover a large swath of the functionality we need from the audio side. Switching our backend processing over will simplify things, make it easier to cover more transformations, and make the whole project a bit more maintainable.
In the short term, it ought to be an easy switch. The only thing to be careful of is the parallel implementation of annotation transformations to match the audio.
In the longer term, it might be worth doing some kind of deferred processing of audio instead of making intermediate copies of the signal. The present implementation doesn't support this, but it could be more efficient if we provide some kind of lazy evaluation / pedalboard constructor that only generates audio when necessary.
Hi Brian, it's me again. I tried to use dynamic range compression to deform some audio signals and ran into an error. Could you help me figure it out? Thanks!
import glob
import os

import jams
import muda

def augment_features(parent_dir, sub_dirs, file_ext="*.wav"):
    for l, sub_dir in enumerate(sub_dirs):
        for fn in glob.glob(os.path.join(parent_dir, sub_dir, file_ext)):
            name = fn.split('/')[2].split('.')[0]
            jam = jams.load('audio/7061-6-0-0_bgnoise0.jams')
            muda.load_jam_audio(jam, fn)
            # pitch shift 1
            pitch1 = muda.deformers.LinearPitchShift(n_samples=4, lower=-2, upper=2)
            for i, jam_out in enumerate(pitch1.transform(jam)):
                muda.save('audio1/test/'+name+'_p1_{:02d}.wav'.format(i), 'audio1/test/'+name+'_p1_{:02d}.jams'.format(i), jam_out)
            # pitch shift 2
            pitch2 = muda.deformers.LinearPitchShift(n_samples=4, lower=-4, upper=4)
            for i, jam_out in enumerate(pitch2.transform(jam)):
                muda.save('audio1/test/'+name+'_p2_{:02d}.wav'.format(i), 'audio1/test/'+name+'_p2_{:02d}.jams'.format(i), jam_out)
            # time stretching
            tstretch = muda.deformers.LogspaceTimeStretch(n_samples=4, lower=-3.5, upper=3.5)
            for i, jam_out in enumerate(tstretch.transform(jam)):
                muda.save('audio1/test/'+name+'_ts_{:02d}.wav'.format(i), 'audio1/test/'+name+'_ts_{:02d}.jams'.format(i), jam_out)
            # DRC
            drc = muda.deformers.DynamicRangeCompression(preset=['radio', 'film standard', 'speech', 'radio'])
            for i, jam_out in enumerate(drc.transform(jam)):
                # NOTE: format(0) reuses one filename for every preset; format(i) would keep them all
                muda.save('audio1/test/'+name+'_drc_{:02d}.wav'.format(0), 'audio1/test/'+name+'_drc_{:02d}.jams'.format(0), jam_out)

parent_dir = "audio1"
for k in range(1, 11):
    fold_name = 'fold' + str(k)
    augment_features(parent_dir, [fold_name])
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-16-9541ffff5e33> in <module>()
36 for k in range(1,11):
37 fold_name = 'fold' + str(k)
---> 38 augment_features(parent_dir,[fold_name])
<ipython-input-16-9541ffff5e33> in augment_features(parent_dir, sub_dirs, file_ext)
25 #DRC
26 drc = muda.deformers.DynamicRangeCompression(preset=['radio','film standard', 'speech', 'radio'])
---> 27 for i, jam_out in enumerate(drc.transform(jam)):
28 muda.save('audio1/test/'+name+'_drc_{:02d}.wav'.format(0),'audio1/test/'+name+'_drc_{:02d}.jams'.format(0), jam_out)
29 #for i, jam_out in enumerate(drc.transform(jam)):
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/base.pyc in transform(self, jam)
142
143 for state in self.states(jam):
--> 144 yield self._transform(jam, state)
145
146 @property
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/base.pyc in _transform(self, jam, state)
110
111 if hasattr(self, 'audio'):
--> 112 self.audio(jam_w.sandbox.muda, state)
113
114 if hasattr(self, 'metadata'):
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/deformers/sox.pyc in audio(mudabox, state)
146 mudabox._audio['y'] = drc(mudabox._audio['y'],
147 mudabox._audio['sr'],
--> 148 state['preset'])
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/deformers/sox.pyc in drc(y, sr, preset)
91 '''
92
---> 93 return __sox(y, sr, 'compand', *PRESETS[preset])
94
95
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda-0.1.2-py2.7.egg/muda/deformers/sox.pyc in __sox(y, sr, *args)
57 arguments.extend(args)
58
---> 59 subprocess.check_call(arguments)
60
61 y_out, sr = psf.read(outfile)
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
534 check_call(["ls", "-l"])
535 """
--> 536 retcode = call(*popenargs, **kwargs)
537 if retcode:
538 cmd = kwargs.get("args")
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/subprocess.pyc in call(*popenargs, **kwargs)
521 retcode = call(["ls", "-l"])
522 """
--> 523 return Popen(*popenargs, **kwargs).wait()
524
525
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
709 p2cread, p2cwrite,
710 c2pread, c2pwrite,
--> 711 errread, errwrite)
712 except Exception:
713 # Preserve original exception in case os.close raises.
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
1341 raise
1342 child_exception = pickle.loads(data)
-> 1343 raise child_exception
1344
1345
OSError: [Errno 2] No such file or directory
/home/js7561/dev/librosa/librosa/core/pitch.py:157: DeprecationWarning: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer.
bins = np.linspace(-0.5, 0.5, np.ceil(1./resolution), endpoint=False)
Explicitly casting the output of np.ceil to int should fix this, yeah?
Brian, I just started reading through your paper and decided to take a look at the code (rather than just read a paper that describes the code).
Your documentation (README) is a bit lacking on how to use your library. While an experienced developer can grok the code and make heads or tails of it, your best bet would be to include code samples of usage and an explanation of your design patterns.
These are great examples of the level of detail (while somewhat overkill) for a good project:
http://kenwheeler.github.io/slick/
https://docs.angularjs.org/api
EDIT: I had a second look. I noticed you commented the code (which I think is fantastic); however, most practitioners would dive into the README and look for clearly readable (Markdown) usage examples.
This should be relatively easy to do.
The pipeline object allows for cascading of multiple deformers into a single chain.
Sometimes, we just want to apply a set of deformations independently, but there's no direct way to do that currently.
It would be easy to implement something analogous to a sklearn feature union that does round-robin sampling from each deformer.
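The round-robin idea can be sketched independently of muda. The generator below is a hypothetical helper, not an existing API: it takes any number of iterables (for example, the outputs of several deformers' transform calls) and yields one item from each in turn until all are exhausted.

```python
def round_robin(*iterables):
    """Yield one item from each iterable in turn until all are exhausted."""
    iterators = [iter(it) for it in iterables]
    while iterators:
        still_active = []
        for it in iterators:
            try:
                item = next(it)
            except StopIteration:
                # This source is finished; drop it from later rounds.
                continue
            still_active.append(it)
            yield item
        iterators = still_active
```

Because it is a generator, nothing is computed until the caller asks for the next item, which matters when each item is an expensive audio deformation.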
In order to fully specify the BackgroundNoise deformation in the stored state, we need to know the start and stop sample indices, not just the weight and the file.
In my current project, my jams files completely specify my audio, and at training time the audio can be synthesized from the jams file. I'd ideally augment these jams files with muda without having to synthesize them first, and then I'd simply save the muda-augmented jams files. At training time, I'd synthesize the audio and process the muda deformations.
Is it possible to use muda without passing the initial audio file? It seems that right now, if I don't use muda.load_jam_audio() to process my jams file (does it just add the empty history and library versions to the jam?), it errors when I call the transform method of my pipeline.
Is there a reason muda needs the audio file before actually processing the audio?
Simple(ish) deformers (many from audio degradation toolbox):
duration == 0 and duplicate them at some random offset and degradation in confidence
Advanced deformers:
Title says it all, but here's how:
>>> J.sandbox['audio'] = {'y': y, 'sr': sr, 'abce_history': [] }
This way, each transformer can have its own specified order of operations.
Stochastic iterators can be more simply designed with multiple inheritance.
A special state information function can be abstracted out, and all transformers (iterable or not) can be rewritten to access parameters through the state object. This way, sampling and transformation become separate.
This might even make more sense as a context manager.
I am trying to use these (https://github.com/justinsalamon/UrbanSound8K-JAMS) jams files to deform sound files.
I am trying to use BaseTransformer after creating the jams object from the jams file and the corresponding sound file, like this:
j_orig = muda.load_jam_audio('orig.jams', 'orig.ogg')
deformer = muda.base.BaseTransformer()
for jam_out in deformer.transform(j_orig):
    process(jam_out)
But when I do this I get a NotImplementedError.
If I have to create deformations by creating objects from the muda.deformers.* classes, then what is the point of loading annotated jams files? Please help me understand the process.
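A NotImplementedError from a base class is typical of an abstract-base pattern, where only subclasses say which deformations to generate. The sketch below is a simplified, hypothetical model of that pattern and not muda's actual implementation; the class names, the `apply` method and the use of a plain list as the "jam" are all assumptions for illustration.

```python
class BaseTransformer:
    """Hypothetical stand-in for an abstract deformer base class."""

    def states(self, jam):
        # The base class does not know which deformations to generate.
        raise NotImplementedError

    def apply(self, jam, state):
        raise NotImplementedError

    def transform(self, jam):
        # One output per state produced by the concrete subclass.
        for state in self.states(jam):
            yield self.apply(jam, state)


class Gain(BaseTransformer):
    """Toy deformer: scales every sample by each of the given factors."""

    def __init__(self, factors):
        self.factors = list(factors)

    def states(self, jam):
        for factor in self.factors:
            yield {"factor": factor}

    def apply(self, jam, state):
        return [sample * state["factor"] for sample in jam]
```

In this model the loaded data is the input that every concrete deformer modifies, which is one way to read the role of the annotated jams file in the question above.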
JAMS is likely to drop the pandas dataframe backing in the near future. Even in the short-term, pandas 0.20 breaks a variety of things involving in-place manipulation, so we should really just get out ahead of it and do things properly.
A simple deformation to take an input object (mudabox) and carve it into a sequence of clips, of size as specified by the user.
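A rough sketch of the clipping step on a plain sample buffer, leaving annotations aside. The function name and its parameters are hypothetical, and dropping a trailing partial clip is a choice made here for simplicity.

```python
def carve_clips(samples, sr, clip_duration, hop_duration=None):
    """Yield (start_time_in_seconds, clip) pairs of fixed-length clips.

    A trailing clip shorter than `clip_duration` is dropped.
    """
    clip_len = int(round(clip_duration * sr))
    hop_len = clip_len if hop_duration is None else int(round(hop_duration * sr))
    if clip_len <= 0 or hop_len <= 0:
        raise ValueError("clip and hop durations must be positive")
    for start in range(0, len(samples) - clip_len + 1, hop_len):
        yield start / sr, samples[start:start + clip_len]
```

The start time returned with each clip is what an annotation-aware version would need in order to trim and shift the matching observations.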
I probably missed something very basic. I was trying to test the muda code examples from the documentation as follows.
jams_obj = muda.load_jam_audio(jam_file, song_file)
pitch = muda.deformers.LinearPitchShift(n_samples=5, lower=-2, upper=2)
for i, jam_out in enumerate(pitch.transform(jams_obj)):
    muda.save('output_{:02d}.wav'.format(i),
              'output_{:02d}.jams'.format(i),
              jam_out)
and I encountered the following error message:
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-11-589bfe4aecc5> in <module>()
----> 1 for i, jam_out in pitch.transform(jams_obj):
2 muda.save('output_{:02d}.ogg'.format(i),
3 'output_{:02d}.jams'.format(i),
4 jam_out)
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/muda/base.pyc in transform(self, jam)
141
142 for state in self.states(jam):
--> 143 yield self._transform(jam, state)
144
145 @property
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/muda/base.pyc in _transform(self, jam, state)
109
110 if hasattr(self, 'audio'):
--> 111 self.audio(jam_w.sandbox.muda, state)
112
113 if hasattr(self, 'metadata'):
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/muda/deformers/pitch.pyc in audio(mudabox, state)
75 mudabox._audio['y'] = pyrb.pitch_shift(mudabox._audio['y'],
76 mudabox._audio['sr'],
---> 77 state['n_semitones'])
78
79 @staticmethod
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pyrubberband/pyrb.pyc in pitch_shift(y, sr, n_steps, rbargs)
163 rbargs.setdefault('--pitch', n_steps)
164
--> 165 return __rubberband(y, sr, **rbargs)
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pyrubberband/pyrb.pyc in __rubberband(y, sr, **kwargs)
64 arguments.extend([infile, outfile])
65
---> 66 subprocess.check_call(arguments)
67
68 # Load the processed audio.
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
533 check_call(["ls", "-l"])
534 """
--> 535 retcode = call(*popenargs, **kwargs)
536 if retcode:
537 cmd = kwargs.get("args")
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in call(*popenargs, **kwargs)
520 retcode = call(["ls", "-l"])
521 """
--> 522 return Popen(*popenargs, **kwargs).wait()
523
524
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
708 p2cread, p2cwrite,
709 c2pread, c2pwrite,
--> 710 errread, errwrite)
711 except Exception:
712 # Preserve original exception in case os.close raises.
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
1333 raise
1334 child_exception = pickle.loads(data)
-> 1335 raise child_exception
1336
1337
OSError: [Errno 2] No such file or directory
Please give me any pointers on what I have been missing... I checked the content of jams_obj and the audio is loaded.
At this point, we only use it for a printing helper function used by __repr__ functions. Since that function itself depends only on numpy and six, we may as well copy it over and drop the import.
It would be helpful to be able to pass jams kwargs to muda's load_jam_audio() and save() methods. Example use case: I want to skip json validation upon loading (via muda.load_jam_audio()) or saving (via muda.save()).
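The request amounts to forwarding extra keyword arguments to the underlying loader. A minimal sketch of that pattern follows; both functions are hypothetical stand-ins (the `validate` and `strict` keywords are assumed options of the underlying loader), and neither is the real muda or jams function.

```python
def load_annotations(path, validate=True, strict=True):
    """Hypothetical stand-in for the underlying annotation loader."""
    return {"path": path, "validated": validate, "strict": strict}


def load_jam_audio(jam_path, audio_path, **jam_kwargs):
    """Load annotations, passing any extra keyword arguments straight through."""
    jam = load_annotations(jam_path, **jam_kwargs)
    jam["audio_path"] = audio_path
    return jam
```

A caller could then write `load_jam_audio('a.jams', 'a.wav', validate=False)` without the wrapper having to know about each loader option.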
Attempting to apply a time stretch transformation to an audio file raises a TypeError because a None value is divided by the stretch rate.
import jams
import muda
audiofile = "D:/temp/test.mp3"
jam = jams.JAMS()
jam = muda.jam_pack(jam)
jam = muda.core.load_jam_audio(jam, audiofile)
# this works
P = muda.deformers.PitchShift(n_semitones=5)
out_jams = list(P.transform(jam))
# this not
T = muda.deformers.TimeStretch(rate=2.0)
out_jams = list(T.transform(jam))
produces the following error
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-30-603bfe0e77bc> in <module>()
13
14 T = muda.deformers.TimeStretch(rate=2.0)
---> 15 out_jams = list(T.transform(jam))
C:\Anaconda\lib\site-packages\muda-0.1.1-py2.7.egg\muda\base.pyc in transform(self, jam)
141
142 for state in self.states(jam):
--> 143 yield self._transform(jam, state)
144
145 @property
C:\Anaconda\lib\site-packages\muda-0.1.1-py2.7.egg\muda\base.pyc in _transform(self, jam, state)
112
113 if hasattr(self, 'metadata'):
--> 114 self.metadata(jam_w.file_metadata, state)
115
116 # Walk over the list of deformers
C:\Anaconda\lib\site-packages\muda-0.1.1-py2.7.egg\muda\deformers\time.pyc in metadata(metadata, state)
40 def metadata(metadata, state):
41 # Deform the metadata
---> 42 metadata.duration /= state['rate']
43
44 @staticmethod
TypeError: unsupported operand type(s) for /=: 'NoneType' and 'float'
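The traceback shows that metadata.duration is None at the point where it is divided by the rate, which is consistent with the JAMS object having been created empty. One possible guard is sketched below; the function is hypothetical and is not necessarily how the library resolved the issue.

```python
def stretch_duration(duration, rate):
    """Return the duration after time-stretching, or None if it is unknown."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    if duration is None:
        # Nothing to rescale: leave the duration unset.
        return None
    return duration / rate
```

Setting the duration in the file metadata before transforming would be another way to avoid the error.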
Probably we want to depend on librosa 0.4(-dev) and enable caching on a per-object basis.
Librosa 0.7 deprecated some features that we use (IO) to be removed in 0.8. We should future-proof muda against this.
Hi:
I would like to do data-augmentation for my bunch of wav files.
Can I use MUDA without JAMS file?
Thanks!
Hello!
I am reproducing a paper by Salamon & Bello, and need to make transformations according to the jams files in UrbanSound8K-JAMS, for which I am using muda.replay().
It seems that they used a previous version of Background noise which didn't use start and stop parameters, probably prior to this commit. I am thinking of just setting a default value for the case where there is no start and stop in the state, something like this:
try:
    start = state['start']
    stop = state['stop']
except KeyError:
    start = 0
    stop = len(mudabox._audio['y'])
I will post if this works fine later on. I would appreciate some feedback on this.
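The proposed fallback can be wrapped in a small function so that it is easy to try on old-style and new-style states. The function name is made up, and `n_samples` stands for `len(mudabox._audio['y'])` from the snippet above.

```python
def clip_bounds(state, n_samples):
    """Return (start, stop), defaulting to the whole signal for old states."""
    try:
        return state['start'], state['stop']
    except KeyError:
        # Older states carry no clip indices: use the full length.
        return 0, n_samples
```

For example, `clip_bounds({'weight': 0.5}, 1000)` falls back to `(0, 1000)`, while a state that has both keys is returned unchanged.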
The PitchShift deformer only takes a single semitone parameter (n_semitones), which makes it a little awkward to work with when you want to perform multiple pitch shifts. LinearPitchShift can perform multiple shifts, but they have to be linearly spaced, which might not be the desired functionality (e.g. I may want n_semitones = [ -2, -1, 1, 2]).
It would be nice if PitchShift would accept (in addition to a single value) a list for the n_semitones parameter, in which case it would generate an output audio/jams for every pitch shift value in the list (times n_samples). This would make the behavior more consistent with other deformers (e.g. DynamicRangeCompression), and would allow (what I'm really after) writing more generic code that can apply a deformer agnostically of which deformer it actually is because the deformer is fully defined during initialization.
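Accepting either a single value or a list mostly comes down to normalizing the parameter once. A minimal sketch, with a hypothetical function name and a plain dict standing in for the deformer state:

```python
from numbers import Real


def pitch_shift_states(n_semitones):
    """Yield one state per requested shift; accepts a scalar or an iterable."""
    if isinstance(n_semitones, Real):
        shifts = [n_semitones]
    else:
        shifts = list(n_semitones)
    for shift in shifts:
        yield {"n_semitones": float(shift)}
```

With this, `pitch_shift_states(5)` and `pitch_shift_states([-2, -1, 1, 2])` go through the same code path, which is what makes deformer-agnostic calling code possible.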
Currently, JAMS objects are being used via the top-level sandbox to ferry data through deformation pipelines. This is a little clunky for a few reasons, some more obvious than others. For my part, a big one is transforming JAMS without audio / transforming audio without JAMS.
The important thing to note though is that the JAMS object is pretty powerful, which makes it super easy to do things with and to it. We can't say the same for the audio signal, and the JAMS object doesn't (and shouldn't) offer similar functionality for wrangling muda history, for example.
I'd be keen to encapsulate audio and annotation data as separate attributes of a Payload object (or what have you) that can pass through the deformer pipeline agnostically. Putting some smarts into the different containers will also make it easier to introduce other audio deformations later, like stereo / spatialization, and keep good records on applied deformations.
And, as another win (in my book at least), it could allow us to leverage different audio reading/writing backends, which can be justifiable in different scenarios.
Thoughts?
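To make the proposal a little more concrete, here is one possible shape for such a container. Everything in it (the field names, the `record` method) is an assumption for illustration; the original post does not specify a design.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Payload:
    """Carries audio and annotations side by side; either may be absent."""

    audio: Optional[List[float]] = None
    sr: Optional[int] = None
    annotations: Optional[Any] = None
    history: List[dict] = field(default_factory=list)

    def record(self, transformer, state):
        """Append one applied deformation to the history and return self."""
        self.history.append({"transformer": transformer, "state": dict(state)})
        return self
```

A payload with `audio=None` would let annotation-only deformations run, and one with `annotations=None` would cover audio-only use.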
Make a script to index all generated jams according to their deformation parameters and source id
Currently, muda.save is a destructive operation: when the jam is serialized, we first pop out the audio buffer jam.sandbox.muda['y'] to avoid serializing the audio. This is obviously bad.
JObject serialization already skips fields that start with _.
A simple fix here would be the following:
- [...] jam_pack to construct muda as a JObject (or Sandbox) rather than a dict
- [...] muda._audio['y'] and muda._audio['sr']
This way, muda._audio will be skipped during serialization, and save can be a non-destructive operation.
I tried to run the muda code example from the documentation:
import muda
import librosa
clip = muda.load_jam_audio('audio/7061-6-0-0_bgnoise0.jams', 'audio/6902-2-0-7.wav')
pitch = muda.deformers.LinearPitchShift(n_samples=5, lower=-1, upper=1)
for i, jam_out in enumerate(pitch.transform(clip)):
    muda.save('output_{:02d}.wav'.format(i), 'output_{:02d}.jams'.format(i), jam_out)
But this error occurs:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-7-e235d022c3b3> in <module>()
1 pitch = muda.deformers.LinearPitchShift(n_samples=5,lower=-1,upper=1)
----> 2 for i, jam_out in pitch.transform(clip):
3 muda.save('output_{:02d}.wav'.format(i),'output_{:02d}.jams'.format(i),jam_out)
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda/base.pyc in transform(self, jam)
140 '''
141
--> 142 for state in self.states(jam):
143 yield self._transform(jam, state)
144
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda/deformers/pitch.pyc in states(self, jam)
251 endpoint=True)
252
--> 253 for state in AbstractPitchShift.states(self, jam):
254 for n_semitones in shifts:
255 state['n_semitones'] = n_semitones
/home/uri7910/anaconda2/envs/tensorflow011/lib/python2.7/site-packages/muda/deformers/pitch.pyc in states(self, jam)
67 def states(self, jam):
68 mudabox = jam.sandbox.muda
---> 69 state = dict(tuning=librosa.estimate_tuning(y=mudabox._audio['y'],
70 sr=mudabox._audio['sr']))
71 yield state
AttributeError: 'dict' object has no attribute '_audio'
This is minor, but the default values for the LogspaceTimeStretch deformer are 0.8 and 1.2. If one doesn't read the docs carefully one might assume this means stretching time by a factor of 0.8 (make audio shorter) to 1.2 (make audio longer), though in fact these values correspond to time-stretch factors between 1.74 and 2.3 (2^lower -- 2^upper), which seems somewhat arbitrary?
Setting the default values to -0.3 and +0.3 (for example) would give stretch factors in the range [0.81, 1.23] which perhaps makes more sense as default values?
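The numbers quoted above can be reproduced with a few lines of plain Python, assuming (as the post describes) that the rates are 2 raised to exponents spaced evenly between lower and upper. The function name is hypothetical.

```python
def logspace_rates(n_samples, lower, upper):
    """Return n_samples stretch rates 2**e, with e evenly spaced in [lower, upper]."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    if n_samples == 1:
        return [2.0 ** lower]
    step = (upper - lower) / (n_samples - 1)
    return [2.0 ** (lower + i * step) for i in range(n_samples)]


if __name__ == "__main__":
    print([round(r, 2) for r in logspace_rates(2, 0.8, 1.2)])
    print([round(r, 2) for r in logspace_rates(3, -0.3, 0.3)])
```

A symmetric exponent range around zero always includes a rate close to 1 when n_samples is odd, which is another argument for defaults such as -0.3 and +0.3.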
I've got the following error when I run this code on Spyder in Windows 10.
Did I miss anything from installation?
Thanks
drc = muda.deformers.DynamicRangeCompression(preset=preset)
for i, jam_out in enumerate(drc.transform(j_orig)):
    ....
File "<ipython-input-36-e0cb52928d80>", line 1, in <module>
runfile('Z:/tingyao.chen/2018_0503_ASC/muda/muda_test.py', wdir='Z:/tingyao.chen/2018_0503_ASC/muda')
File "C:\Users\yao\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
execfile(filename, namespace)
File "C:\Users\yao\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "Z:/tingyao.chen/2018_0503_ASC/muda/muda_test.py", line 23, in <module>
for i, jam_out in enumerate(drc.transform(j_orig)):
File "C:\Users\yao\Anaconda3\lib\site-packages\muda\base.py", line 145, in transform
yield self._transform(jam, state)
File "C:\Users\yao\Anaconda3\lib\site-packages\muda\base.py", line 113, in _transform
self.audio(jam_w.sandbox.muda, state)
File "C:\Users\yao\Anaconda3\lib\site-packages\muda\deformers\sox.py", line 148, in audio
state['preset'])
File "C:\Users\yao\Anaconda3\lib\site-packages\muda\deformers\sox.py", line 93, in drc
return __sox(y, sr, 'compand', *PRESETS[preset])
File "C:\Users\yao\Anaconda3\lib\site-packages\muda\deformers\sox.py", line 59, in __sox
subprocess.check_call(arguments)
File "C:\Users\yao\Anaconda3\lib\subprocess.py", line 286, in check_call
retcode = call(*popenargs, **kwargs)
File "C:\Users\yao\Anaconda3\lib\subprocess.py", line 267, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Users\yao\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 210, in __init__
super(SubprocessPopen, self).__init__(*args, **kwargs)
File "C:\Users\yao\Anaconda3\lib\subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "C:\Users\yao\Anaconda3\lib\subprocess.py", line 997, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified.
I have a dataset of audio clips of the same length. Half of these clips are positive (contain a bird flight call), while the other half is negative (contain only background noise).
I want to augment the dataset by mixing clips together, without changing the label.
But I ran into an error in 'muda.deformers.background.sample_clip_indices'. If I understand the stack trace correctly (see below my signature), the error happens when executing
start = np.random.randint(0, len(soundf) - n_target)
with len(soundf) - n_target equal to zero.
I made a Gist to reproduce the bug:
https://gist.github.com/lostanlen/15fe9c879fdd24fe9023fa430314cd51
It disappears when the difference in lengths is strictly larger than zero.
Is this expected behavior? It seems to me that my issue could be fixed with
if len(soundf) > n_target:
    start = np.random.randint(0, len(soundf) - n_target)
else:
    start = 0
Best,
Vincent.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-43-9ecbf36219b3> in <module>()
39 # Create short deformer
40 short_deformer = muda.deformers.BackgroundNoise(files=[short_noise_path])
---> 41 short_jam_transformer = next(short_deformer.transform(jam_original)) # error
/Users/vl238/miniconda3/lib/python3.5/site-packages/muda/base.py in transform(self, jam)
142 '''
143
--> 144 for state in self.states(jam):
145 yield self._transform(jam, state)
146
/Users/vl238/miniconda3/lib/python3.5/site-packages/muda/deformers/background.py in states(self, jam)
154 for fname in self.files:
155 for _ in range(self.n_samples):
--> 156 start, stop = sample_clip_indices(fname, len(mudabox._audio['y']), mudabox._audio['sr'])
157 yield dict(filename=fname,
158 weight=np.random.uniform(low=self.weight_min,
/Users/vl238/miniconda3/lib/python3.5/site-packages/muda/deformers/background.py in sample_clip_indices(filename, n_samples, sr)
40
41 # Draw a random clip
---> 42 start = np.random.randint(0, len(soundf) - n_target)
43 stop = start + n_target
44
mtrand.pyx in mtrand.RandomState.randint (numpy/random/mtrand/mtrand.c:16117)()
ValueError: low >= high
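The error follows from the upper bound of np.random.randint being exclusive: with equal lengths the call asks for an integer in the empty range [0, 0). A variant of the proposed fix is to include the last valid start index in the range, which handles the equal-length case without a separate branch. The sketch below uses the standard library's `random.randrange` (which also excludes its upper bound) in place of numpy, and the function name is made up.

```python
import random


def sample_clip_start(n_source, n_target, rng=random):
    """Pick a start index so that a clip of n_target samples fits in n_source."""
    if n_target > n_source:
        raise ValueError("source is shorter than the requested clip")
    # The upper bound is exclusive, so add 1 to allow the last valid start.
    return rng.randrange(0, n_source - n_target + 1)
```

Unlike the if/else version, this also lets the final position be drawn when the source is longer than the target; whether a shorter source should raise or be padded is a separate design decision.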
At least two ideas jump out at me re: reproducibility:
- RandomDoAThing deformers could optionally take seed params, but always use one internally (and serialize accordingly).
- [...] state, which isn't the case for RandomDoAThing deformers, or (b) there's a higher-level object that combines state and pipeline as different objects. The difference here is small (and maybe semantic), but it's a difference between a class and an instance (the pipeline is the class, the state is the instance). This might have interesting repercussions for the design of the Pipeline, which is perhaps more aptly called a PipelineFactory.
Please yell if any of this is unclear, I'm kind of stream-of-consciousness working through the idea.
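The first idea (an optional seed that is always materialized internally) could look roughly like this. `RandomGain` is a made-up deformer, and the standard library's `random` module is used in place of numpy to keep the sketch self-contained.

```python
import random


class RandomGain:
    """Toy randomized deformer that always records the seed it used."""

    def __init__(self, n_samples=3, seed=None):
        if seed is None:
            # No seed given: draw one, so the run can still be replayed later.
            seed = random.SystemRandom().randrange(2 ** 32)
        self.n_samples = n_samples
        self.seed = seed                 # plain integer, easy to serialize
        self._rng = random.Random(seed)  # private generator, not serialized

    def states(self):
        for _ in range(self.n_samples):
            yield {"gain": self._rng.uniform(0.5, 1.5)}
```

Rebuilding the object from its stored seed reproduces the same sequence of states, even when the caller never chose a seed.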