madebyollin / acapellabot

Acapella Extraction with a ConvNet

Home Page: http://madebyoll.in/posts/cnn_acapella_extraction/

Python 100.00%

acapellabot's Issues

how to extract only the music part?

The current model outputs only the vocals. Could you add an option to output just the background music as well?
Thanks
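One common way to approximate the instrumental, sketched here as an assumption rather than a feature of acapellabot, is to subtract the predicted vocal magnitudes from the mixture magnitudes and clamp negative values to zero. The function name `estimate_instrumental` is hypothetical, and the sketch assumes both spectrograms have the same shape and the same scale.

```python
import numpy as np


def estimate_instrumental(mix_mag, vocal_mag):
    """Subtract predicted vocal magnitudes from the mixture, clamping at zero."""
    mix_mag = np.asarray(mix_mag, dtype=np.float32)
    vocal_mag = np.asarray(vocal_mag, dtype=np.float32)
    if mix_mag.shape != vocal_mag.shape:
        raise ValueError("mixture and vocal spectrograms must have the same shape")
    # Magnitudes cannot be negative, so bins where the model over-estimates
    # the vocals are set to zero instead of going below it.
    return np.maximum(mix_mag - vocal_mag, 0.0)


# Tiny made-up magnitude spectrograms (frequency bins x time frames).
mix = np.array([[4.0, 2.0], [1.0, 3.0]])
vocals = np.array([[1.0, 2.5], [0.0, 3.0]])
instrumental = estimate_instrumental(mix, vocals)
print(instrumental)
```

The result still has to be converted back to audio with the mixture's phase, and the quality depends on how clean the vocal estimate is.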

Indexing elements must be in increasing order

When I run the following command (demo.mp3 is my example track):

python acapellabot.py demo.mp3

I get these errors:

Traceback (most recent call last):
  File "C:\Program Files\Anaconda3\lib\site-packages\h5py\_hl\selections.py", line 85, in select
    int(a)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "acapellabot.py", line 148, in <module>
    acapellabot.loadWeights(args.weights)
  File "acapellabot.py", line 88, in loadWeights
    self.model.load_weights(path)
  File "C:\Program Files\Anaconda3\lib\site-packages\keras\engine\topology.py", line 2500, in load_weights
    load_weights_from_hdf5_group(f, self.layers)
  File "C:\Program Files\Anaconda3\lib\site-packages\keras\engine\topology.py", line 2900, in load_weights_from_hdf5_group
    original_backend)
  File "C:\Program Files\Anaconda3\lib\site-packages\keras\engine\topology.py", line 2841, in preprocess_weights_for_loading
    weights[0] = conv_utils.convert_kernel(weights[0])
  File "C:\Program Files\Anaconda3\lib\site-packages\keras\utils\conv_utils.py", line 86, in convert_kernel
    return np.copy(kernel[slices])
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (C:\Minonda\conda-bld\h5py_1474482825505\work\h5py\_objects.c:2705)
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (C:\Minonda\conda-bld\h5py_1474482825505\work\h5py\_objects.c:2663)
  File "C:\Program Files\Anaconda3\lib\site-packages\h5py\_hl\dataset.py", line 462, in __getitem__
    selection = sel.select(self.shape, args, dsid=self.id)
  File "C:\Program Files\Anaconda3\lib\site-packages\h5py\_hl\selections.py", line 88, in select
    sel[args]
  File "C:\Program Files\Anaconda3\lib\site-packages\h5py\_hl\selections.py", line 356, in __getitem__
    if sorted(arg) != list(arg):
TypeError: unorderable types: NoneType() < int()

My environment is as follows:

python : 3.5.2
Tensorflow : 1.1.0
Keras : 2.0.3
librosa : 0.50
h5py : 2.6.0

On Windows 10.

TypeError: 'float' object cannot be interpreted as an index

('\x1b[33m', 'Retrieved spectrogram; processing...', '\x1b[0m')
Traceback (most recent call last):
  File "acapellabot.py", line 147, in <module>
    acapellabot.isolateVocals(f, args.fft, args.phase)
  File "acapellabot.py", line 96, in isolateVocals
    expandedSpectrogram = conversion.expandToGrid(spectrogram, self.peakDownscaleFactor)
  File "/home/arka/Python/Working/VocalSeparation/vocal_separator/acapellabot-master/conversion.py", line 23, in expandToGrid
    newSpectrogram = np.zeros((newX, newY))
TypeError: 'float' object cannot be interpreted as an index
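The tuple-style log line suggests this run used Python 2, where `math.ceil` returns a float, and NumPy refuses float array dimensions. A minimal sketch of an integer-safe size computation follows. `padded_size` is a hypothetical helper, not a function from conversion.py, and the grid size of 4 is an assumption based on the two stride-2 layers in the model.

```python
import math


def padded_size(length, grid_size):
    """Round length up to the next multiple of grid_size, always as an int."""
    # float() keeps the division exact on Python 2, and int() makes the result
    # usable as an array dimension on both Python 2 and Python 3.
    return int(math.ceil(length / float(grid_size))) * grid_size


new_rows = padded_size(769, 4)    # 769 frequency bins, as in the logs on this page
new_cols = padded_size(13525, 4)
print(new_rows, new_cols)
```

Both values can then be passed to `np.zeros((new_rows, new_cols))` without the `TypeError`, and the padding that the network needs is kept.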

TypeError: Indexing elements must be in increasing order

$ python acapellabot.py test.mp3
Using TensorFlow backend.
('\x1b[33m', 'Model has 668225 params', '\x1b[0m')
('\x1b[33m', "Weights provided; performing inference on ['test.mp3']...", '\x1b[0m')
('\x1b[1m', 'Loading weights', '\x1b[0m')
Traceback (most recent call last):
  File "acapellabot.py", line 145, in <module>
    acapellabot.loadWeights(args.weights)
  File "acapellabot.py", line 88, in loadWeights
    self.model.load_weights(path)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2500, in load_weights
    load_weights_from_hdf5_group(f, self.layers)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2900, in load_weights_from_hdf5_group
    original_backend)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2841, in preprocess_weights_for_loading
    weights[0] = conv_utils.convert_kernel(weights[0])
  File "/usr/local/lib/python2.7/dist-packages/keras/utils/conv_utils.py", line 86, in convert_kernel
    return np.copy(kernel[slices])
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip_build_root/h5py/h5py/_objects.c:2840)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip_build_root/h5py/h5py/_objects.c:2798)
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/dataset.py", line 474, in __getitem__
    selection = sel.select(self.shape, args, dsid=self.id)
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/selections.py", line 90, in select
    sel[args]
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/selections.py", line 361, in __getitem__
    raise TypeError("Indexing elements must be in increasing order")
TypeError: Indexing elements must be in increasing order

My environment is as follows:
ubuntu 14.04
Python 2.7.6
tensorflow (1.0.1)
tensorflow-gpu (1.0.1)
Keras (2.0.3)
h5py (2.7.0)
librosa (0.5.0)
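Both reports of this error run on the TensorFlow backend, and both tracebacks pass through `convert_kernel`, which Keras 2.0.x appears to call only when the weights file was saved under a different backend than the active one. One thing to try, offered as an assumption and not a confirmed fix, is to select the Theano backend before Keras is imported. `KERAS_BACKEND` is the environment variable that multi-backend Keras reads for this.

```python
import os

# Must run before the first `import keras`: Keras reads this variable once,
# at import time, and ignores later changes.
os.environ["KERAS_BACKEND"] = "theano"

selected = os.environ["KERAS_BACKEND"]
print("Keras backend requested:", selected)

# After this point, `import keras` should report "Using Theano backend."
# (Theano itself has to be installed for that import to succeed.)
```

The same choice can be made permanently through the "backend" field of ~/.keras/keras.json.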

CorrMM issue

Hi,
I tried running your code, but this error was raised. I searched everywhere and could not find a useful solution. I'm a beginner, so please help me with it. Thanks a lot. Since it looks like a memory issue: I am using a VM with 8 GB of memory, running deepin.

Using Theano backend.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
 Model has 668225 params 
 Weights provided; performing inference on ['gem.wav']... 
 Loading weights 
 Attempting to isolate vocals from gem.wav 
 Retrieved spectrogram; processing... 
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/function_module.py", line 903, in __call__
    self.fn() if output_subset is None else\
RuntimeError: CorrMM failed to allocate working memory of 1 x 1024 x 2603755

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/wawang250/PycharmProjects/acapellabot/acapellabot.py", line 147, in <module>
    acapellabot.isolateVocals(f, args.fft, args.phase)
  File "/home/wawang250/PycharmProjects/acapellabot/acapellabot.py", line 98, in isolateVocals
    predictedSpectrogramWithBatchAndChannels = self.model.predict(expandedSpectrogramWithBatchAndChannels)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1790, in predict
    verbose=verbose, steps=steps)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1299, in _predict_loop
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/theano_backend.py", line 1224, in __call__
    return self.function(*inputs)
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/function_module.py", line 917, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python3.5/dist-packages/theano/gof/link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/lib/python3/dist-packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/function_module.py", line 903, in __call__
    self.fn() if output_subset is None else\
RuntimeError: CorrMM failed to allocate working memory of 1 x 1024 x 2603755

Apply node that caused the error: CorrMM{half, (2, 2), (1, 1), 1 False}(InplaceDimShuffle{0,3,1,2}.0, Subtensor{::, ::, ::int64, ::int64}.0)
Toposort index: 81
Inputs types: [TensorType(float32, 4D), TensorType(float32, 4D)]
Inputs shapes: [(1, 64, 769, 13525), (64, 64, 4, 4)]
Inputs strides: [(2662585600, 41602900, 54100, 4), (4, 256, -65536, -16384)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Subtensor{int64:int64:int8, int64:int64:int8, int64:int64:int8, :int64:}(CorrMM{half, (2, 2), (1, 1), 1 False}.0, ScalarFromTensor.0, ScalarFromTensor.0, Constant{1}, Constant{0}, Constant{64}, Constant{1}, ScalarFromTensor.0, ScalarFromTensor.0, Constant{1}, ScalarFromTensor.0)]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "/home/wawang250/PycharmProjects/acapellabot/acapellabot.py", line 130, in <module>
    acapellabot = AcapellaBot()
  File "/home/wawang250/PycharmProjects/acapellabot/acapellabot.py", line 31, in __init__
    conv = Conv2D(64, 4, strides=2, activation='relu', padding='same', use_bias=False)(convA)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 603, in __call__
    output = self.call(inputs, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/convolutional.py", line 164, in call
    dilation_rate=self.dilation_rate)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/theano_backend.py", line 1913, in conv2d
    filter_dilation=dilation_rate)

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
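The numbers in the message can be checked by hand. The sketch below assumes that the CorrMM working memory is an im2col-style buffer with one row per (input channel, kernel row, kernel column) and one column per output position, which is what the reported `1 x 1024 x 2603755` looks like for the shapes in the log. The helper name `corrmm_buffer_bytes` is hypothetical.

```python
def corrmm_buffer_bytes(in_channels, kernel_h, kernel_w, out_h, out_w, itemsize=4):
    """Size in bytes of an im2col-style buffer for one image (float32 by default)."""
    rows = in_channels * kernel_h * kernel_w
    cols = out_h * out_w
    return rows * cols * itemsize


# Input shape from the log: (1, 64, 769, 13525), kernel (64, 64, 4, 4), stride 2.
out_h = (769 + 1) // 2      # 385 rows after the stride-2 convolution
out_w = (13525 + 1) // 2    # 6763 columns after the stride-2 convolution
needed = corrmm_buffer_bytes(64, 4, 4, out_h, out_w)

print(64 * 4 * 4, out_h * out_w)          # the two buffer dimensions
print(round(needed / 2.0 ** 30, 2), "GiB")
```

Under these assumptions the buffer alone is larger than the 8 GB VM, so a shorter audio clip or a machine with more memory would be needed for this layer to run.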

Acapellabot not responding

I downloaded the project and ran python acapellabot.py track.mp3, but acapellabot does not respond and prints no error.

How can I train this model myself?

Hello, how can I train this model myself? Could you share your dataset, or explain how to build one?

ERROR (theano.gof.opt)

Here's my error message:
ERROR (theano.gof.opt): Optimization failure due to: local_abstractconv_check
ERROR (theano.gof.opt): node: AbstractConv2d{border_mode='half', subsample=(1, 1), filter_flip=True, imshp=(None, 1, None, None), kshp=(64, 1, 3, 3)}(DimShuffle{0,3,1,2}.0, DimShuffle{3,2,0,1}.0)
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 1772, in process_node
    replacements = lopt.transform(node)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\tensor\nnet\opt.py", line 402, in local_abstractconv_check
    node.op.__class__.__name__)
AssertionError: AbstractConv2d Theano optimization failed: there is no implementation available supporting the requested options. Did you exclude both "conv_dnn" and "conv_gemm" from the optimizer? If on GPU, is cuDNN available and does the GPU support it? If on CPU, do you have a BLAS library installed Theano can link against?

Traceback (most recent call last):
  File "C:/Users/ywjys/Desktop/AcapeTest/acapellabot.py", line 147, in <module>
    acapellabot.isolateVocals(f, args.fft, args.phase)
  File "C:/Users/ywjys/Desktop/AcapeTest/acapellabot.py", line 98, in isolateVocals
    predictedSpectrogramWithBatchAndChannels = self.model.predict(expandedSpectrogramWithBatchAndChannels)
  File "C:\Users\ywjys\Desktop\AcapeTest\venv1\lib\site-packages\keras\engine\training.py", line 1164, in predict
    self._make_predict_function()
  File "C:\Users\ywjys\Desktop\AcapeTest\venv1\lib\site-packages\keras\engine\training.py", line 554, in _make_predict_function
    **kwargs)
  File "C:\Users\ywjys\Desktop\AcapeTest\venv1\lib\site-packages\keras\backend\theano_backend.py", line 1397, in function
    return Function(inputs, outputs, updates=updates, **kwargs)
  File "C:\Users\ywjys\Desktop\AcapeTest\venv1\lib\site-packages\keras\backend\theano_backend.py", line 1383, in __init__
    **kwargs)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\compile\function.py", line 320, in function
    output_keys=output_keys)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\compile\pfunc.py", line 479, in pfunc
    output_keys=output_keys)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\compile\function_module.py", line 1776, in orig_function
    output_keys=output_keys).create(
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\compile\function_module.py", line 1456, in __init__
    optimizer_profile = optimizer(fgraph)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 101, in __call__
    return self.optimize(fgraph)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 89, in optimize
    ret = self.apply(fgraph, *args, **kwargs)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 230, in apply
    sub_prof = optimizer.optimize(fgraph)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 89, in optimize
    ret = self.apply(fgraph, *args, **kwargs)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 1879, in apply
    nb += self.process_node(fgraph, node)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 1777, in process_node
    lopt, node)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 1673, in warn_inplace
    return NavigatorOptimizer.warn(exc, nav, repl_pairs, local_opt, node)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 1659, in warn
    raise exc
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 1772, in process_node
    replacements = lopt.transform(node)
  File "C:\Users\ywjys\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\tensor\nnet\opt.py", line 402, in local_abstractconv_check
    node.op.__class__.__name__)
AssertionError: AbstractConv2d Theano optimization failed: there is no implementation available supporting the requested options. Did you exclude both "conv_dnn" and "conv_gemm" from the optimizer? If on GPU, is cuDNN available and does the GPU support it? If on CPU, do you have a BLAS library installed Theano can link against?

Process finished with exit code 1
Does anyone know what to do?
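A plausible cause on Windows, stated here as an assumption because the log does not confirm it, is that Theano cannot find a C++ compiler and therefore has no compiled convolution implementation to fall back on. The quick diagnostic below only checks whether common compiler executables are on PATH. `compiler_report` is a hypothetical helper and does not call Theano at all.

```python
import shutil


def compiler_report(candidates=("g++", "gcc")):
    """Map each compiler name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in candidates}


report = compiler_report()
for name, path in report.items():
    print(name, "->", path or "not found on PATH")
```

If `g++` is missing, installing a compiler toolchain and a BLAS library that Theano can link against is the usual next step. If it is present, the BLAS question in the error message is the more likely lead.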

ValueError: all the input array dimensions except for the concatenation axis must match exactly

I made the change below in conversion.py to get rid of the error "TypeError: 'float' object cannot be interpreted as an index":
ceil(spectrogram.shape[1] / gridSize) * gridSize --> spectrogram.shape[1]
ceil(spectrogram.shape[0] / gridSize) * gridSize --> spectrogram.shape[0]

After that change, running the script gives the error below:

Traceback (most recent call last):
  File "acapellabot.py", line 147, in <module>
    acapellabot.isolateVocals(f, args.fft, args.phase)
  File "acapellabot.py", line 98, in isolateVocals
    predictedSpectrogramWithBatchAndChannels = self.model.predict(expandedSpectrogramWithBatchAndChannels)
  File "/home/sameer/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1169, in predict
    steps=steps)
  File "/home/sameer/.local/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 294, in predict_loop
    batch_outs = f(ins_batch)
  File "/home/sameer/.local/lib/python3.6/site-packages/keras/backend/theano_backend.py", line 1388, in __call__
    return self.function(*inputs)
  File "/home/sameer/.local/lib/python3.6/site-packages/theano/compile/function_module.py", line 917, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/home/sameer/.local/lib/python3.6/site-packages/theano/gof/link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/home/sameer/.local/lib/python3.6/site-packages/six.py", line 692, in reraise
    raise value.with_traceback(tb)
  File "/home/sameer/.local/lib/python3.6/site-packages/theano/compile/function_module.py", line 903, in __call__
    self.fn() if output_subset is None else\
ValueError: all the input array dimensions except for the concatenation axis must match exactly
Apply node that caused the error: Join(TensorConstant{3}, Reshape{4}.0, Elemwise{Composite{(i0 * ((i1 + i2) + Abs((i1 + i2))))}}[(0, 1)].0)
Toposort index: 207
Inputs types: [TensorType(int8, scalar), TensorType(float32, 4D), TensorType(float32, 4D)]
Inputs shapes: [(), (1, 386, 198, 128), (1, 385, 198, 64)]
Inputs strides: [(), (39131136, 101376, 512, 4), (304920, 792, 4, 304920)]
Inputs values: [array(3, dtype=int8), 'not shown', 'not shown']
Outputs clients: [[InplaceDimShuffle{0,3,1,2}(Join.0)]]
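The mismatch of 386 against 385 follows from removing the padding: a spectrogram dimension that is not a multiple of 4 does not return to its half-resolution size after two stride-2 convolutions and one upsampling step. The sketch below traces a single dimension through those layers. `skip_shapes` is a hypothetical helper, and it assumes that `padding='same'` with stride 2 yields `ceil(n / 2)`.

```python
import math


def skip_shapes(size):
    """Return (upsampled_size, convB_size) for one spatial dimension."""
    half = math.ceil(size / 2)       # first stride-2 convolution; convB lives here
    quarter = math.ceil(half / 2)    # second stride-2 convolution
    upsampled = quarter * 2          # UpSampling2D((2, 2))
    return upsampled, half


print(skip_shapes(769))   # unpadded: the two sizes differ, so Concatenate fails
print(skip_shapes(772))   # padded to a multiple of 4: the sizes agree
```

Keeping the round-up to a multiple of the grid size, and only casting the result to `int`, avoids both this error and the original float-index error.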

Fitting step runs out of memory on GPUs

Running python acapellabot.py sample.mp3 --weights weights.h5 with the pretrained model works on CPUs but crashes on GPUs because they run out of memory. I suspect it has something to do with the concatenation steps within the Keras model.

Tested on a Tesla K80:


Error when tring to find the memory information on the GPU: an illegal memory access was encountered
Error freeing device pointer 0x1205ae0000 (an illegal memory access was encountered). Driver report 0 bytes free and 0 bytes total 
CudaNdarray_uninit: error freeing self->devdata. (self=0x7f5d4a24bcb0, self->devata=0x1205ae0000)
Error when tring to find the memory information on the GPU: an illegal memory access was encountered
Error freeing device pointer 0x1205560000 (an illegal memory access was encountered). Driver report 0 bytes free and 0 bytes total 
device_free: cudaFree() returned an error, but there is already an Python error set. This happen during the clean up when there is a first error and the CUDA driver is in a so bad state that it don't work anymore. We keep the previous error set to help debugging it.CudaNdarray_uninit: error freeing self->devdata. (self=0x7f5d4c54c7b0, self->devata=0x1205560000)
Error when trying to find the memory information on the GPU: an illegal memory access was encountered
Error allocating 863849472 bytes of device memory (an illegal memory access was encountered). Driver report 0 bytes free and 0 bytes total

Tested on a GTX 1070:

   File "pygpu/gpuarray.pyx", line 1501, in pygpu.gpuarray.pygpu_concatenate
  File "pygpu/gpuarray.pyx", line 427, in pygpu.gpuarray.array_concatenate
pygpu.gpuarray.GpuArrayException: b'cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory'
Apply node that caused the error: GpuJoin(TensorConstant{3}, GpuReshape{4}.0, GpuElemwise{Composite{(i0 * ((i1 + i2) + Abs((i1 + i2))))}}[]<gpuarray>.0)
Toposort index: 418
Inputs types: [TensorType(int8, scalar), GpuArrayType<None>(float32, (False, False, False, False)), GpuArrayType<None>(float32, (False, False, False, False))]
Inputs shapes: [(), (1, 386, 8742, 128), (1, 386, 8742, 64)]
Inputs strides: [(), (1727698944, 4475904, 512, 4), (863849472, 2237952, 256, 4)]
Inputs values: [array(3, dtype=int8), 'not shown', 'not shown']
Outputs clients: [[InplaceGpuDimShuffle{0,3,1,2}(GpuJoin.0)]]
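The strides in the GTX 1070 log show that the two inputs of the join occupy 1,727,698,944 and 863,849,472 bytes, so the concatenated output alone needs roughly 2.4 GiB on top of them. One way to reduce peak memory, sketched here as an assumption and not as something acapellabot does, is to run the model on fixed-width slices of the time axis. `chunk_bounds` is a hypothetical helper, and the chunk width of 4096 frames is an arbitrary example.

```python
def chunk_bounds(total_frames, chunk_frames, grid=4):
    """Split [0, total_frames) into consecutive (start, end) ranges."""
    if chunk_frames <= 0 or chunk_frames % grid != 0:
        # Chunks that are multiples of the grid keep the skip connections aligned.
        raise ValueError("chunk_frames must be a positive multiple of grid")
    bounds = []
    start = 0
    while start < total_frames:
        end = min(start + chunk_frames, total_frames)
        bounds.append((start, end))
        start = end
    return bounds


# 17484 frames is twice the half-resolution width (8742) seen in the log.
bounds = chunk_bounds(17484, 4096)
print(bounds)
```

Each range would be predicted separately and the outputs joined along the time axis. The last chunk may still need padding to a multiple of the grid, and chunk borders can produce audible seams unless the slices overlap.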

python data.py . gives IndexError: too many indices for array

Loading Data 
	 [Instrumental] Created spectrogram for 01 The Ringer.mp3 in key 1 with shape (769, 19406) 
	 Created 0 mashups for key 0 with 0 total slices so far 
	 Created 0 mashups for key 1 with 0 total slices so far 
	 Created 0 mashups for key 2 with 0 total slices so far 
	 Created 0 mashups for key 3 with 0 total slices so far 
	 Created 0 mashups for key 4 with 0 total slices so far 
	 Created 0 mashups for key 5 with 0 total slices so far 
	 Created 0 mashups for key 6 with 0 total slices so far 
	 Created 0 mashups for key 7 with 0 total slices so far 
	 Created 0 mashups for key 8 with 0 total slices so far 
	 Created 0 mashups for key 9 with 0 total slices so far 
	 Created 0 mashups for key 10 with 0 total slices so far 
	 Created 0 mashups for key 11 with 0 total slices so far 
Traceback (most recent call last):
  File "data.py", line 120, in <module>
    d = Data(sys.argv[1], 1536)
  File "data.py", line 50, in __init__
    self.load()
  File "data.py", line 108, in load
    self.x = np.array(self.x)[:, :, :, np.newaxis]
IndexError: too many indices for array
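The log lists one instrumental track, no acapella track, and 0 mashups for every key, so the list of slices is presumably empty when it reaches line 108. `np.array([])` is one-dimensional, which is why indexing it with four indices raises `IndexError`. A small guard can turn that into a clearer message. `require_slices` is a hypothetical helper, not part of data.py.

```python
def require_slices(slices, kind="mashup"):
    """Return slices unchanged, or fail early with a readable message if empty."""
    if len(slices) == 0:
        raise ValueError(
            "no %s slices were created; check that the input folder contains "
            "both acapella and instrumental tracks" % kind
        )
    return slices


# Example: an empty list fails early instead of reaching the array indexing.
try:
    require_slices([], "mashup")
except ValueError as error:
    print("Data problem:", error)
```

How data.py decides which files count as acapellas is not shown on this page, so the file naming it expects should be checked in the script itself.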

AssertionError: AbstractConv2d Theano optimization failed

I ran into the error from #1 and solved it by following the answer there, but now I get a new error:
AssertionError: AbstractConv2d Theano optimization failed: there is no implementation available supporting the requested options. Did you exclude both "conv_dnn" and "conv_gemm" from the optimizer? If on GPU, is cuDNN available and does the GPU support it? If on CPU, do you have a BLAS library installed Theano can link against?

I searched for answers on Google, but nothing helped. Does anyone know what to do?

network construction question

Hi,

I came across your post below:
http://madebyoll.in/posts/cnn_acapella_extraction/

I am wondering how you came up with the neural network below.

from keras.layers import Input, Conv2D, BatchNormalization, UpSampling2D, Concatenate

# Full resolution: single-channel spectrogram in, 64 feature maps out
mashup = Input(shape=(None, None, 1), name='input')
convA = Conv2D(64, 3, activation='relu', padding='same')(mashup)
# Strided convolution downsamples to 1/2 resolution
conv = Conv2D(64, 4, strides=2, activation='relu', padding='same', use_bias=False)(convA)
conv = BatchNormalization()(conv)

convB = Conv2D(64, 3, activation='relu', padding='same')(conv)
# Strided convolution downsamples to 1/4 resolution
conv = Conv2D(64, 4, strides=2, activation='relu', padding='same', use_bias=False)(convB)
conv = BatchNormalization()(conv)

# Lowest resolution, then upsample back to 1/2
conv = Conv2D(128, 3, activation='relu', padding='same')(conv)
conv = Conv2D(128, 3, activation='relu', padding='same', use_bias=False)(conv)
conv = BatchNormalization()(conv)
conv = UpSampling2D((2, 2))(conv)

# Skip connection from convB, then upsample back to full resolution
conv = Concatenate()([conv, convB])
conv = Conv2D(64, 3, activation='relu', padding='same')(conv)
conv = Conv2D(64, 3, activation='relu', padding='same', use_bias=False)(conv)
conv = BatchNormalization()(conv)
conv = UpSampling2D((2, 2))(conv)

# Skip connection from convA, then reduce to a single output channel
conv = Concatenate()([conv, convA])
conv = Conv2D(64, 3, activation='relu', padding='same')(conv)
conv = Conv2D(64, 3, activation='relu', padding='same')(conv)
conv = Conv2D(32, 3, activation='relu', padding='same')(conv)
conv = Conv2D(1, 3, activation='relu', padding='same')(conv)
acapella = conv

Is there any reference starting point or is there any reasoning behind this?
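The layout resembles an encoder-decoder with skip connections: two stride-2 downsampling steps, two upsampling steps, and a `Concatenate` that feeds each encoder stage back into the decoder. As a sanity check on that reading, the parameter count can be worked out by hand. The helpers `conv_params` and `bn_params` are hypothetical, and the sketch assumes Keras counts four values per channel for `BatchNormalization`.

```python
def conv_params(kernel, c_in, c_out, bias=True):
    """Weights of a square Conv2D kernel, plus one bias per output channel."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def bn_params(channels):
    """gamma, beta, moving mean and moving variance for each channel."""
    return 4 * channels


layer_params = [
    conv_params(3, 1, 64),                 # convA
    conv_params(4, 64, 64, bias=False),    # stride-2 downsample
    bn_params(64),
    conv_params(3, 64, 64),                # convB
    conv_params(4, 64, 64, bias=False),    # stride-2 downsample
    bn_params(64),
    conv_params(3, 64, 128),               # lowest resolution
    conv_params(3, 128, 128, bias=False),
    bn_params(128),
    conv_params(3, 128 + 64, 64),          # after Concatenate with convB
    conv_params(3, 64, 64, bias=False),
    bn_params(64),
    conv_params(3, 64 + 64, 64),           # after Concatenate with convA
    conv_params(3, 64, 64),
    conv_params(3, 64, 32),
    conv_params(3, 32, 1),                 # single-channel output
]

total = sum(layer_params)
print(total)  # prints 668225
```

The total agrees with the "Model has 668225 params" line printed in the logs of the other issues on this page.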

Can you leave me an email address, so we can discuss more easily? Mine is [email protected]
