
ai-research-code's People


akiohayakawa-sony, bacnguyencong-sony, fabiencardinaux, jx-huading, kazukiyoshiyama-sony, kenji-suzuki-s, krishnaw10, qiiajia, siddharthnijhawan, srinidhi-srinivasa, takuyanarihira, takuyayashima, te-basavarajmurali, te-kevingeorge, tomonobutsujikawa, yukiooobuchi



ai-research-code's Issues

[NVC-Net] About 16 kHz training and model convergence


Thank you for sharing your great work!

I'm using nvcnet to train a Japanese voice conversion model, and I have two questions.

First, I tried to adapt your code to 16 kHz wavs by making the following two changes:

  1. changed sr from 22050 to 16000
  2. changed segment_length from 32768 to 16384

The training runs, but the performance is bad even after 400 epochs.

Do you have any advice on training nvcnet on 16 kHz wavs? Do I need any other modifications for the training to go well?

Second, could you share the value of g_loss_rec when the model converges?
In my training, g_loss_rec converged to around 0.9 to 1.2, and I'm not sure whether this is what I should expect at convergence.
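
For reference, the two changes described above also shorten each training segment when measured in seconds. A minimal sketch of that arithmetic, using only the values quoted in the question; the helper name `segment_seconds` is made up for illustration, and whether the shorter segment affects quality is not established here:

```python
def segment_seconds(segment_length: int, sample_rate: int) -> float:
    """Duration in seconds of a training segment of `segment_length` samples."""
    return segment_length / sample_rate


original = segment_seconds(32768, 22050)  # default values quoted in the question
modified = segment_seconds(16384, 16000)  # 16 kHz values quoted in the question
matched = round(original * 16000)         # samples covering the original duration at 16 kHz

print(f"original: {original:.3f} s, modified: {modified:.3f} s")
print(f"samples for the original duration at 16 kHz: {matched}")
```

The sketch only compares durations; any constraint the network places on valid segment lengths would still need to be checked against the code.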

Value error in nnabla on running x-umx on Rpi4, Raspberry Pi OS

When running x-umx on an 8 GB Rpi4 with Raspberry Pi OS, I get the error below:

root@raspberrypi:/home/pi/x-umx# python3 --inputs ../Music/test_16k_S16_LE_stereo.wav --context cpu --model /home/pi/x-umx/x-umx.h5 --outdir /home/pi/x-umx/results
2021-01-30 16:37:37,007 [nnabla][INFO]: Initializing CPU extension...
Traceback (most recent call last):
File "", line 198, in
File "", line 170, in test
File "", line 84, in separate
mix_spec, msk, _ = unmix_target(audio_nn, test=True)
File "/home/pi/x-umx/", line 300, in call
lstm_out_bass = self.lstm(cross_1, nb_samples, "lstm_bass", test)
File "/home/pi/x-umx/", line 231, in lstm
bidirectional=not self.unidirectional, training=not test, dropout=0.4, name=scope_name)
File "", line 8, in lstm
File "/usr/local/lib/python3.7/dist-packages/nnabla/", line 1567, in lstm
return F.lstm(x, h, c, weight_l0=w0, weight=w, bias=b, num_layers=num_layers, dropout=dropout, bidirectional=bidirectional, training=training)
File "", line 3, in lstm
File "/usr/local/lib/python3.7/dist-packages/nnabla/", line 222, in lstm
return F.LSTM(ctx, num_layers, dropout, bidirectional, training)(*inputs, n_outputs=n_outputs, auto_forward=get_auto_forward(), outputs=outputs)
File "function.pyx", line 292, in
File "function.pyx", line 271, in nnabla.function.Function.cg_call
RuntimeError: value error in setup_impl
Failed num_outputs
== outputs.size(): inputs[0].shape[axis] must be the same number as the outputs. inputs[0].shape[axis]: 431, outputs: 2.

I have successfully manually built & installed nnabla & llvmlite.
The latter was really very difficult to build & install.
root@raspberrypi:/home/pi/x-umx# pip3 freeze | grep 'nnabla'

I think the error now comes from nnabla: the size of the input does not match the number of outputs. Can you please suggest where this needs to be fixed in the nnabla files?

Please help me.

Bad output sound quality

I noticed that the output quality is much worse than that of the input file. Maybe there is some config that cuts some frequencies from the output file?

Spleeter has a similar issue, which can be resolved with a simple config change.

Both pretrained models (openvino and default) give the same bad quality. It sounds like the high frequencies are cut off.

Thanks in advance!
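
One way to check whether high frequencies are actually missing is to compare the share of spectral energy above some cutoff in the input and in the output. A minimal sketch with NumPy, assuming mono signals; the function name, the 16 kHz cutoff, and the synthetic test tones are illustrative choices, not values from this repository:

```python
import numpy as np


def high_band_energy_ratio(signal, sample_rate, cutoff_hz):
    """Fraction of spectral energy at or above `cutoff_hz` in a mono signal."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = spectrum.sum()
    return float(spectrum[freqs >= cutoff_hz].sum() / total) if total > 0 else 0.0


# Synthetic stand-ins for real audio: a 440 Hz tone with and without an 18 kHz tone.
sr = 44100
t = np.arange(sr) / sr
full_band = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 18000 * t)
low_only = np.sin(2 * np.pi * 440 * t)

ratio_full = high_band_energy_ratio(full_band, sr, cutoff_hz=16000)
ratio_low = high_band_energy_ratio(low_only, sr, cutoff_hz=16000)
print(ratio_full, ratio_low)
```

Running the same function on the real input and on the separated output would show whether the ratio drops sharply after separation.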

[X-UMX] Bad performance when using --targets


I am doing some tests on the Google Colab you provide, and I've seen that the performance varies a lot when I use the flag --targets vocals compared with running the default test command:
!python --inputs $filename --out-dir results --model models/x-umx.h5

Why is this happening? My goal is to have audio + accompaniment

Many thanks in advance,


No pretrained models

All links for the pretrained models now return 404. Where can I get these models?

【NVC-Net】RuntimeError: target_specific error in backward_impl. Failed `status == CUDNN_STATUS_SUCCESS`: UNKNOWN

Hi, I am trying to train NVC-Net on a single GPU, but I get the following errors:

value error in query
Failed it != items_.end(): Any of [cudnn:float, cuda:float, cpu:float] could not be found in []

No communicator found. Running with a single process. If you run this with MPI processes, all processes will perform totally same.
2022-02-15 17:16:13,887 [nnabla][INFO]: Training data with 100 speakers.
2022-02-15 17:16:13,888 [nnabla][INFO]: DataSource with shuffle(True)
2022-02-15 17:16:13,934 [nnabla][INFO]: Using DataIterator
Running epoch=1 lr=0.00010
Error during backward propagation:
TanhCudaCudnn <-- ERROR
Traceback (most recent call last):
File "", line 99, in
File "", line 70, in run
Trainer(gen, gen_optim, dis, dis_optim, dataloader, rng, hp).run()
File "11_ai-research-code-master/nvcnet/", line 157, in run
File "11_ai-research-code-master/nvcnet/", line 197, in train_on_batch
File "_variable.pyx", line 826, in nnabla._variable.Variable.backward
RuntimeError: target_specific error in backward_impl

I followed the install page:, but it does not work. Could you please give some suggestions?
My environment is as follows:
CUDA 11.0, cuDNN 8.1.0, Python 3.6.8

Thank you! I look forward to your kind reply.

NVC-Net Training

Hi, thanks for releasing the code for NVC-Net. I've got two questions:

Firstly, when trying to train on multiple GPUs, I run into the following error:

Failed `it != items_.end()`: Any of [cudnn:float, cuda:float, cpu:float] could not be found in []
No communicator found. Running with a single process. If you run this with MPI processes, all processes will perform totally same.

which basically means it's only running on one GPU. In fact I get the same error simply by running the following

import nnabla.communicators as C
from nnabla.ext_utils import get_extension_context
ctx = get_extension_context("cudnn", device_id='0')

I know this is probably more of a nnabla issue but as a PyTorch user I'm not sure where to get help with nnabla.

Secondly, is it normal for the content preservation loss g_loss_con to be 0.0 for the first few epochs? I'm finding that the encoder basically encodes everything to the same vector in the hidden dimension, hence the loss is 0.0. For reference, I'm also using the VCTK dataset processed with the given script with default parameters.

Thanks a lot!
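
To check whether the encoder really maps every input to the same code, one option is to measure how much the codes vary across a batch. A minimal sketch, assuming the content codes are available as a NumPy array with the batch on the first axis; the function name and the array shapes are hypothetical and not taken from the NVC-Net code:

```python
import numpy as np


def mean_std_across_batch(codes):
    """Average, over all feature positions, of the standard deviation across the batch axis."""
    codes = np.asarray(codes, dtype=np.float64)
    flat = codes.reshape(codes.shape[0], -1)
    return float(flat.std(axis=0).mean())


rng = np.random.default_rng(0)
varied = rng.normal(size=(8, 4, 16))         # hypothetical (batch, channels, frames) codes
collapsed = np.tile(varied[:1], (8, 1, 1))   # every item identical: the suspected collapse

print(mean_std_across_batch(varied))
print(mean_std_across_batch(collapsed))
```

A value near zero on real encoder outputs for clearly different utterances would be consistent with the collapse described above.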

additional conda env dependencies needed

To get this working, I had to make the following changes to environment-gpu.yml:

name: open-unmix-nnabla-gpu

channels:
  - conda-forge

dependencies:
  - pip
  - python=3.6
  - numpy=1.16
  - scikit-learn=0.21
  - tqdm=4.28
  - cudatoolkit=10.0
  - cudnn
  - ffmpeg
  - pip:
    - soundfile
    - musdb
    - norbert
    - resampy
    - nnabla
    - nnabla-ext-cuda100
    - pydub

test wavfile using cpu gives Segmentation fault


I am using a pre-trained model in a CPU environment with the --context cpu option. My mixed wave file is 2 minutes long, and my system has 16 GB of RAM and an octa-core CPU.

I am trying to run the test command for the pre-trained model and it gives a segmentation fault. Please check the command and logs below.
command: python --input inputs/dm_mixed_vocal_and music_0002.wav --context cpu --model model/x-umx.h5 --outdir outputs/
2021-01-08 02:02:41,221 [nnabla][INFO]: Initializing CPU extension...
Segmentation fault (core dumped)

I don't know why this causes a segmentation fault. Also, my system RAM is not full at the time of the segfault.
Please suggest some ideas to resolve this issue.

Memory allocation failed

I tried to train with 2 GPUs using Docker, but after one epoch, memory allocation errors occur. I am not sure what to check or what could be wrong.

While running the D3Net music separation Jupyter notebook in Colab, the error "AttributeError: module 'pynvml.nvml' has no attribute 'nvml_lib'" occurs

2021-06-07 12:26:53,837 [nnabla][INFO]: Initializing CPU extension...
Traceback (most recent call last):
File "", line 96, in
File "", line 27, in run_separation
ctx = get_extension_context(args.context)
File "/usr/local/lib/python3.7/dist-packages/nnabla/", line 97, in get_extension_context
mod = import_extension_module(ext_name)
File "/usr/local/lib/python3.7/dist-packages/nnabla/", line 46, in import_extension_module
return importlib.import_module('.' + ext_name, 'nnabla_ext')
File "/usr/lib/python3.7/importlib/", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1006, in _gcd_import
File "", line 983, in _find_and_load
File "", line 967, in _find_and_load_unlocked
File "", line 677, in _load_unlocked
File "", line 728, in exec_module
File "", line 219, in _call_with_frames_removed
File "/usr/local/lib/python3.7/dist-packages/nnabla_ext/cudnn/", line 18, in
import nnabla_ext.cuda
File "/usr/local/lib/python3.7/dist-packages/nnabla_ext/cuda/", line 114, in
File "/usr/local/lib/python3.7/dist-packages/nnabla_ext/cuda/", line 71, in check_gpu_compatibility
from nnabla.utils.nvml import pynvml
File "/usr/local/lib/python3.7/dist-packages/nnabla/utils/", line 39, in
File "/usr/local/lib/python3.7/dist-packages/nnabla/utils/", line 26, in load_nvml_for_win
if not (nvml.nvml_lib == None and sys.platform[:3] == "win"):
AttributeError: module 'pynvml.nvml' has no attribute 'nvml_lib'

【NVC-Net】ImportError: cannot open shared object file: No such file or directory


The GPU details are:
GPU: Tesla P100-PCIE-16gb
driver version: 450.119.04
CUDA version: 11.0

I'm using Kaggle to try to run this and keep running into this error. I tried running the Dockerfile, but that did not run successfully either. I've tried many methods found online to resolve this, but unfortunately none seem to work. Could anyone help with this issue?

Adding additional speakers - transfer learning

Has anyone figured out a way to use this algorithm to do transfer learning?

Say I train with 100 speakers and want to train the model with an additional 20 speakers. It appears that you have to retrain from the start rather than adding a set of 20 new latent spaces and training this new data.

Anyone tried this? Would be great to be able to transfer what's been learned, but tough on GANs.


X-UMX - Separating using --context cpu takes very long

Thank you so much for open-sourcing X-UMX!

Is it in a usable state right now?

I followed your instructions to perform source separation with the model as listed here.

I ran with the flag --context cpu and separation is taking a very long time: it took over 15 minutes to separate a 3-minute track. I have 16 GB of RAM. Is CPU-based separation supposed to take this long?

Update: It ended up exhausting all my memory and crashed.
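
One generic way to bound memory use on long tracks is to process the audio in fixed-length chunks and rejoin the results. A minimal sketch, assuming the audio is a NumPy array with samples on the first axis; `process_in_chunks` and `separate_fn` are hypothetical names, not functions from this repository, and the dummy separator only stands in for the real model. Hard chunk boundaries can cause audible seams, which this sketch does not address:

```python
import numpy as np


def process_in_chunks(audio, chunk_len, separate_fn):
    """Apply `separate_fn` to consecutive chunks of `audio` (samples on axis 0) and rejoin them."""
    outputs = []
    for start in range(0, len(audio), chunk_len):
        outputs.append(separate_fn(audio[start:start + chunk_len]))
    return np.concatenate(outputs, axis=0)


def dummy_separate(chunk):
    """Hypothetical stand-in for the real model: halves the amplitude."""
    return 0.5 * chunk


audio = np.arange(10, dtype=np.float32).reshape(5, 2)  # 5 samples, 2 channels
result = process_in_chunks(audio, chunk_len=2, separate_fn=dummy_separate)
print(result.shape)
```

Peak memory then depends on the chunk length rather than on the full track length.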

X-UMX gets stuck when training

Hello, I am trying to train a model using X-UMX on a single GPU. I am using the 7-second preview version of musdb just for testing.

After "Compute dataset statistics" reaches 100%, the next progress bar is stuck at 0%.

This is everything I did:

cd /home/ubuntu/Downloads/ai-research-code/x-umx
ubuntu@:/Downloads/ai-research-code/x-umx$ conda activate open-unmix-nnabla-gpu
(open-unmix-nnabla-gpu) ubuntu@-:~/Downloads/ai-research-code/x-umx$ python --output /home/ubuntu/Downloads/ai-research-code/x-umx/weights
2021-07-27 23:18:56,894 [nnabla][INFO]: Initializing CPU extension...
/home/ubuntu/anaconda3/envs/open-unmix-nnabla-gpu/lib/python3.6/importlib/ RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)
/home/ubuntu/anaconda3/envs/open-unmix-nnabla-gpu/lib/python3.6/importlib/ RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)
2021-07-27 23:18:57,502 [nnabla][INFO]: Initializing CUDA extension...
2021-07-27 23:18:57,559 [nnabla][INFO]: Initializing cuDNN extension...
value error in query
Failed it != items_.end(): Any of [cudnn:float, cuda:float, cpu:float] could not be found in []

No communicator found. Running with a single process. If you run this with MPI processes, all processes will perform totally same.
2021-07-27 23:18:57,560 [nnabla][INFO]: [Communicator] Using gpu_id = 0 as rank = 0
Mixing coef. is 10.0, i.e., MDL = 10.0*TD-Loss + FD-Loss
2021-07-27 23:18:57,561 [nnabla][INFO]: DataSource with shuffle(True)
2021-07-27 23:18:59,025 [nnabla][INFO]: DataSource with shuffle(False)
2021-07-27 23:18:59,289 [nnabla][INFO]: Using DataIterator
2021-07-27 23:18:59,289 [nnabla][INFO]: Using DataIterator
max_iter 320
Compute dataset statistics: 100%|███████████████| 80/80 [00:12<00:00, 8.11it/s]
0%| | 0/1000 [00:00<?, ?it/s]

The GPU memory and power are being used, but nothing seems to be happening. Can anyone please help me?

Large delay during inference

I'm running the nvcnet model after training, and there is a large time delay during inference. The first inference is always the slowest, at around 3 seconds. All others have delays of about 1.2 seconds. The length of the input wav file doesn't change the delay.

After looking deeper, it appears the delay is caused by the model construction. Is there a way to create the model object once and then just run inference?

This delay contradicts the numbers posted in the paper, as you won't get the published fast inference times with these delays.

Any thoughts? I am still in the process of learning nnabla so maybe I am missing something about the library.

Best regards,
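
The "create the model object once" idea can be sketched independently of nnabla by caching the constructed model and reusing it for every call. This is a minimal, framework-agnostic sketch: `get_model` and `convert` are hypothetical names, the sleep only simulates expensive construction, and the returned callable is a placeholder for real inference. Whether the NVC-Net graph can be reused for inputs of different lengths is not covered here:

```python
import functools
import time


@functools.lru_cache(maxsize=None)
def get_model(checkpoint_path):
    """Build the model once per checkpoint path; later calls return the cached object."""
    time.sleep(0.05)  # placeholder for expensive model construction
    return lambda samples: [0.5 * s for s in samples]  # hypothetical inference callable


def convert(samples, checkpoint_path="model.h5"):
    """Run inference, paying the construction cost only on the first call."""
    model = get_model(checkpoint_path)
    return model(samples)


first = convert([1.0, 2.0])   # builds the model, then runs it
second = convert([3.0])       # reuses the cached model
print(get_model.cache_info())
```

With this pattern, only the first request includes the construction time; later requests measure inference alone.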

D3Net: inference on CPU

Separation of one source (vocals) from a 3-minute track using D3Net takes ~2.5 hours on a machine with 4 cores. Is there a way to speed up inference on CPU?

[Quantized Depth Completion] Questions about implementation details

First of all, thanks for the great work. However, the source code is still missing.
Could you share the training/evaluation code and pretrained weights for this work?

Also, I'm trying to reimplement it in PyTorch and I have some questions about the paper:

  1. How is the surface normal computed from the ground-truth depth in NYU Depth v2? The paper only shows the approximation used for training, not the accurate one.
  2. The dot pattern used to produce sparse depth in NYU Depth v2 is unknown. Can you share an example to reproduce it?
  3. The kernel_size of MaxPooling2D is missing.
  4. The kernel_size and filter_size of Conv2d in the Upsampling layer are missing.

Thanks! Look forward to your kind reply.
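
Regarding question 1, one common generic approximation (not necessarily what the paper uses) estimates normals from finite differences of the depth map in image space. A minimal sketch with NumPy, assuming a dense depth map and ignoring camera intrinsics; the function name is hypothetical:

```python
import numpy as np


def normals_from_depth(depth):
    """Per-pixel unit normals from a depth map using finite differences in image space."""
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(dz_dx)))
    return normals / np.linalg.norm(normals, axis=2, keepdims=True)


flat = np.full((4, 5), 2.0)                         # constant depth: surface faces the camera
ramp = np.tile(np.arange(5, dtype=float), (4, 1))   # depth grows by 1 per pixel along x

print(normals_from_depth(flat)[0, 0])
print(normals_from_depth(ramp)[0, 0])
```

A version that accounts for the camera model would first back-project pixels to 3-D points using the intrinsics, which this sketch leaves out.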

outputs not saved

The output wav files are not saved anywhere on disk. I tried multiple combinations (without the output argument, with the output argument, etc.), and none seem to work.

Segmentation fault and RuntimeError: value error in setup_impl

I am trying to train the X-UMX model on Google Colab and am stuck on this error. Kindly help.

2022-07-02 10:30:45,285 [nnabla][INFO]: Initializing CPU extension...
2022-07-02 10:30:45,626 [root][INFO]: Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
2022-07-02 10:30:45,643 [root][INFO]: Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
/usr/lib/python3.7/importlib/ RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
  return f(*args, **kwds)
/usr/lib/python3.7/importlib/ RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
  return f(*args, **kwds)
2022-07-02 10:30:46,093 [nnabla][INFO]: Initializing CUDA extension...
2022-07-02 10:30:46,111 [nnabla][INFO]: Initializing cuDNN extension...
2022-07-02 10:30:46,558 [nnabla][INFO]: [Communicator] Using gpu_id = 0 as rank = 0
2022-07-02 10:30:46,610 [nnabla][INFO]: DataSource with shuffle(True)
Finished loading dataset with 86 tracks.
2022-07-02 10:30:52,995 [nnabla][INFO]: DataSource with shuffle(False)
Finished loading dataset with 14 tracks.
2022-07-02 10:30:54,130 [nnabla][INFO]: Using DataIterator
2022-07-02 10:30:54,131 [nnabla][INFO]: Using DataIterator
Compute dataset statistics: 100% 86/86 [01:28<00:00,  1.03s/it]
Traceback (most recent call last):
  File "", line 207, in <module>
  File "", line 113, in train
    model = get_model(args, scaler_mean, scaler_std, max_bin=max_bin)
  File "/content/ai-research-code/x-umx/", line 431, in get_model
    mix_spec, m_hat, pred = unmix(mixture_audio)
  File "/content/ai-research-code/x-umx/", line 327, in __call__
    self.n_fft, window_type='hanning', center=True)
  File "/usr/local/lib/python3.7/dist-packages/nnabla/", line 1101, in istft
    return istft_base(y_r, y_i, window_size, stride, fft_size, window_type, center, pad_mode, as_stft_backward)
  File "<istft>", line 3, in istft
  File "/usr/local/lib/python3.7/dist-packages/nnabla/", line 4926, in istft
    return F.ISTFT(ctx, window_size, stride, fft_size, window_type, center, pad_mode, as_stft_backward)(y_r, y_i, n_outputs=n_outputs, auto_forward=get_auto_forward(), outputs=outputs)
  File "function.pyx", line 328, in nnabla.function.Function.__call__
  File "function.pyx", line 306, in nnabla.function.Function._cg_call
RuntimeError: value error in setup_impl
Failed `this->pad_mode_ == "constant"`: `pad_mode` should be "constant" for the normal use of ISTFT (`as_stft_backward == false`) since `pad_mode` is ignored and makes no effects in that case.

[b506019edf61:01753] *** Process received signal ***
[b506019edf61:01753] Signal: Segmentation fault (11)
[b506019edf61:01753] Signal code: Address not mapped (1)
[b506019edf61:01753] Failing at address: 0x7f0c3314f20d
[b506019edf61:01753] [ 0] /lib/x86_64-linux-gnu/[0x7f0c35bf4980]
[b506019edf61:01753] [ 1] /lib/x86_64-linux-gnu/[0x7f0c35833775]
[b506019edf61:01753] [ 2] /usr/lib/x86_64-linux-gnu/[0x7f0c3609ee44]
[b506019edf61:01753] [ 3] /lib/x86_64-linux-gnu/[0x7f0c35834605]
[b506019edf61:01753] [ 4] /usr/lib/x86_64-linux-gnu/[0x7f0c3609ccb3]
[b506019edf61:01753] *** End of error message ***


Steps followed:
!pip install musdb norbert pydub
!pip install nnabla
!pip install nnabla-ext-cuda110-nccl2-mpi3-1-6
!pip uninstall urllib3 -y
!pip uninstall folium -y
!pip install folium==0.2.1
!pip install urllib3==1.25.*

!git clone
%cd ai-research-code/x-umx
!mkdir models
!wget -P models

!python --root /content/drive/MyDrive/dataset --output /content/drive/MyDrive/crossnet/ --is-wav --epochs 10 --lr 0.001

Chinese supported?

Thank you for open-sourcing the code. I'd like to know whether Mandarin Chinese is supported in this project.

【NVC-Net】How many epochs does the model need to converge?

e.g. for the VCTK dataset

Besides, have you tested whether the model is robust with noisy source files (e.g. recorded by mobile phone, with background of air conditioning, or heavy breathing, which is quite common in real life application) at inference time?

Thank you very much

【NVC-net】Failed `it != items_.end()`: Any of [cudnn:float, cuda:float, cpu:float] could not be found in []

Hi. I tried to train NVC-net, but the following error occurs:

2021-11-26 03:23:06,638 [nnabla][INFO]: Initializing CPU extension...
2021-11-26 03:23:06,997 [nnabla][INFO]: Initializing CUDA extension...
2021-11-26 03:23:09,117 [nnabla][INFO]: Initializing cuDNN extension...
value error in query
Failed  it != items_.end() : Any of [cudnn:float, cuda:float, cpu:float] could not be found in []

No communicator found. Running with a single process. If you run this with MPI processes, all processes will perform totally same.
2021-11-26 03:23:09,406 [nnabla][INFO]: Training data with 103 speakers.
2021-11-26 03:23:09,407 [nnabla][INFO]: DataSource with shuffle(True)
2021-11-26 03:23:09,464 [nnabla][INFO]: Using DataIterator
Running epoch=1 lr=0.00010
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate again.
Error during forward propagation:

Environment: Tesla T4, Cuda 10.2, Cudnn 8.1, Ubuntu 18.04.4 LTS.

I installed nnabla with pip install nnabla-ext-cuda102.
Besides, if I want to train the model with only one GPU, is python3 the right command?

NVCnet g_loss_con=0.0000 while training

I ran into two problems.
The first one is the same as the issue"" mentioned before.
I tried the Docker environment suggested in that issue:

docker pull nnabla/nnabla-ext-cuda-multi-gpu:py37-cuda110-mpi3.1.6-v1.29.0
docker run --rm -it -u $(id -u):$(id -g) --gpus all nnabla/nnabla-ext-cuda-multi-gpu:py37-cuda110-mpi3.1.6-v1.29.0

mpirun -n 2 python3 -c "import nnabla_ext.cudnn; from nnabla.ext_utils import get_extension_context; import nnabla.communicators as C; ctx = get_extension_context('cudnn', device_id='0'); C.MultiProcessDataParallelCommunicator(ctx)"

and it went well. But then I tried to run training with the batch size set to 8, and got an error like this:

wzy@2f0a2b4b4485:~/NVCnet$ mpirun -n 1 python -c cudnn -d 7 --output_path log/baseline-wzy/ --batch_size 8
2022-11-14 04:17:10,938 [nnabla][INFO]: Initializing CPU extension...
2022-11-14 04:17:11,300 [nnabla][INFO]: Initializing CUDA extension...
2022-11-14 04:17:20,009 [nnabla][INFO]: Initializing cuDNN extension...
2022-11-14 04:17:20,359 [nnabla][INFO]: Training data with 103 speakers.
2022-11-14 04:17:20,360 [nnabla][INFO]: DataSource with shuffle(True)
2022-11-14 04:17:20,371 [nnabla][INFO]: Using DataIterator
Running epoch=1 lr=0.00010
[ 0/4689] d_loss 4.1589 (4.1589) g_loss_avd 2.0793 (2.0793) g_loss_con 0.0000 (0.0000) g_loss_rec 58.3829 (58.3829) g_loss_kld 0.0000 (0.0000)
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate. Freeing memory cache and retrying.
Failed to allocate again.
Error during backward propagation:
SliceCuda <-- ERROR
Traceback (most recent call last):
File "", line 100, in
File "", line 70, in run
Trainer(gen, gen_optim, dis, dis_optim, dataloader, rng, hp).run()
File "/home/wzy/NVCnet/", line 156, in run
File "/home/wzy/NVCnet/", line 196, in train_on_batch
File "_variable.pyx", line 827, in nnabla._variable.Variable.backward
RuntimeError: memory error in alloc
Failed this->alloc_impl(): N4nbla10CudaMemoryE allocation failed.


When I changed the batch size to 7, 6, or 2, g_loss_con was 0 all the time.

This is the result of nvidia-smi

Please enlighten me.

Potential bug in xumx

I'm trying to run xumx (through, but it seems to be the same code).

In this line:

audio_nn = nn.Variable.from_numpy_array(audio.T[None, ...])

The function documentation says audio should be in the shape:

    audio: np.ndarray [shape=(nb_samples, nb_channels, nb_timesteps)]
        mixture audio

However, the ndarray is converted to a nnabla object with a transpose operator and a new dimension added:

    audio_nn = nn.Variable.from_numpy_array(audio.T[None, ...])

So when I pass audio like so:

x.shape: (1, 2, 9265664)

the audio_nn line results in this:

audio_nn: (1, 9265664, 2, 1)

Later on this fails in the STFT step from the __call__ function of the model:

nb_samples, nb_channels, _ = x.shape

It's better to have:

    audio_nn = nn.Variable.from_numpy_array(audio)
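
The shape behaviour described above can be reproduced with NumPy alone. A minimal sketch, using a tiny array in place of the real audio; the reading that the existing line expects a 2-D (nb_timesteps, nb_channels) input is an inference from the shapes, not something confirmed by the repository:

```python
import numpy as np

n = 8                              # tiny stand-in for 9265664 timesteps
batched = np.zeros((1, 2, n))      # (nb_samples, nb_channels, nb_timesteps), as in the docstring
unbatched = np.zeros((n, 2))       # (nb_timesteps, nb_channels), as a WAV reader typically returns

shape_from_batched = batched.T[None, ...].shape      # transpose reverses all axes, then adds one
shape_from_unbatched = unbatched.T[None, ...].shape

print(shape_from_batched)     # (1, 8, 2, 1)
print(shape_from_unbatched)   # (1, 2, 8)
```

So the transpose line yields a 3-D variable only for 2-D input, while 3-D input in the documented layout ends up 4-D, which matches the failure at the unpacking step.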

This gives the errors below

C:\Users\Vinicius111\Downloads\xumx-master\xumx-master\xumx>python --input inputs/"C:\Users\Vinicius111\Music\Test.mp3"python ----model model/C:\Users\Vinicius111\Downloads\x-umx.h5 --outdir outputs/
2021-01-30 12:04:32,295 [nnabla][INFO]: Initializing CPU extension...
Traceback (most recent call last):
File "", line 28, in
from .args import get_inference_args
ModuleNotFoundError: No module named 'main.args'; 'main' is not a package
