eloimoliner / denoising-historical-recordings
A two-stage U-Net for high-fidelity denoising of historical recordings
License: MIT License
It looks nice, I am trying to install with pip:
pip install hydra-core tensorflow[and-cuda] soundfile tqdm scipy
python inference.py inference.audio="r:\9.mp3"
It runs on the CPU (slowly), although I have a GPU and tensorflow[and-cuda] is installed. What am I missing?
Or, is the GPU stuff training-only?
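A quick way to narrow this down is to ask TensorFlow which devices it can actually see. The sketch below is not part of the repository, and `gpu_report` is a hypothetical helper name; it only uses `tf.config.list_physical_devices` and `tf.test.is_built_with_cuda`. One thing worth checking: the path `r:\9.mp3` suggests native Windows, and TensorFlow releases after 2.10 no longer support the GPU there, in which case TensorFlow would fall back to the CPU without an error.

```python
def gpu_report():
    """Return a small dict describing what TensorFlow can see on this machine."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter at all.
        return {"tensorflow": None, "built_with_cuda": False, "gpus": []}
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [gpu.name for gpu in gpus],
    }


if __name__ == "__main__":
    print(gpu_report())
```

An empty `gpus` list means the problem is in the TensorFlow/CUDA installation rather than in `inference.py`.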
Hello,
The readme says:
"You will need at least python 3.7 and CUDA 10.1 if you want to use GPU."
Unfortunately, my first attempt to run it on Windows without a CUDA-capable GPU failed.
Is there really no separate environment file for CPU-only use?
Is it possible to make it work without massive changes to the code?
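One general TensorFlow option, which the readme does not document, is to hide all GPUs before TensorFlow is imported, so that every operation is placed on the CPU. This is a minimal sketch assuming the plain `tensorflow` pip package is installed; `force_cpu` is a hypothetical helper name, and whether the rest of the repository then runs unchanged is untested here.

```python
import os


def force_cpu():
    """Hide all CUDA devices from libraries that are imported afterwards."""
    # Must run before `import tensorflow`; "-1" matches no GPU index.
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


force_cpu()
# import tensorflow as tf  # imported after this point, TensorFlow sees no GPU
```

Setting the variable in the shell before launching `python inference.py ...` has the same effect.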
Hi,
when running
python -m pip install tensorflow==2.3.0 as indicated in your requirements file, I get
ERROR: Could not find a version that satisfies the requirement tensorflow==2.3.0 (from versions: 2.5.0rc0, 2.5.0rc1, 2.5.0rc2, 2.5.0rc3, 2.5.0, 2.5.1, 2.5.2, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.8.0rc0)
ERROR: No matching distribution found for tensorflow==2.3.0
It seems this version isn't even supported by pip anymore.
Upgrade to 2.5.0?
The same is true for scipy==1.4.1. Not sure about which version to take there.
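A likely explanation for the error above: pip only lists releases that have wheels for the interpreter it runs under, and TensorFlow 2.3.0 was published for Python 3.5 to 3.8. A list that starts at 2.5.0rc0 is what a newer Python would see. The sketch below just checks the running interpreter against that range; `can_install_tf_2_3` is a hypothetical helper, and the range is stated as an assumption in the code.

```python
import sys

# Assumption: TensorFlow 2.3.0 wheels cover CPython 3.5 through 3.8.
TF_2_3_PYTHONS = ((3, 5), (3, 8))


def can_install_tf_2_3(version=None):
    """Return True if the given (or running) Python is in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    lowest, highest = TF_2_3_PYTHONS
    return lowest <= major_minor <= highest


if __name__ == "__main__":
    print(sys.version_info[:2], can_install_tf_2_3())
```

If this prints `False`, the choice is between creating an environment with an older Python (the readme mentions 3.7) or moving to a newer TensorFlow and accepting possible incompatibilities.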
I would like to do a denoising task where I have clean signals (in the "clean" folder) and noisy signals (in the "noise" folder).
Hi,
could you leave some hints about how to install this without conda?
Your readme appears to be very specific to this one case.
Also, it seems that you develop under Linux, so you use bash to execute. A hint for Windows users would be helpful here too.
I am just trying to get this to run under Windows and so far have had no success. I will update if I get further.
All the best!
load_weights no longer works. It says that it could not load the model weights. Is it possible to release a checkpoint compatible with the newer versions of TensorFlow that come with Colab?
I am working in Colab. I tried uploading an MP3 file and a WAV file. For the MP3 file I got:
LibsndfileError Traceback (most recent call last)
in <cell line: 4>()
5 print('Denoising uploaded file "{name}"'.format(
6 name=fn))
----> 7 denoise_data=denoise_audio(fn)
8 basename=os.path.splitext(fn)[0]
9 wav_output_name=basename+"_denoised"+".wav"
3 frames
in denoise_audio(audio)
69 def denoise_audio(audio):
70
---> 71 data, samplerate = sf.read(audio)
72 print(data.dtype)
73 #Stereo to mono
/usr/local/lib/python3.10/dist-packages/soundfile.py in read(file, frames, start, stop, dtype, always_2d, fill_value, out, samplerate, channels, format, subtype, endian, closefd)
283
284 """
--> 285 with SoundFile(file, 'r', samplerate, channels,
286 subtype, endian, format, closefd) as f:
287 frames = f._prepare_read(start, stop, frames)
/usr/local/lib/python3.10/dist-packages/soundfile.py in __init__(self, file, mode, samplerate, channels, subtype, endian, format, closefd)
656 self._info = _create_info_struct(file, mode, samplerate, channels,
657 format, subtype, endian)
--> 658 self._file = self._open(file, mode_int, closefd)
659 if set(mode).issuperset('r+') and self.seekable():
660 # Move write position to 0 (like in Python file objects)
/usr/local/lib/python3.10/dist-packages/soundfile.py in _open(self, file, mode_int, closefd)
1214 # get the actual error code
1215 err = _snd.sf_error(file_ptr)
-> 1216 raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
1217 if mode_int == _snd.SFM_WRITE:
For the WAV file I got another error, related to the file not being opened.
I am just running the cells as they are.
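Since the traceback ends inside `sf.read`, it may help to rule out the basic causes before the denoiser is involved: the uploaded file missing or empty, or a libsndfile build that cannot decode the format (MP3 support only arrived in libsndfile 1.1.0). The sketch below is a hypothetical helper, not code from the notebook; it uses `soundfile.available_formats()` and treats the file extension as a rough proxy for the format.

```python
import os


def diagnose_audio_file(path):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if not os.path.isfile(path):
        problems.append("file not found: " + path)
        return problems
    if os.path.getsize(path) == 0:
        problems.append("file is empty (the upload may have failed)")
    try:
        import soundfile as sf
    except ImportError:
        problems.append("soundfile is not installed")
        return problems
    # Approximation: compare the extension with the formats libsndfile reports.
    ext = os.path.splitext(path)[1].lstrip(".").upper()
    if ext and ext not in sf.available_formats():
        problems.append("this libsndfile build does not list " + ext)
    return problems
```

Calling `diagnose_audio_file(fn)` right before `denoise_audio(fn)` in the Colab cell would show whether the error comes from the file itself or from the audio backend.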
Hello, I'm not quite sure on the resource requirements for this, but training seems to be completely unfeasible on lower to lower-mid-end systems. With the exception of Google Colab, all the other setups I've tried to train on seem to run out of VRAM. Colab Free seems to run out of physical RAM, on the other hand.
The environment is the exact same as the one in environment.yml. Noise data is the same as the one cited in the paper. Clean music data is a custom dataset, changing which did not seem to have any effect. The metadata for the noise dataset seemed to have some discrepancies, but those are a non-issue since it's a trivial cleanup.
The GPU memory usage seems to shoot up sharply at a point regardless of the hyperparameters and the size of the training data.
Tested on:
Setup 1 (Local):
Setup 2 (Azure ML Compute):
Setup 3 (Google Colab Free):
Things that have been tried:
with tf.device("CPU") (This seems to keep the GPU usage low up to a certain point, but fails ultimately)
Things that work:
Traceback:
2023-04-23 05:53:46.684363: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at mirror_pad_op.cc:120 : Resource exhausted: OOM when allocating tensor with shape[2,96,429,1027] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Training epoch 0: 0%| | 0/2000 [03:57<?, ?it/s]
[2023-04-23 05:53:46,687][__main__][ERROR] - Some error happened
Traceback (most recent call last):
File "train.py", line 159, in main
_main(args)
File "train.py", line 153, in _main
run(args)
File "train.py", line 113, in run
step_loss=trainer.distributed_training_step(iterator.get_next())
File "/home/REDACTED/anaconda3/envs/historical_denoiser/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "/home/REDACTED/anaconda3/envs/historical_denoiser/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
return self._stateless_fn(*args, **kwds)
File "/home/REDACTED/anaconda3/envs/historical_denoiser/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/home/REDACTED/anaconda3/envs/historical_denoiser/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
cancellation_manager=cancellation_manager)
File "/home/REDACTED/anaconda3/envs/historical_denoiser/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/home/REDACTED/anaconda3/envs/historical_denoiser/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 550, in call
ctx=ctx)
File "/home/REDACTED/anaconda3/envs/historical_denoiser/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[2,96,429,1027] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node functional_1/multi_stage_denoise/StatefulPartitionedCall/decoder/d__block/i__block_7/dense_block_7/MirrorPad_1}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[Op:__inference_distributed_training_step_25278]
Function call stack:
distributed_training_step
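To put the failing allocation in perspective, the size of the tensor named in the error can be worked out from its shape. The arithmetic below is a plain-Python sketch (`tensor_bytes` is a hypothetical helper); it assumes "type float" means 32-bit floats, i.e. 4 bytes per element.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Size of a dense tensor in bytes; float32 uses 4 bytes per element."""
    return prod(shape) * bytes_per_element


size = tensor_bytes([2, 96, 429, 1027])
print(f"{size} bytes = {size / 2**20:.1f} MiB")
# prints: 338367744 bytes = 322.7 MiB
```

That is roughly 323 MiB for a single intermediate tensor, and training has to keep many such activations alive for the backward pass. If the leading 2 is the batch size (an assumption, the trace does not say), halving it or shortening the training segments should shrink each of these tensors proportionally.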
I have a question about this work that I hope you can answer: the proposed method does not seem to be specialized for music denoising, as it has no characteristics unique to music recordings.
Thanks!