yujia-yan / transkun
A simple yet effective Audio-to-Midi Automatic Piano Transcription system
License: MIT License
I encountered a problem while running the project. I am not sure how to set runSeed in the train function. I hope to receive help. Thank you!
def train(workerId, nWorker, filename, runSeed, args):
if num_processes == 1: train(0, 1, saved_filename)
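A traceback quoted further down this page shows the single-process call written as `train(0, 1, saved_filename, int(time.time()), args)`, which suggests that `runSeed` is an integer seed and that the call above is missing its last two arguments. The following is a minimal sketch under that assumption; the body of `train`, the `args` placeholder, and the checkpoint path are hypothetical stand-ins, not the project's implementation.

```python
import random
import time
from types import SimpleNamespace


def train(workerId, nWorker, filename, runSeed, args):
    # Hypothetical stand-in body: seed a per-worker RNG from runSeed.
    rng = random.Random(runSeed + workerId)
    return rng.random()


args = SimpleNamespace()             # placeholder for the parsed CLI arguments
saved_filename = "checkpoint/model"  # placeholder path
num_processes = 1

if num_processes == 1:
    # A fixed integer makes runs repeatable, while int(time.time())
    # gives a different seed on every launch.
    first = train(0, 1, saved_filename, 1234, args)
    second = train(0, 1, saved_filename, 1234, args)
    fresh = train(0, 1, saved_filename, int(time.time()), args)
```

With a fixed seed the two calls return the same value, which is the usual reason to pass a constant instead of the current time.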
Hello, thank you for open-sourcing your marvelous work!
I am curious about the hardware (how many GPUs) and the time it takes to train the model.
Hi,
I get the following error on Windows 10 when installing the transkun package:
Collecting transkun
Downloading transkun-0.1.2a-py3-none-any.whl (36.7 MB)
---------------------------------------- 36.7/36.7 MB 40.9 MB/s eta 0:00:00
Discarding https://files.pythonhosted.org/packages/2a/af/db364b427a7a8acc2b7a51d0f1ef9a2ebff203add5d3a971472cde6897f7/transkun-0.1.2a-py3-none-any.whl (from https://pypi.org/simple/transkun/) (requires-python:>=3.6): Requested transkun from https://files.pythonhosted.org/packages/2a/af/db364b427a7a8acc2b7a51d0f1ef9a2ebff203add5d3a971472cde6897f7/transkun-0.1.2a-py3-none-any.whl has inconsistent version: filename has '0.1.2a0', but metadata has '0.1.2'
ERROR: Could not find a version that satisfies the requirement transkun (from versions: 0.1.2a0)
ERROR: No matching distribution found for transkun
I get the following error on macOS 12.2.1 when running the transkun package after installing it:
mashengtailang@mashengtailangdeMacBook-Pro ~ % transkun input.mp3 output.mid
zsh: command not found: transkun
Is there any way to filter noise?
For example, some low-probability events.
Hello, I have a question about the model size.
The pretrained model you gave is 40MB (https://github.com/Yujia-Yan/Skipping-The-Frame-Level/tree/main/transkun/pretrained)
But when I run the training script using this config, I got a checkpoint which is 170MB.
Hi,
Thanks for sharing your work.
Can you share your trained model checkpoint? It seems that without training, I can't reproduce your results.
Hello, thank you for the valuable code sharing!
I have several questions about the code.
The default parameter for training is different from the pre-trained model in the repo.
The default setting has 229 mel bins (the same as in the paper), but the pre-trained model has 300 mel bins. The f_min and f_max values are also different, and I found that the pre-trained model has one more conv layer in the PreConvSpec. Do these changes have a meaningful effect on the performance?
Also, when I tried training (once with the default parameters, and once with the pre-trained model's parameters), both cases show much lower performance than the pre-trained model (0.7403 for valid F1) and the score reported in the paper. I think the only difference is the batch size, which is 12 in the paper and 2 in the default parameters. Have you ever trained the model with batch size 2, or with the default parameters in this repo?
Again, thank you very much for sharing your code!
Hi, I was looking for transcription software and found your page on GitHub. I installed it and ran:
transkun mp3_file.mp3 midi_file.mid
but the only result is the text "Killed" on the screen. Any suggestions for finding the issue?
Thank you!
Hi, and thanks for releasing the code. The title speaks for itself. I'd like to know if there is a demonstration of a piece of music being transcribed by the algorithm.
Can you tell me the version? Thanks a lot.
I want to transcribe the voice.
The output MIDI files often simulate the reverb in the audio recording by replaying multiple notes rapidly.
Are the developers familiar with this?
Any fixes?
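One possible workaround, independent of the model, is to post-process the note list and merge rapid re-triggers of the same pitch. The sketch below is only an illustration under stated assumptions, not a Transkun feature: the `Note` tuple, the `merge_retriggers` helper, and the 50 ms gap threshold are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Note(NamedTuple):
    pitch: int
    start: float  # seconds
    end: float    # seconds
    velocity: int


def merge_retriggers(notes: List[Note], max_gap: float = 0.05) -> List[Note]:
    """Merge a note into the previous note of the same pitch when it
    starts within max_gap seconds of that note's end."""
    merged: List[Note] = []
    last_index: Dict[int, int] = {}  # pitch -> index of its latest note in merged
    for note in sorted(notes, key=lambda n: (n.start, n.pitch)):
        idx = last_index.get(note.pitch)
        if idx is not None and note.start - merged[idx].end <= max_gap:
            prev = merged[idx]
            # Keep the first onset and velocity, extend the release.
            merged[idx] = prev._replace(end=max(prev.end, note.end))
        else:
            last_index[note.pitch] = len(merged)
            merged.append(note)
    return merged


notes = [
    Note(60, 0.00, 0.30, 80),
    Note(60, 0.32, 0.50, 40),  # re-trigger shortly after the previous release
    Note(64, 0.10, 0.40, 70),
    Note(60, 1.00, 1.20, 75),  # far enough away to stay separate
]
cleaned = merge_retriggers(notes)
```

The threshold trades off removing echo-like repeats against erasing genuinely repeated notes, so it would need tuning per recording.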
Hi there. I'm reaching out to report an issue with the latest version of Transkun. Although I'm not an expert in machine learning, I've noticed that when using Transkun to transcribe audio into MIDI files, the notes appear too close together. This results in a short and abrupt sound, regardless of the input audio or MIDI synthesizer used.
As a user, I rely on Transkun to generate accurate and usable MIDI files, so I wanted to bring this problem to your attention. I greatly appreciate your efforts in developing Transkun, and I kindly request your assistance in resolving this note timing issue.
Please let me know if you need any further information from me to address this matter. Thank you for your attention, and I look forward to your prompt response.
I'm using MacOS Monterey with an M1 Chip.
[W NNPACK.cpp:53] Could not initialize NNPACK! Reason: Unsupported hardware.
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/bin/transkun", line 8, in <module>
sys.exit(main())
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/transkun/transcribe.py", line 57, in main
fs, audio= readAudio(audioPath)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/transkun/transcribe.py", line 11, in readAudio
audio = pydub.AudioSegment.from_mp3(path)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pydub/audio_segment.py", line 796, in from_mp3
return cls.from_file(file, 'mp3', parameters=parameters)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pydub/audio_segment.py", line 651, in from_file
file, close_file = _fd_or_path_or_tempfile(file, 'rb', tempfile=False)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pydub/utils.py", line 60, in _fd_or_path_or_tempfile
fd = open(fd, mode=mode)
FileNotFoundError: [Errno 2] No such file or directory: 'input.mp3'
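The last line of the traceback says that `input.mp3` was not found relative to the current working directory, and the earlier warning says that pydub could not locate ffmpeg. A small pre-flight check along the following lines can surface both conditions before transcription starts. `preflight` is a hypothetical helper written for illustration, not part of transkun.

```python
import shutil
from pathlib import Path
from typing import List


def preflight(audio_path: str) -> List[str]:
    """Return a list of human-readable problems; empty means nothing was found."""
    problems = []
    if not Path(audio_path).is_file():
        problems.append(f"audio file not found: {Path(audio_path).resolve()}")
    # pydub's warning mentions both ffmpeg and avconv, so check for either one.
    if shutil.which("ffmpeg") is None and shutil.which("avconv") is None:
        problems.append("neither ffmpeg nor avconv is on PATH")
    return problems


issues = preflight("input.mp3")
for line in issues:
    print(line)
```

Printing the resolved path makes it easier to see which directory the command was actually run from.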
Hi,
Thank you very much for this repo - I'm trying to train this model from scratch on some Saxophone recordings.
Firstly, I was getting weird errors for
It might be worth mentioning these in the README for people who want to train on something other than Maestro.
The error I'm now encountering is during the first epoch
epoch:0 progress:0.000 step:0 loss:5907.2900 gradNorm:12.11 clipValue:28.85 time:0.39
epoch:0 progress:0.000 step:0 loss:5911.5234 gradNorm:12.17 clipValue:23.27 time:0.38
Warning: detected parameter with no gradient that requires gradient:
torch.Size([90, 256])
pitchEmbedding.weight
Warning: detected parameter with no gradient that requires gradient:
torch.Size([512, 1792])
velocityPredictor.0.weight
Warning: detected parameter with no gradient that requires gradient:
torch.Size([512])
velocityPredictor.0.bias
Warning: detected parameter with no gradient that requires gradient:
torch.Size([512, 512])
velocityPredictor.3.weight
Warning: detected parameter with no gradient that requires gradient:
torch.Size([512])
velocityPredictor.3.bias
Warning: detected parameter with no gradient that requires gradient:
torch.Size([128, 512])
velocityPredictor.6.weight
Warning: detected parameter with no gradient that requires gradient:
torch.Size([128])
velocityPredictor.6.bias
Warning: detected parameter with no gradient that requires gradient:
torch.Size([512, 1792])
refinedOFPredictor.0.weight
Warning: detected parameter with no gradient that requires gradient:
torch.Size([512])
refinedOFPredictor.0.bias
Warning: detected parameter with no gradient that requires gradient:
torch.Size([128, 512])
refinedOFPredictor.3.weight
Warning: detected parameter with no gradient that requires gradient:
torch.Size([128])
refinedOFPredictor.3.bias
Warning: detected parameter with no gradient that requires gradient:
torch.Size([2, 128])
refinedOFPredictor.6.weight
Warning: detected parameter with no gradient that requires gradient:
torch.Size([2])
refinedOFPredictor.6.bias
Traceback (most recent call last):
File "/import/linux/python/3.8.2/lib/python3.8/runpy.py", line 193, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/import/linux/python/3.8.2/lib/python3.8/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/import/research_c4dm/jxr01/Skipping-The-Frame-Level/transkun/train.py", line 364, in <module>
train(0, 1, saved_filename, int(time.time()), args)
File "/import/research_c4dm/jxr01/Skipping-The-Frame-Level/transkun/train.py", line 199, in train
average_gradients(model, totalLen, parallel)
File "/import/research_c4dm/jxr01/Skipping-The-Frame-Level/transkun/TrainUtil.py", line 45, in average_gradients
param.grad.data /= c
AttributeError: 'NoneType' object has no attribute 'data'
It looks like many of the parameters don't have their gradients initialised. This is strange because at this point in the run it has completed a backward pass so I thought all the gradients should have been set. I'm using the following settings to train:
python3 -m transkun.train --nProcess 1 --batchSize 1 --hopSize 5 --chunkSize 10 --datasetPath "/import/research_c4dm/jxr01/bytedance_piano_transcription/filosax_train/" --datasetMetaFile_train "filosax_data/train.pickle" --datasetMetaFile_val "filosax_data/val.pickle" --augment checkpoint/filosax_model
Can you give me any tips on what to try next?
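The warnings list parameters whose `.grad` is still `None` after the backward pass. In PyTorch this usually means those parameters did not take part in computing the loss for that batch, and the traceback shows `average_gradients` dividing `param.grad.data` without checking for that case. One thing to try is guarding the division. The sketch below uses plain stand-in objects rather than real `torch` parameters, and `average_gradients_safe` is a hypothetical variant, not the project's function.

```python
from types import SimpleNamespace


def average_gradients_safe(params, c):
    """Divide each available gradient by c and return how many were skipped."""
    skipped = 0
    for p in params:
        if p.grad is None:
            # No gradient was produced for this parameter in this step.
            skipped += 1
            continue
        p.grad = p.grad / c
    return skipped


# Stand-ins for model parameters: one with a gradient, one without.
params = [
    SimpleNamespace(name="used.weight", grad=6.0),
    SimpleNamespace(name="velocityPredictor.0.weight", grad=None),
]
n_skipped = average_gradients_safe(params, c=3)
```

Skipping only hides the symptom. If the same parameters are skipped on every step, it is worth checking whether the training chunks contain any notes that would exercise those branches of the model.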