
poly-pitch-net's Introduction

Poly Pitch Net

Install python packages

Set up a Python venv:
python3.10 -m venv venv
source venv/bin/activate
pip install amt-tools
# install guitar-transcription-continuous
pip install git+https://github.com/cwitkowitz/guitar-transcription-continuous.git@d481054f54184374c04b1cc27a487dc35c87f353
# install guitar-transcription-inhibition
pip install git+https://github.com/cwitkowitz/guitar-transcription-with-inhibition.git@e611c1dc9b7340d35c9a697d1658b3b2afb3978a

Experiments

The poly_pitch_net_experiments directory contains scripts that implement training, inference and evaluation of the Poly Pitch Net models on the GuitarSet dataset. Before running the experiment.py script for the first time, make sure that the ../generated/ directory is empty; it is the cache directory for all the validation and training sets. If any errors about missing dictionary keys occur, make sure that the reset_data flag in the GuitarSet init calls is set to True.
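A minimal sketch of emptying the cache directory before a first run, using only the standard library. The helper name clear_cache_dir is hypothetical and not part of this project; the ../generated path is the one mentioned above.

```python
from pathlib import Path
import shutil


def clear_cache_dir(cache_dir):
    """Delete everything inside cache_dir and return the removed entry names.

    Hypothetical helper, not part of poly_pitch_net. The directory itself is
    kept (and created if missing) so the dataset code can write to it again.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    removed = []
    for entry in sorted(cache.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # cached sub-directory
        else:
            entry.unlink()         # cached file or symlink
        removed.append(entry.name)
    return removed
```

Calling clear_cache_dir("../generated") from the experiments directory would then leave an empty cache. Note that this permanently deletes the cached data, so the next run has to rebuild it.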

run the training script
python3 poly_pitch_net_experiments/experiment_fretnet_cnn.py
run the training script with nohup
nohup python3 poly_pitch_net_experiments/experiment_fretnet_cnn.py > output.txt 2>&1 &

Tests

Pytest is the framework of our choice.

pip install pytest

To run the tests, first install the poly_pitch_net package in development (editable) mode by running the command below in the root of the poly-pitch-net git project directory.

pip install -e .

Finally, run the tests

pytest -v

poly-pitch-net's People

Contributors

anthonio9


poly-pitch-net's Issues

Create a MonoPitchNet model prototype

Create a model class - MonoPitchNet

  • just a few CNN layers
  • do not use reshape, use stack, chunk or permute
  • 360 logits on the output, just like crepe
  • take a lot of inspiration from penn

Deliver training samples to FCNF0++ in the correct format

FCNF0++ should be built and fed audio data that matches the settings below:

  • 5 cents of bin width
  • 1440 pitch bins
  • batch size of 128
  • 8 kHz sample rate
  • frame size of 1024 samples (~128 ms)
  • no normalization in FCNF0++

This requires modifications to the current dataset class, and quite a lot of them.
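The numbers in the list can be cross-checked with a little arithmetic. The sketch below derives the frame duration and the pitch range, and maps between bins and frequencies. The lowest-bin frequency FMIN is an assumption for illustration, since the issue does not state it, and the helper names are hypothetical.

```python
import math

SAMPLE_RATE = 8000    # Hz, from the list above
FRAME_SIZE = 1024     # samples per frame, from the list above
PITCH_BINS = 1440     # from the list above
CENTS_PER_BIN = 5     # from the list above
FMIN = 31.0           # Hz; ASSUMED frequency of bin 0, not stated in the issue

frame_ms = 1000 * FRAME_SIZE / SAMPLE_RATE   # duration of one frame in ms
range_cents = PITCH_BINS * CENTS_PER_BIN     # total pitch range in cents
range_octaves = range_cents / 1200           # 1200 cents per octave


def bin_to_hz(bin_index, fmin=FMIN):
    """Frequency of a pitch bin, counting cents upward from fmin."""
    return fmin * 2 ** (bin_index * CENTS_PER_BIN / 1200)


def hz_to_bin(frequency, fmin=FMIN):
    """Nearest pitch bin for a frequency, clipped to the valid bin range."""
    cents = 1200 * math.log2(frequency / fmin)
    return min(max(round(cents / CENTS_PER_BIN), 0), PITCH_BINS - 1)
```

With these settings one frame lasts 128 ms and the 1440 bins span 7200 cents, which is exactly six octaves above whatever the lowest bin frequency turns out to be.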

MonoPitchNet2D - multiple inputs, STFT, Time, Autocorrelation

MonoPitch1D struggles to recognize short-term pitch deviations. Let's try to fix this!
It seems that feeding only the CQT to the model cannot capture all the information about how the pitch is changing. My guess is that the bins are simply too wide, so the information about small pitch deviations is lost. CQT is great for long-term pitch values, but it needs to be balanced with something else for the short term: perhaps a short-time STFT, the raw audio in the time domain, or autocorrelation? Enhanced autocorrelation?

Scenarios to test:

  • CQT + TIME
  • STFT + TIME
  • CQT + TIME + AC
  • CQT + TIME + EAC
  • STFT + TIME + AC
  • STFT + TIME + EAC
  • HCQT
  • HCQT + TIME
  • HCQT + TIME + EAC

Quite a lot of scenarios, indeed.
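To make the AC input in the scenarios above concrete, here is a minimal pure-Python sketch of a plain (not enhanced) autocorrelation of one frame, plus a naive peak pick that turns it into an f0 estimate. The function names, the fmin/fmax search range and the test signal are assumptions for illustration; this is not the project's feature extraction code.

```python
import math


def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], where r[lag] = sum(frame[n] * frame[n + lag])."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_f0(frame, sample_rate, fmin=50.0, fmax=1000.0):
    """Pick the strongest autocorrelation peak between the fmax and fmin lags."""
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    r = autocorrelation(frame, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    return sample_rate / best_lag


# Assumed test signal: a 200 Hz sine, 1024 samples at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(1024)]
f0 = estimate_f0(frame, SAMPLE_RATE)
print(f0)  # 200.0
```

With integer lags the resolution is coarse: at 8 kHz, lag 40 corresponds to 200 Hz and lag 39 to about 205.1 Hz. A feature meant to capture small deviations would likely need the full autocorrelation curve as model input, or interpolation around the peak, rather than this single peak pick.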

MonoPitchNet: add a class for silence

Currently the MonoPitchNet model outputs a one-hot vector of size 360. Add one more class to make it aware of silence, for 361 classes in total. This should improve the overall quality of the model.

Write new conversion functions that respect this layout: a 1 in the silence class means SILENCE, and a 1 in a pitch class means PITCH. When converting to a pitch array, put 0 where there is silence to indicate the NO pitch state, even if the array is in cents.
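One possible shape for these conversion functions is sketched below. The issue does not fix the position of the silence class, the bin width, or the cents value of the first pitch bin, so SILENCE_INDEX, CENTS_PER_BIN and CENTS_OFFSET are all assumptions, and the function names are hypothetical.

```python
NUM_PITCH_BINS = 360
SILENCE_INDEX = 360    # ASSUMED: silence is the last of the 361 classes
CENTS_PER_BIN = 20     # ASSUMED: CREPE-like bin width
CENTS_OFFSET = 2000    # HYPOTHETICAL cents value of pitch bin 0; nonzero so
                       # that no pitch bin collides with the 0 used for silence


def class_to_cents(class_index):
    """Map a class index to cents, with 0.0 meaning 'no pitch' (silence)."""
    if class_index == SILENCE_INDEX:
        return 0.0
    return float(CENTS_OFFSET + CENTS_PER_BIN * class_index)


def cents_to_class(cents):
    """Map cents to a class index; 0 is interpreted as silence."""
    if cents == 0:
        return SILENCE_INDEX
    index = round((cents - CENTS_OFFSET) / CENTS_PER_BIN)
    return min(max(index, 0), NUM_PITCH_BINS - 1)


def one_hot_to_cents(frames):
    """Convert a list of 361-element one-hot vectors to a pitch array in cents."""
    return [class_to_cents(frame.index(max(frame))) for frame in frames]
```

Keeping the silence check in one place (class_to_cents and cents_to_class) means every other conversion can be built on top of these two functions without repeating the special case.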
