mc-fp's People

Contributors

wwerkk

mc-fp's Issues

generated token sequence - alternating buffer

On an M1 MacBook Pro CPU it does not seem possible to generate sequences longer than 32 tokens fast enough to keep up with real time, at least with the trebles model.
Alternating token buffers need to be implemented so that while one token sequence is being played through, another can be generated at the same time.
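
A minimal double-buffering sketch: a worker thread keeps a small queue of generated sequences ahead of playback, so generation latency overlaps with playback time. `generate_sequence` here is a placeholder, not the project's actual generator.

```python
import queue
import threading

def generate_sequence(length=32):
    """Placeholder for the model's token generation."""
    return list(range(length))

def start_generator(buffers, n_threads=1):
    """Keep generated sequences queued ahead of playback; the queue's
    maxsize bounds how far generation runs ahead."""
    def worker():
        while True:
            buffers.put(generate_sequence())  # blocks when the queue is full
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t

buffers = queue.Queue(maxsize=2)
```

Playback then simply calls `buffers.get()` for the next sequence while the worker refills the queue in the background.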

online.py - generation offset

The drumloop2 model generates rotations of the sequence which tend to shift by a token or two with each generation, even with temperature <= 1.

Offline generation works properly: it results in an ABCB pattern which corresponds exactly to the drum pattern in the data.

It might be worth testing offline generation with longer sequences to determine whether the pattern shifts in that case as well.

labelled_frames dict refactor

The labelled_frames dict object currently saves frames sample by sample, which is redundant.
It should instead store only the start and end sample index values of each frame.
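
A sketch of the proposed refactor, assuming frames are produced by a fixed frame length and hop (the `labelled_frames` name follows the issue; the exact upstream structure is an assumption):

```python
def label_frames(labels, frame_length, hop_length):
    """Map each class label to (start, end) sample indices
    instead of storing the raw samples of every frame."""
    labelled_frames = {}
    for i, label in enumerate(labels):
        start = i * hop_length
        end = start + frame_length
        labelled_frames.setdefault(label, []).append((start, end))
    return labelled_frames
```

The audio itself stays in one buffer; each `(start, end)` pair is enough to slice a frame out on demand.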

Byte-Pair Encoding

It could help a lot to detect commonly occurring sub-sequences and replace each with a single token.
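
One BPE-style merge step over a token sequence could look like the following sketch (names are illustrative, not the repo's API):

```python
from collections import Counter

def most_common_pair(seq):
    """Return the most frequent adjacent token pair, or None if empty."""
    pairs = Counter(zip(seq, seq[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(seq, pair, new_token):
    """Replace every occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(seq):
        if i < len(seq) - 1 and (seq[i], seq[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out
```

Repeating these two steps until no pair occurs more than once (or a vocabulary budget is hit) yields the usual byte-pair encoding loop.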

Max always picks the first frame in array

Max is now responsible for looking up frames according to the frames.json dictionary.
At the moment it always picks the first frame in the value array of a given key.

The frame should be picked at random instead,
i.e. by selecting a random odd value within the range of the array length - 1, as is done in Streamer.get_frame.

frame picking

This could do with a few modes which would change the behaviour of picking a frame of a given class from the dictionary when a grain is triggered:

  • locked n-th frame only
  • random (as it is now)
  • sequential (from first to last frame of the given class; could use a table object to save state per token)
  • sequential (from last to first)
  • a mixture of the above, e.g. using a Markov chain or other means of controlled randomization
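
The modes above could be dispatched from a single picker; this is a hypothetical sketch (function names and the per-class `state` dict are assumptions, not the repo's actual code):

```python
import random

def pick_frame(frames, mode="random", state=None, n=0):
    """Pick one frame for a class according to the requested mode.

    frames: list of frames for one class
    state:  per-class dict holding the sequential position
    n:      fixed index for the "locked" mode
    """
    if mode == "locked":
        return frames[min(n, len(frames) - 1)]
    if mode == "random":
        return random.choice(frames)
    if mode == "sequential":          # first to last, wrapping around
        i = state.get("i", 0)
        state["i"] = (i + 1) % len(frames)
        return frames[i]
    if mode == "reverse":             # last to first, wrapping around
        i = state.get("i", len(frames) - 1)
        state["i"] = (i - 1) % len(frames)
        return frames[i]
    raise ValueError(f"unknown mode: {mode}")
```

A Markov-chain mixture mode would replace the flat `random.choice` with transition probabilities over frame indices.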

poor benchmark output

The sweep and drumloop models, trained on very simple data, generate output that bears no resemblance to the input audio.

validation

Presumably no validation method is viable for any sort of audio data, and the training metrics are very difficult to trust in cases of audio input with high variance.
It might be a good idea to implement k-fold validation, e.g. as described here.
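
A minimal k-fold index splitter over the frame dataset, sketched with numpy only (e.g. `sklearn.model_selection.KFold` does the same with more options):

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs so every sample
    appears in the validation set exactly once."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, val
```

Averaging the validation metric over all k folds would give a more trustworthy number than a single train/validation split on high-variance audio.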

timestretching

This would be nice, but would most likely require refactoring synthesis to use mc.groove.

polyphony

Max resynthesis should implement multiple voices, likely using mc

Stand-alone generation script

Generation could be implemented as a stand-alone Python script, allowing for further extensions, e.g. real-time sequence generation or communication with other environments to be used for resynthesis.

trigger model (re)load via OSC

Perhaps a function mapped to /m messages could handle this.
It should not be too difficult to implement, since the model name alone is sufficient to locate the necessary directory/files.
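
A hypothetical handler for the /m message, resolving the model directory from the name alone (the `models/` layout and handler name are assumptions; with pythonosc it would be registered via `dispatcher.map("/m", handle_m)`):

```python
from pathlib import Path

MODELS_DIR = Path("models")  # assumed layout: models/<model_name>/...

def handle_m(address, *args):
    """OSC handler for /m: resolve the model directory from the
    model name carried by the message, then (re)load from it."""
    model_name = str(args[0])
    model_dir = MODELS_DIR / model_name
    # ...reload weights and frames.json from model_dir here...
    return model_dir
```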

features

The current extract_features function in train.py computes the following:

  • zero crossings
  • energy
  • spectral centroid
  • spectral bandwidth
  • spectral flatness
  • spectral rolloff
  • MFCCs 1-13

It is worth testing various combinations of the above, as some might turn out to be redundant or worsen performance.
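
Enumerating the feature subsets to benchmark is straightforward; a sketch (short feature names stand in for the list above):

```python
from itertools import combinations

FEATURES = ["zcr", "energy", "centroid", "bandwidth",
            "flatness", "rolloff", "mfcc"]

def feature_subsets(min_size=1):
    """Yield every combination of at least `min_size` features
    for an ablation sweep over extract_features."""
    for r in range(min_size, len(FEATURES) + 1):
        yield from combinations(FEATURES, r)
```

With 7 feature groups that is 127 non-empty subsets, so restricting the sweep to leave-one-out ablations first would be a cheaper starting point.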

General refactor

Data processing, model training and sequence generation could be implemented as separate classes, as the Jupyter notebook is meant to be the end user interface for model training.

model trained on more data throws error during prediction

A model trained on a dataset segmented with a small hop_length parameter (resulting in a much higher number of frames) does not work for prediction:

Prompt:  []
Generating sequence...
Temperature:
 0.0
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 62437)
Traceback (most recent call last):
  File "/Users/wwerkowicz/miniconda/envs/cpu/lib/python3.10/socketserver.py", line 683, in process_request_thread
    self.finish_request(request, client_address)
  File "/Users/wwerkowicz/miniconda/envs/cpu/lib/python3.10/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/Users/wwerkowicz/miniconda/envs/cpu/lib/python3.10/socketserver.py", line 747, in __init__
    self.handle()
  File "/Users/wwerkowicz/miniconda/envs/cpu/lib/python3.10/site-packages/pythonosc/osc_server.py", line 33, in handle
    server.dispatcher.call_handlers_for_packet(self.request[0], self.client_address)
  File "/Users/wwerkowicz/miniconda/envs/cpu/lib/python3.10/site-packages/pythonosc/dispatcher.py", line 193, in call_handlers_for_packet
    handler.invoke(client_address, timed_msg.message)
  File "/Users/wwerkowicz/miniconda/envs/cpu/lib/python3.10/site-packages/pythonosc/dispatcher.py", line 54, in invoke
    self.callback(message.address, self.args, *message)
  File "/Users/wwerkowicz/GS/MC/MC-FP/MC-FP-master/generate.py", line 126, in handle_g
    seq = generate(sequence_length=sequence_length, temperature=temperature, prompt=prompt)
  File "/Users/wwerkowicz/GS/MC/MC-FP/MC-FP-master/generate.py", line 97, in generate
    p_label = sample(preds[0], temperature)
  File "/Users/wwerkowicz/GS/MC/MC-FP/MC-FP-master/generate.py", line 71, in sample
    probas = np.random.multinomial(1, preds, 1)
  File "mtrand.pyx", line 4272, in numpy.random.mtrand.RandomState.multinomial
  File "_common.pyx", line 391, in numpy.random._common.check_array_constraint
  File "_common.pyx", line 377, in numpy.random._common._check_array_cons_bounded_0_1
ValueError: pvals < 0, pvals > 1 or pvals contains NaNs
----------------------------------------
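
A likely cause, given the `Temperature: 0.0` in the log: the usual `log(preds) / temperature` rescaling divides by zero, producing NaNs that `np.random.multinomial` rejects. A hedged fix (the original `sample()` in generate.py is assumed to do the standard temperature rescaling):

```python
import numpy as np

def sample(preds, temperature=1.0):
    """Temperature sampling that survives temperature 0 and NaN-free
    renormalizes before calling np.random.multinomial."""
    preds = np.asarray(preds, dtype=np.float64)
    if temperature <= 0:
        return int(np.argmax(preds))  # greedy pick instead of dividing by zero
    preds = np.log(np.clip(preds, 1e-12, 1.0)) / temperature
    preds -= preds.max()              # numerical stability before exp
    probas = np.exp(preds)
    probas /= probas.sum()            # pvals must sum to exactly 1
    return int(np.argmax(np.random.multinomial(1, probas, 1)))
```

The clip also guards against zero probabilities in the softmax output of a model with many classes, which is more likely with the higher frame count.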

Path cleanup

Pathlib could be used throughout the code to make path handling more comprehensible.
Brief user tests suggested that as simple an interface as possible is the way to go for setting file directories etc.

autoencoding?

An LSTM autoencoder could be used to predict the next step in the sequence.
This would render clustering redundant: given the predicted features, a single closest-matching frame could be found using a k-d tree, similarly to audio mosaicking.
https://machinelearningmastery.com/lstm-autoencoders/

Would it solve the general messiness of the model output experienced currently?
Worth trying.
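
The lookup step could be sketched as below; brute-force nearest neighbour is shown for clarity, and `scipy.spatial.cKDTree` would replace it at scale:

```python
import numpy as np

def nearest_frame(frame_features, predicted):
    """Return the index of the frame whose feature vector is
    closest (Euclidean) to the predicted feature vector."""
    dists = np.linalg.norm(frame_features - predicted, axis=1)
    return int(np.argmin(dists))
```

Each generation step would then feed the autoencoder's predicted feature vector in and play back the single matching frame, with no clustering or token vocabulary involved.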
