Msanii: High Fidelity Music Synthesis on a Shoestring Budget

A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently.

Abstract

In this paper, we present Msanii, a novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently. Our model combines the expressiveness of mel spectrograms, the generative capabilities of diffusion models, and the vocoding capabilities of neural vocoders. We demonstrate the effectiveness of Msanii by synthesizing tens of seconds (190 seconds) of stereo music at high sample rates (44.1 kHz) without the use of concatenative synthesis, cascading architectures, or compression techniques. To the best of our knowledge, this is the first work to successfully employ a diffusion-based model for synthesizing such long music samples at high sample rates. Our demo can be found here and our code here.
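To give a sense of scale for the figures quoted in the abstract, the short calculation below works out how many audio samples a 190-second stereo clip at 44.1 kHz contains. The duration, sample rate, and channel count come from the abstract. The hop length used for the frame count is an assumed value chosen purely for illustration and is not taken from the paper or the code.

```python
# Back-of-the-envelope sizes for the clip described in the abstract.
DURATION_S = 190       # from the abstract
SAMPLE_RATE = 44_100   # 44.1 kHz, from the abstract
NUM_CHANNELS = 2       # stereo, from the abstract
HOP_LENGTH = 1024      # ASSUMPTION: illustrative value, not Msanii's setting


def clip_sizes(duration_s, sample_rate, num_channels, hop_length):
    """Return (samples per channel, total sample values, spectrogram frames)."""
    samples_per_channel = duration_s * sample_rate
    total_values = samples_per_channel * num_channels
    # Frame count of a centered short-time Fourier transform:
    # one frame per hop, plus the frame centered on the first sample.
    num_frames = 1 + samples_per_channel // hop_length
    return samples_per_channel, total_values, num_frames


samples, total, frames = clip_sizes(
    DURATION_S, SAMPLE_RATE, NUM_CHANNELS, HOP_LENGTH
)
print(f"samples per channel: {samples:,}")
print(f"total sample values: {total:,}")
print(f"frames per channel (assumed hop {HOP_LENGTH}): {frames:,}")
```

Working on a spectrogram rather than the raw waveform shortens the time axis by roughly the hop length, which is one reason a mel-spectrogram representation is attractive for long clips.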

Disclaimer

This is a work in progress and has not been finalized. The results and approach presented are subject to change and should not be considered final.

Samples

See more here.

  • Midnight Melodies
  • Echoes of Yesterday
  • Rainy Day Reflections
  • Starlight Sonatas

Setup

Set up a virtual environment using conda or venv.

Install the package from Git

    pip install -q git+https://github.com/Kinyugo/msanii.git

Install the package in editable mode

    git clone https://github.com/Kinyugo/msanii.git
    cd msanii
    pip install -q -r requirements.txt
    pip install -e .
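After either install route, a quick check that the package is visible to the active environment can save a confusing failure later. The sketch below assumes the top-level module is named `msanii`, as implied by the `python -m msanii.scripts...` commands in this README. The helper name `is_importable` is hypothetical.

```python
from importlib.util import find_spec


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# Prints True once the install has succeeded in the active environment.
print("msanii importable:", is_importable("msanii"))
```

If this prints `False`, the most likely cause is that pip installed into a different environment from the one running the script.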

Training

Notebook

Open In Colab

CLI

To train via the CLI, you need to define a config file. Sample config files are available in the conf directory.

    wandb login
    python -m msanii.scripts.training <path-to-your-config.yml-file>

Inference

Notebook

Open In Colab

CLI

Msanii supports the following inference tasks:

  • sampling
  • audio2audio
  • interpolation
  • inpainting
  • outpainting

Each task requires a different config file. Check the conf directory for samples.

    gdown 1G9kF0r5vxYXPSdSuv4t3GR-sBO8xGFCe # model checkpoint
    python -m msanii.scripts.inference <task> <path-to-your-config.yml-file>
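When inference is launched from a script rather than typed by hand, it can help to validate the task name before starting the process. The following is a minimal sketch that only assembles the command shown above. The helper name `build_inference_command` and the example config path are hypothetical, and the task list is copied from this README rather than read from the package.

```python
import sys

# Task names as listed in this README.
INFERENCE_TASKS = (
    "sampling",
    "audio2audio",
    "interpolation",
    "inpainting",
    "outpainting",
)


def build_inference_command(task, config_path):
    """Return the argv list for the inference CLI, rejecting unknown tasks."""
    if task not in INFERENCE_TASKS:
        valid = ", ".join(INFERENCE_TASKS)
        raise ValueError(f"unknown task {task!r}; expected one of: {valid}")
    return [
        sys.executable,
        "-m",
        "msanii.scripts.inference",
        task,
        str(config_path),
    ]


# Hypothetical config path, for illustration only.
command = build_inference_command("sampling", "conf/my_sampling.yml")
print(" ".join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` to launch the task with the same interpreter that runs the script.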

Demo

HF Spaces & Notebook

Hugging Face Spaces Open In Colab

CLI

To run the demo via the CLI, you need to define a config file. Sample config files are available in the conf directory.

    gdown 1G9kF0r5vxYXPSdSuv4t3GR-sBO8xGFCe # model checkpoint
    python -m msanii.demo.demo <path-to-your-config.yml-file>

Contribute to the Project

We are always looking for ways to improve and expand our project, and we welcome contributions from the community. Here are a few ways you can get involved:

  • Bug Fixes and Feature Requests: If you find any issues with the project, please open a GitHub issue or submit a pull request with a fix.
  • Data Collection: We are always in need of more data to improve the performance of our models. If you have any relevant data that you would like to share, please let us know.
  • Feedback: We value feedback from our users and would love to hear your thoughts on the project. Please feel free to reach out to us with any suggestions or comments.
  • Funding: If you find our project helpful, consider supporting us through GitHub Sponsors. Your support will help us continue to maintain and improve the project.
  • Computational Resources: If you have access to computational resources such as GPU clusters, you can help us by providing access to these resources to run experiments and improve the project.
  • Code Contributions: If you are a developer and want to contribute to the codebase, feel free to open a pull request.
  • Documentation: If you have experience with documentation and want to help improve the project's documentation, please let us know.
  • Promotion: Help increase the project's visibility and attract more contributors by sharing it with your friends and colleagues and on social media.
  • Educational Material: If you are an educator or content creator, you can help by creating tutorials, guides, or other educational material that helps others understand the project better.
  • Discussing and Sharing Ideas: Even if you don't have the time or technical skills to contribute directly to the code or documentation, you can still help by sharing and discussing ideas with the community. This can help identify new features or use cases, or find ways to improve existing ones.
  • Ethical Review: Help us ensure that the project follows ethical standards by reviewing data and models for potential infringements. Additionally, please do not use the project or its models to train or generate copyrighted works without proper authorization.

Contributors

juanalonso, kinyugo
