TorchFCPE

Overview

TorchFCPE is a PyTorch-based library designed for audio pitch extraction and MIDI conversion. This README provides a quick guide on how to use the library for audio pitch inference and MIDI extraction.

Installation

Before using the library, make sure you have the necessary dependencies installed:

pip install torch librosa torchfcpe

Usage

1. Audio Pitch Inference

from torchfcpe import spawn_bundled_infer_model
import torch
import librosa

# Configure device and target hop size
device = 'cpu'  # or 'cuda' if using a GPU
sr = 16000  # Sample rate
hop_size = 160  # Hop size for processing

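# Load and preprocess audio (librosa.load resamples to `sr` and returns mono by default)
audio, sr = librosa.load('test.wav', sr=sr)
audio = librosa.to_mono(audio)  # No-op for audio that is already mono
audio_length = len(audio)
f0_target_length = (audio_length // hop_size) + 1
# Reshape to (batch, n_samples, 1) = (1, audio_length, 1) before moving to the device
audio = torch.from_numpy(audio).float().unsqueeze(0).unsqueeze(-1).to(device)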

# Load the model
model = spawn_bundled_infer_model(device=device)

# Perform pitch inference
f0 = model.infer(
    audio,
    sr=sr,
    decoder_mode='local_argmax',  # Recommended mode
    threshold=0.006,  # Threshold for V/UV decision
    f0_min=80,  # Minimum pitch
    f0_max=880,  # Maximum pitch
    interp_uv=False,  # Interpolate unvoiced frames
    output_interp_target_length=f0_target_length,  # Interpolate to target length
)

print(f0)
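
With sr = 16000 and hop_size = 160, each frame spans 10 ms, so the f0_target_length computed above maps directly onto a time axis. The following is a minimal sketch of that arithmetic in plain Python. The helper names frame_count and frame_times are illustrative and not part of torchfcpe, and the assumption that frame i starts at sample i * hop_size is ours.

```python
def frame_count(num_samples: int, hop_size: int) -> int:
    """Number of f0 frames, matching (audio_length // hop_size) + 1 above."""
    return num_samples // hop_size + 1


def frame_times(num_samples: int, sr: int, hop_size: int) -> list:
    """Start time in seconds of each frame.

    Assumption: frame i begins at sample i * hop_size.
    """
    n_frames = frame_count(num_samples, hop_size)
    return [i * hop_size / sr for i in range(n_frames)]


# One second of audio at 16 kHz with a 160-sample hop
n_frames = frame_count(16000, 160)
times = frame_times(16000, 16000, 160)
print(n_frames)   # 101
print(times[1])   # 0.01
```

Passing this frame count as output_interp_target_length keeps the returned pitch curve aligned with other features computed at the same hop size.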

2. MIDI Extraction

# Extract MIDI from audio (reuses `model`, `audio`, and `sr` from the previous example)
midi = model.extact_midi(
    audio,
    sr=sr,
    decoder_mode='local_argmax',  # Recommended mode
    threshold=0.006,  # Threshold for V/UV decision
    f0_min=80,  # Minimum pitch
    f0_max=880,  # Maximum pitch
    output_path="test.mid",  # Save MIDI to file
)

print(midi)
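
MIDI note numbers relate to frequency through the standard equal-temperament formula, with A4 = 440 Hz as note 69. The sketch below applies that formula to the f0_min and f0_max values used above. It illustrates the general conversion only and is not necessarily how torchfcpe quantizes pitch internally; the function name hz_to_midi is hypothetical.

```python
import math


def hz_to_midi(freq_hz: float) -> float:
    """Convert a frequency in Hz to a (fractional) MIDI note number.

    Uses 12-tone equal temperament with A4 = 440 Hz = MIDI note 69.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive; unvoiced frames have no pitch")
    return 69 + 12 * math.log2(freq_hz / 440.0)


print(round(hz_to_midi(440.0)))  # 69 (A4)
print(round(hz_to_midi(880.0)))  # 81 (A5, the f0_max used above)
print(round(hz_to_midi(80.0)))   # 39 (80 Hz lies between D#2 and E2)
```

Under this formula, the 80 Hz to 880 Hz search range corresponds to roughly MIDI notes 39 through 81.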

Notes

  • Inference Parameters:

    • audio: Input audio as a torch.Tensor.
    • sr: Sample rate of the audio.
    • decoder_mode (Optional): Mode for decoding; 'local_argmax' is recommended.
    • threshold (Optional): Threshold for the voiced/unvoiced decision; default is 0.006.
    • f0_min (Optional): Minimum pitch value; default is 80 Hz.
    • f0_max (Optional): Maximum pitch value; default is 880 Hz.
    • interp_uv (Optional): Whether to interpolate unvoiced frames; default is False.
    • output_interp_target_length (Optional): Length to which the output pitch should be interpolated.
  • MIDI Extraction Parameters:

    • audio: Input audio as a torch.Tensor.
    • sr: Sample rate of the audio.
    • decoder_mode (Optional): Mode for decoding; 'local_argmax' is recommended.
    • threshold (Optional): Threshold for the voiced/unvoiced decision; default is 0.006.
    • f0_min (Optional): Minimum pitch value; default is 80 Hz.
    • f0_max (Optional): Maximum pitch value; default is 880 Hz.
    • output_path (Optional): File path to save the MIDI file. If not provided, only returns the MIDI structure.
    • tempo (Optional): BPM for the MIDI file. If None, BPM is automatically predicted.
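
To make the interp_uv option more concrete, the sketch below shows one simple way unvoiced frames could be filled in from their voiced neighbours. It assumes unvoiced frames are marked as 0.0 and uses plain linear interpolation; this is an illustration only, not torchfcpe's implementation, and interpolate_unvoiced is a hypothetical helper.

```python
def interpolate_unvoiced(f0):
    """Fill unvoiced frames (assumed to be 0.0) by linear interpolation.

    Leading and trailing unvoiced runs take the nearest voiced value.
    Returns a new list; if no frame is voiced, a copy of the input is returned.
    """
    voiced = [i for i, v in enumerate(f0) if v > 0]
    if not voiced:
        return list(f0)

    out = list(f0)
    # Pad the edges with the first and last voiced values
    for i in range(voiced[0]):
        out[i] = f0[voiced[0]]
    for i in range(voiced[-1] + 1, len(f0)):
        out[i] = f0[voiced[-1]]

    # Linearly interpolate across each interior unvoiced gap
    for left, right in zip(voiced, voiced[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            out[i] = f0[left] + weight * (f0[right] - f0[left])
    return out


f0_with_gaps = [0.0, 100.0, 0.0, 0.0, 130.0, 0.0]
filled = interpolate_unvoiced(f0_with_gaps)
print([round(v, 1) for v in filled])  # [100.0, 100.0, 110.0, 120.0, 130.0, 130.0]
```

A continuous curve like this can be convenient for downstream models that expect a pitch value at every frame, while the uninterpolated output preserves the voiced/unvoiced decision.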

Additional Features

  • Model as a PyTorch Module: You can use the model as a standard PyTorch module. For example:

    # Change device
    model = model.to(device)
    
    # Compile model (torch.compile requires PyTorch 2.0 or later)
    model = torch.compile(model)
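
Rather than hard-coding the device string, it can be chosen at runtime. The small sketch below relies on torch.cuda.is_available(), a standard PyTorch call. pick_device is a hypothetical helper, and the fallback to 'cpu' when torch cannot be imported is there only to keep the snippet runnable on its own.

```python
import importlib.util


def pick_device() -> str:
    """Return 'cuda' when a CUDA-capable GPU is usable, otherwise 'cpu'."""
    # Fallback so the sketch still runs where torch is not installed
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The resulting string can be passed as the device argument when loading the model, or to model.to(device) afterwards.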

Contributors

cnchtu, ylzz1997, qiuqiao, blaisewf, tps-f, soulmelody, narusemioshirakana
