lass's Introduction

Language-Queried Audio Source Separation

This repository contains the code and models of "Separate What You Describe: Language-Queried Audio Source Separation" [INTERSPEECH 2022].

Check out the examples and presentation video on the Demo Page!

Setup

Clone the repository and set up the conda environment:

git clone https://github.com/liuxubo717/LASS.git && \
cd LASS && \
conda env create -f environment.yml && \
conda activate LASS

Inference

To run inference with the pre-trained LASS-Net model, download our pre-trained checkpoint and place it under ckpt/. We provide ten audio mixtures (in examples/) with text queries (as illustrated in the Demo Page) for a quick inference demo.

Run inference with AudioCaps text queries:

python inference.py -q AudioCaps

Or run inference with our collected human annotations:

python inference.py -q Human

The separated audio clips will be automatically saved in output/.
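To check what an inference run produced, a small helper along these lines could list the saved clips. This is only a sketch: the function name list_separated_clips is hypothetical (it is not part of this repository), and the assumption that the clips are saved with a .wav extension is not confirmed by the text above.

```python
from pathlib import Path


def list_separated_clips(output_dir: str = "output", suffix: str = ".wav"):
    """Return the sorted paths of audio files found directly inside `output_dir`.

    Hypothetical helper; the ".wav" suffix is an assumption about the saved format.
    """
    root = Path(output_dir)
    if not root.is_dir():
        # Nothing has been written yet (or inference was run from another directory).
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )


# Print the file names of the separated clips, if any exist.
for clip in list_separated_clips():
    print(clip.name)
```

If the list comes back empty, it may be worth confirming that inference.py was run from the repository root so that output/ resolves to the expected directory.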

Dataset Recipe

Due to AudioSet's copyright, we cannot release the raw data. The training and evaluation indexes used in this work are available at dataset/. To facilitate reproduction and comparison, we release our code for creating audio mixtures at utils/create_mixtures.py. Here is a usage example:

import torch
from utils.create_mixtures import add_noise_and_scale

# Two random waveforms of 32000 samples each stand in for real audio clips.
wav1 = torch.randn(1, 32000)
wav2 = torch.randn(1, 32000)
target, noise, snr, scale = add_noise_and_scale(wav1, wav2)
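
For readers who want an intuition for what mixing two clips at a given signal-to-noise ratio involves, here is a minimal, generic sketch. It is not the implementation in utils/create_mixtures.py: the function name mix_at_snr, its arguments, and the peak-normalisation step are all assumptions made for illustration, and the repository's add_noise_and_scale may choose the SNR and the scaling differently.

```python
import torch


def mix_at_snr(target: torch.Tensor, noise: torch.Tensor, snr_db: float, eps: float = 1e-8):
    """Scale `noise` so the target-to-noise power ratio is `snr_db` dB, then mix.

    Hypothetical illustration only; not the repository's add_noise_and_scale.
    """
    target_power = target.pow(2).mean()
    noise_power = noise.pow(2).mean()
    # Gain that brings the noise to the requested SNR relative to the target.
    gain = torch.sqrt(target_power / (noise_power * 10 ** (snr_db / 10) + eps))
    scaled_noise = noise * gain
    mixture = target + scaled_noise
    # If the mixture would exceed [-1, 1], shrink all three signals by the same
    # factor so the SNR is preserved and the mixture does not clip.
    scale = torch.clamp(mixture.abs().max(), min=1.0).reciprocal()
    return mixture * scale, target * scale, scaled_noise * scale


# Example: mix two random one-channel waveforms at 0 dB SNR.
wav1 = torch.randn(1, 32000)
wav2 = torch.randn(1, 32000)
mixture, target, noise = mix_at_snr(wav1, wav2, snr_db=0.0)
```

Applying one shared scale factor to the mixture, target, and noise keeps the relation mixture = target + noise intact, which is convenient when the scaled target is later used as the training reference.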

Updates

  • Provide conda-pack envs
  • Inference code and model release
  • Dataset release
  • Training code release

Citation

@inproceedings{liu2022separate,
  title={Separate What You Describe: Language-Queried Audio Source Separation},
  author={Liu, Xubo and Liu, Haohe and Kong, Qiuqiang and Mei, Xinhao and Zhao, Jinzheng and Huang, Qiushi and Plumbley, Mark D and Wang, Wenwu},
  booktitle = {INTERSPEECH},
  year = {2022}
}
