Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" accepted at Interspeech 2021

nemo

NeMo: a toolkit for conversational AI

passt

Efficient Training of Audio Transformers with Patchout

podcastmix

PodcastMix: a dataset for separating music and speech in podcasts.

pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

pytorch-ivectors

GPU accelerated implementation of i-vector extractor training using PyTorch. Requires Kaldi for feature extraction and UBM training. An example script is provided for VoxCeleb data.

pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

real-time-voice-cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

rnn_ctc

Recurrent Neural Network and Long Short-Term Memory (LSTM) with Connectionist Temporal Classification, implemented in Theano. Includes a toy training example.

sasvc2022_baseline

Baseline for the Spoofing-aware Speaker Verification Challenge 2022

sid-ternary-network

Code and models to accompany "Energy Efficient SID for Low-Precision Networks"

sms-tools

Sound analysis/synthesis tools for music applications

speaker-id

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

speakerembeddinglosscomparison

Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP 2020

speakerprofiling

Estimating the age, height, and gender of a speaker from their speech signal.

speech-transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

starganv2-vc

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

tfgan-plc

A Temporal-Spectral Generative Adversarial Network based End-to-end Packet Loss Concealment for Wideband Speech Transmission

trivial-events-recognition

voxsrc2020

Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020

w2v2-speaker-few-samples

Research code for the paper "Training speaker recognition systems with limited data" at https://arxiv.org/abs/2203.14688

wespeaker

Production First and Production Ready Speaker Recognition Toolkit
