dongsig Goto Github PK

followers: 3 following: 13 repos: 108 gists: 0

Name: dyang

Type: User

Company: Tencent

Bio: Speech

Location: Shanghai

dyang's Projects

advisor

Open-source implementation of Google Vizier for hyperparameter tuning

air-asvspoof

Implementation of the paper "One-class Learning Towards Synthetic Voice Spoofing Detection"

alice

Automatic LInguistic Unit Count Estimator (ALICE)

asv-anti-spoofing-dada

Dual-Adversarial Domain Adaptation for replay spoofing detection in automatic speaker verification.

asv-anti-spoofing-with-res2net

Implementation of the paper: Replay and Synthetic Speech Detection with Res2Net architecture https://arxiv.org/abs/2010.15006

asvspoof2019

Our submission to the ASVspoof 2019: Automatic Speaker Verification Spoofing and Countermeasures Challenge

audio-classification-models

Audio classification is a popular topic; here I implement several models using TensorFlow and Keras.

audioage

Transferring audio features to build models for rare conditions with scarce data

audiofile

A simple header-only C++ library for reading and writing audio files.

audiogpt

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

augly

A data augmentations library for audio, image, text, and video.

auto-age-labeler

A web application that uses artificial intelligence to automatically label voice datasets with the age of the speaker.

autogluon

AutoGluon: AutoML Toolkit for Deep Learning

auxiva-iss-dnn

Code to reproduce the results in the paper "Surrogate Source Model Learning for Determined Source Separation"

awesome-deepfakes-materials

A curated list of awesome Deepfakes materials

baby-crying-detection-based-on-audio-and-video-fusion

This is the dataset and code for the paper "Research of Infant Crying Detection Method Based on Audio and Video Fusion".

bayesianoptimization

A Python implementation of global optimization with Gaussian processes.

bwe_fftnet

Implementation of Learning Bandwidth Expansion Using Perceptually-Motivated Loss (ICASSP 2019)

cgmm-mvdr

Implementation of the CGMM-MVDR beamforming

complex-gated-recurrent-neural-networks

Complex domain recurrent neural network gating and Stiefel-manifold optimization in TensorFlow, NeurIPS 2018

create_wsj1_2345_db

Collection of scripts to create a dataset of noisy multi-channel reverberant mixtures based on wsj1 and CHiME3 datasets.

deep-voice-conversion

Deep neural networks for voice conversion (voice style transfer) in TensorFlow

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

detectingdeepfakes_blackhat2019

Detect audio deep fakes with bispectral analysis

drum_sound_classifier

A Python module for building pandas datasets from drum libraries and training drum-type classification models using a few different methods.
