ishine's Projects
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
Code and models for evaluating a state-of-the-art lip reading network
Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
This repository contains implementations and illustrative code to accompany DeepMind publications
An open source library for deep learning end-to-end dialog systems and chatbots.
Grapheme to phoneme conversion with deep learning.
Semantic Communication Systems for Speech Transmission
Unofficial implementation of DeepSinger
An LSTM model using a Risk Estimation loss function for stock trading
deepx_core is a foundational library focused on tensor computation and deep learning
Deep Xi: A Deep Learning Approach to A Priori SNR Estimation. Used for Speech Enhancement and robust ASR.
Music separation
demo-test-stt-cfm
Code for the paper Music Source Separation in the Waveform Domain
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
A two-stage U-Net for high-fidelity denoising of historical recordings
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index (DenSPI)
Desktop subtitles: real-time speech recognition.
Detectron2 is FAIR's next-generation platform for object detection and segmentation.
[ECCV2022] The implementation for "Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis".
TensorFlow version of DFSMN
Deep Feedforward Sequential Memory Networks (FSMN)
Keras implementation of DGCNN for reading comprehension