entn-at
Name: Ewald Enzinger
Type: User
Bio: Ph.D. EE (UNSW Sydney). ML, speaker recognition, speech recognition, speech synthesis, forensic voice comparison
Twitter: entn_at
Location: Portland, Oregon
Blog: https://entn.at/
Simple torch.nn.Module implementation of Alias-Free-GAN-style filtering and resampling
Automatic LInguistic Unit Count Estimator (ALICE)
HMM, CTC, RNN-Transducer, forward-backward algorithm
Pretrained model for ICASSP 2020 "Universal Phone Recognition with a Multilingual Allophone System"
Smart Language Model
Real-Time and Accurate Multi-Person Pose Estimation&Tracking System
The Additive Margin MobileNet1D is a new lightweight deep learning model for speaker recognition, based on the MobileNetV2 architecture and the Additive Margin Softmax (AM-Softmax) loss function.
American English Pronunciation Dictionary
Package containing the tools necessary for decomposing a speech signal into its modulated components (also known as AM-FM decomposition). Includes the algorithms of the QHM family and the YAAPT pitch tracker.
AMR Eager trained on English, German, Italian, Spanish and Chinese
The code repository for AMR guided joint information extraction model (NAACL-2021).
This is the implementation of an unpublished paper: Adversarial Multi-task deep feature and unsupervised back-end adaptation for language recognition
Angular penalty loss functions in PyTorch (ArcFace, SphereFace, Additive Margin, CosFace)
:speech_balloon: Speech recognition for your site
A toolkit for reproducible information retrieval research built on Lucene
Three neural network countermeasures against spoofed speech, which detect whether a speech sample is replayed or genuine.
End-to-end Arabic TTS system based on Tacotron
Pronounce Arabic words
Autoregressive probabilistic modelling for speech synthesis.
The CALLHOME Egyptian Arabic Speech Translation Corpus
An ASR decoder and graph-building project
Interactive speech recognition demo using the local microphone
This is an extension of the Kaldi speech recognition software that decodes speech with hybrid word and phoneme graphs. The output is a mix of in-vocabulary words and phoneme sequences. This decoding is suitable for systems with only a small dictionary available and for later recovery of OOV words.
Kaldi-based on-device (local) speech recognition for iOS
24-hour Automatic Speech Recognition