ishine's Projects
Official PyTorch implementation of the paper "Leveraging Uni-Modal Self Supervised Learning for Multimodal Audio-visual Speech Recognition"
Rejects complicated operations for incorporating a lexicon into Chinese NER.
Tools for handling speech data in machine learning projects.
Binaural 3D sound synthesis using HRTFs.
An open source library for face detection in images. The face detection speed can reach 1000FPS.
Voice activity detection (VAD) library, based on WebRTC's VAD engine
Efficient inference of large language models.
Libriheavy: a 50,000-hour ASR corpus with punctuation, casing, and context
libtorch mobile build script
libvits-ncnn is an ncnn implementation of the VITS library that enables cross-platform GPU-accelerated speech synthesis.🎙️💻
A microblog-oriented Chinese word segmentation system. Code for 'Micro blogs Oriented Word Segmentation System'.
2019 Language and Intelligence Challenge: knowledge-graph-based proactive conversation
End-to-end spoken language identification out of the box. Rewrite in progress for first release (version 1).
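The repository for the IEEE CVPR 2023 paper "A Light Weight Model for Active Speaker Detection"
Light-LPR is an open-source license plate recognition project designed to run on embedded devices, mobile phones, and ordinary x86 platforms. It aims to support license plate recognition in a wide range of scenarios, with a plate character recognition accuracy above 99.95% and an overall recognition accuracy above 99%, and it supports all license plate types currently used in China. If you find it useful, please give it a star. The yellow-plate recognition model will be released at 200 stars, and the new-energy vehicle plate model at 400 stars.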
Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT
A knowledge graph deep learning framework based on PyTorch and torchtext.
A Modified Version of LightSeq for Non-Autoregressive Transformer
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
A lightweight Chinese word segmentation system (Lightweight Chinese Segmentation)
Lightweight speaker anonymization [IEEE SLT2021]