flywangsir Goto Github PK
Type: User
Type: User
huggingface mirror download
speech enhancement\speech seperation\sound source localization
Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also support StyleGAN2, DFDNet.
vits2 backbone with multilingual-bert
Chat with any character you like: ChatGLM2+SadTalker+Voice Cloning | 和喜欢的角色沉浸式对话吧:ChatGLM2+声音克隆+视频对话
The most powerful and modular stable diffusion GUI with a graph/nodes interface.
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
:fire: 2D and 3D Face alignment library build using pytorch
Using modified BiSeNet for face parsing in PyTorch
Next generation face swapper and enhancer
FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset (CVPR2022)
GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
the dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset"
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)
Colab adaptation of MVSep Model for MDX23 music separation contest
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting (NeurIPS 2023 Spotlight)
本项目基于SadTalkers实现视频唇形合成的Wav2lip。通过以视频文件方式进行语音驱动生成唇形,设置面部区域可配置的增强方式进行合成唇形(人脸)区域画面增强,提高生成唇形的清晰度。使用DAIN 插帧的DL算法对生成视频进行补帧,补充帧间合成唇形的动作过渡,使合成的唇形更为流畅、真实以及自然。
Speech-to-text and text-to-speech using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift
SoftVC VITS Singing Voice Conversion
SOME: Singing-Oriented MIDI Extractor.
TaiSu(太素)--a large-scale Chinese multimodal dataset(亿级大规模中文视觉语言预训练数据集)
VITS2 for Chinese speech | 最新VITS2中文语音合成
Extension of Wav2Lip repository for processing high-quality videos.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.