flywangsir Goto Github PK

followers: 1 following: 27 repos: 33 gists: 0

Type: User

flywangsir's Projects

awesome-speech-enhancement

speech enhancement, speech separation, sound source localization

Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also supports StyleGAN2 and DFDNet.

bert-vits2

vits2 backbone with multilingual-bert

chatglm2-voice-cloning

Chat with any character you like: ChatGLM2+SadTalker+Voice Cloning | Immersive conversations with your favorite characters: ChatGLM2 + voice cloning + video chat

comfyui11

The most powerful and modular stable diffusion GUI with a graph/nodes interface.

emotivoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

face-alignment

2D and 3D face alignment library built using PyTorch

face-parsing.pytorch

Using modified BiSeNet for face parsing in PyTorch

facefusion

Next generation face swapper and enhancer

faceverse

FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset (CVPR2022)

geneface

GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code

hdtf

the dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset"

livespeechportraits

Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)

mvsep-mdx23-colab_v2

Colab adaptation of MVSep Model for MDX23 music separation contest

myheygen

natspeech

A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

resshift

ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting (NeurIPS 2023 Spotlight)

sadtalker-video-lip-sync

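This project implements Wav2lip-style video lip synthesis on top of SadTalker. It takes a video file as input and generates speech-driven lip shapes, then applies a configurable enhancement to the facial region to sharpen the synthesized lip (face) area and improve the clarity of the generated lips. It also uses DAIN, a deep-learning frame-interpolation algorithm, to add frames to the generated video and fill in the motion transitions between synthesized lip shapes, so that the result looks smoother, more realistic, and more natural.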

sherpa-onnx

Speech-to-text and text-to-speech using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift

so-vits-svc

SoftVC VITS Singing Voice Conversion

some

SOME: Singing-Oriented MIDI Extractor.

taisu

TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)
