Zhikang Niu's Projects
AcademiCodec: An Open Source Audio Codec Model for Academic Research
:hammer: Handy research tools for AI
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
:fire: ASR tutorial: https://dataxujing.github.io/ASR-paper/
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
AudioLDM training, finetuning, evaluation and inference.
A curated list of Artificial Intelligence Top Tools
📚 A collection of resources and papers on the Vector Quantized Variational Autoencoder (VQ-VAE) and its applications
An implementation of the BERT model and its related downstream tasks based on the PyTorch framework
Official PyTorch implementation of BigVGAN (ICLR 2023)
The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)
VQVAEs, GumbelSoftmaxes and friends
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
A VAE-GAN modified from Descript Audio Codec that replaces the RVQ with a VAE
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
A seed project for distributed PyTorch training, built to help you customize your network quickly
Unofficial implementation of High Fidelity Neural Audio Compression
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Locally Hierarchical Auto-Regressive Modeling for Image Generation (HQ-Transformer)