Topic: quantization
Something interesting about quantization
quantization,micronet: a model compression and deployment library. Compression: 1) quantization: quantization-aware training (QAT), covering high-bit (>2b) methods (DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b)/ternary and binary methods (TWN/BNN/XNOR-Net), plus post-training quantization (PTQ) at 8 bits (TensorRT); 2) pruning: normal, regular, and group-convolution channel pruning; 3) group convolution structures; 4) batch-normalization fusion for quantization. Deployment: TensorRT with fp32/fp16/int8 (PTQ calibration), op adaptation (upsample), and dynamic shapes.
User: 666dzy666
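The QAT schemes listed in the micronet entry above share one core trick: fake quantization, where values are quantized and immediately dequantized so training sees the rounding error while staying in floating point. A minimal sketch in plain Python (an illustration under assumed symmetric per-tensor int8, not micronet's API):

```python
# Minimal sketch of symmetric uniform "fake quantization" as used in
# quantization-aware training: values are rounded onto an int8 grid and
# immediately mapped back to floats, so the rounding error is visible.

def fake_quantize(x, num_bits=8):
    """Quantize-dequantize a list of floats on a symmetric integer grid."""
    qmax = 2 ** (num_bits - 1) - 1                # 127 for 8 bits
    scale = max(abs(v) for v in x) / qmax or 1.0  # avoid zero scale
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in x]
    return [qi * scale for qi in q]

print(fake_quantize([0.1, -0.5, 1.0]))
```

Each output differs from its input by at most half a quantization step, which is exactly the error QAT teaches the network to tolerate.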
quantization,Base pretrained models and datasets in pytorch (MNIST, SVHN, CIFAR10, CIFAR100, STL10, AlexNet, VGG16, VGG19, ResNet, Inception, SqueezeNet)
User: aaron-xichen
quantization,An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.
Organization: autogptq
quantization,Embedded and mobile deep learning research resources
User: csarron
quantization,PyTorch Project Specification.
Organization: deepvac
quantization,Run Mixtral-8x7B models in Colab or consumer desktops
User: dvmazur
quantization,QKeras: a quantization deep learning library for Tensorflow Keras
Organization: google
quantization,A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
User: guan-yuan
quantization,Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
User: hiyouga
Home Page: https://arxiv.org/abs/2403.13372
quantization,A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are continuously improving the project. Welcome to PR the works (papers, repositories) that are missed by the repo.
User: htqin
quantization,Efficient computing methods developed by Huawei Noah's Ark Lab
Organization: huawei-noah
quantization,Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Organization: huawei-noah
quantization,🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
Organization: huggingface
Home Page: https://huggingface.co/docs/optimum/main/
quantization,A pytorch quantization backend for optimum
Organization: huggingface
quantization,Palette quantization library that powers pngquant and other PNG optimizers
Organization: imageoptim
Home Page: https://pngquant.org/lib
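Palette quantization, as in the libimagequant entry above, maps each pixel to the nearest entry of a small color palette. A toy sketch of the idea (libimagequant itself uses a median-cut variant with dithering; the fixed 2-bit-per-channel palette here is an assumption purely for illustration):

```python
# Toy palette quantization: truncate each RGB channel to `bits` bits and
# snap to the center of the resulting bucket, giving a fixed palette of
# (2**bits)**3 colors (64 colors for bits=2).

def quantize_pixel(rgb, bits=2):
    step = 256 // (1 << bits)                     # bucket width, 64 for 2 bits
    return tuple((c // step) * step + step // 2 for c in rgb)

def quantize_image(pixels, bits=2):
    """Quantize a flat list of (r, g, b) tuples to the fixed palette."""
    return [quantize_pixel(p, bits) for p in pixels]
```

Real palette quantizers instead build the palette adaptively from the image's own color distribution, which is why pngquant's output looks far better than this uniform grid.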
quantization,A Python package that extends official PyTorch to deliver improved performance on Intel platforms
Organization: intel
quantization,SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Organization: intel
Home Page: https://intel.github.io/neural-compressor/
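Post-training INT8 quantization tools like the one above calibrate a scale and zero point from observed tensor ranges. A hedged sketch of the standard asymmetric (affine) uint8 scheme, using plain min/max calibration where real tools refine the range with histograms or percentile clipping:

```python
# Asymmetric (affine) quantization: real value x maps to
# q = round(x / scale) + zero_point, clamped to the uint8 range.

def affine_quant_params(xmin, xmax, num_bits=8):
    """Derive scale and zero point from a calibrated [xmin, xmax] range."""
    qmin, qmax = 0, 2 ** num_bits - 1             # uint8: [0, 255]
    xmin, xmax = min(xmin, 0.0), max(xmax, 0.0)   # grid must contain zero
    scale = (xmax - xmin) / (qmax - qmin) or 1.0
    zero_point = round(qmin - xmin / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, num_bits=8):
    qmax = 2 ** num_bits - 1
    return [max(0, min(qmax, round(v / scale) + zero_point)) for v in x]
```

Forcing zero into the representable range matters in practice: zero padding and ReLU outputs must quantize exactly, or error accumulates through the network.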
quantization,Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Organization: intellabs
quantization,A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
Organization: intellabs
Home Page: https://intellabs.github.io/nlp-architect
quantization,FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups at medium batch sizes of up to 16-32 tokens.
Organization: ist-daslab
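Kernels like the one above depend on weights being stored two 4-bit values per byte. A sketch of that packing in plain Python (the low-nibble-first layout here is an assumption for illustration, not the kernel's actual tile format):

```python
# Pack pairs of unsigned 4-bit values into bytes (low nibble first),
# halving storage relative to int8; the matching unpack recovers them.

def pack_int4(vals):
    assert len(vals) % 2 == 0 and all(0 <= v < 16 for v in vals)
    return bytes(vals[i] | (vals[i + 1] << 4) for i in range(0, len(vals), 2))

def unpack_int4(packed):
    out = []
    for b in packed:
        out.extend((b & 0x0F, b >> 4))
    return out
```

The GPU kernel's job is then to unpack these nibbles and multiply against fp16 activations without the unpacking cost erasing the bandwidth savings.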
quantization,⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
User: ki6an
quantization,Lossy PNG compressor — pngquant command based on libimagequant library
User: kornelski
Home Page: https://pngquant.org
quantization,Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
User: maknee
quantization,TinyChatEngine: On-Device LLM Inference Library
Organization: mit-han-lab
Home Page: https://mit-han-lab.github.io/TinyChatEngine/
quantization,[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 256KB Memory
Organization: mit-han-lab
Home Page: https://mcunet.mit.edu
quantization,Official implementation of Half-Quadratic Quantization (HQQ)
Organization: mobiusml
Home Page: https://mobiusml.github.io/hqq_blog/
quantization,Tool for onnx->keras or onnx->tflite. Hope this tool can help you.
User: mpolaris
quantization,Sparsity-aware deep learning inference runtime for CPUs
Organization: neuralmagic
Home Page: https://neuralmagic.com/deepsparse/
quantization,OpenMMLab Model Compression Toolbox and Benchmark.
Organization: open-mmlab
Home Page: https://mmrazor.readthedocs.io/en/latest/
quantization,[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
Organization: opengvlab
quantization,Fast inference engine for Transformer models
Organization: opennmt
Home Page: https://opennmt.net/CTranslate2
quantization,PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Organization: openppl
quantization,Neural Network Compression Framework for enhanced OpenVINO™ inference
Organization: openvinotoolkit
quantization,Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™
Organization: openvinotoolkit
Home Page: https://openvinotoolkit.github.io/training_extensions/
quantization,PaddleSlim is an open-source library for deep model compression and architecture search.
Organization: paddlepaddle
Home Page: https://paddleslim.readthedocs.io/zh_CN/latest/
quantization,Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.
User: pinto0309
quantization,AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Organization: quic
Home Page: https://quic.github.io/aimet-pages/index.html
quantization,Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
User: rahulschand
Home Page: https://rahulschand.github.io/gpu_poor/
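The weight-memory part of such a calculator is simple arithmetic: parameter count times bytes per parameter. A back-of-the-envelope sketch (KV cache and activation memory, which the real tool also estimates, are omitted here):

```python
# Weight memory for an LLM at common precisions; 4-bit quantization
# stores roughly half a byte per parameter.

BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(num_params, dtype):
    """Weights-only memory footprint in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024 ** 3

# A 7B-parameter model: ~13 GiB in fp16, ~3.3 GiB in int4.
print(round(weight_memory_gb(7e9, "fp16"), 1))
```

This is why int4 quantization is what makes 7B-class models fit on consumer GPUs with 4-6 GB of VRAM.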
quantization,INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Organization: rwkv
quantization,A Pytorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
Organization: sforaidl
Home Page: https://kd-lib.readthedocs.io/
quantization,[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Organization: squeezeailab
Home Page: https://arxiv.org/abs/2306.07629
quantization,Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6
Organization: stochasticai
Home Page: https://xturing.stochastic.ai
quantization,Faster Whisper transcription with CTranslate2
Organization: systran
quantization,A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Organization: tensorflow
Home Page: https://www.tensorflow.org/model_optimization
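Alongside quantization, the toolkit above supports pruning. A toy sketch of the magnitude-pruning idea it builds on (a plain-Python illustration, not the tfmot API): zero out the fraction of weights with the smallest absolute value.

```python
# Magnitude pruning: the smallest-magnitude weights contribute least to
# the output, so zeroing them yields a sparse, compressible tensor.

def magnitude_prune(weights, sparsity):
    """Zero the `sparsity` fraction of smallest-magnitude weights."""
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else float("-inf")
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

Toolkits apply this gradually over training steps so the network can recover accuracy as sparsity ramps up.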
quantization,An Open-Source Package for Deep Learning to Hash (DeepHash)
Organization: thulab
quantization,[🔥updating ...] AI-powered automatic quantitative trading bot (fully local deployment); an AI-powered quantitative investment research platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ qbot-mini: https://github.com/Charmve/iQuant
Organization: ufund-me
Home Page: https://github.com/Charmve
quantization,Brevitas: neural network quantization in PyTorch
Organization: xilinx
Home Page: https://xilinx.github.io/brevitas/
quantization,Dataflow compiler for QNN inference on FPGAs
Organization: xilinx
Home Page: https://xilinx.github.io/finn
quantization,Chinese LLaMA & Alpaca large language models with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
User: ymcui
Home Page: https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki