yzs-lab

repos: 70 gists: 0

Type: Organization

Location: China

Blog: https://yezhisheng.me

yzs-lab's Projects

llm-analysis

Latency and Memory Analysis of Transformer Models for Training and Inference

llm-viewer

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

lmflow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Model for All.

localstack

💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline!

lookaheaddecoding

lorax

Serve 100s of Fine-Tuned LLMs in Production for the Cost of 1

magic_castle

Terraform modules to replicate the HPC user experience in the cloud

megatron-deepspeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

minimalloc

A lightweight memory allocator for hardware-accelerated machine learning

mkcert

A simple zero-config tool to make locally trusted development certificates with any names you'd like.

nvidia-gpu-mem-monitor

GPU memory usage monitoring

nvshare

Transparent GPU Sharing Without Memory Size Constraints

open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

open-simulator

K8s cluster simulator for workload scheduling.

openrlhf

A Ray-based High-performance RLHF framework (for 7B on RTX4090 and 34B on A100)

paleo

An analytical performance modeling tool for deep neural networks.

pytorch-opcounter

Count the MACs / FLOPs of your PyTorch model.

pytorch-summary

Model summary in PyTorch similar to `model.summary()` in Keras

pytorch_memlab

Profiling and inspecting memory in pytorch

scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

spotinstancescore

subclass_zoo

system-design

Learn how to design systems at scale and prepare for system design interviews

tensorpipe

A tensor-aware point-to-point communication primitive for machine learning

torchdistx

Torch Distributed Experimental

torchgpipe

A GPipe implementation in PyTorch

torchinfo

View model summaries in PyTorch!

training_on_a_dime

Scripts and logs for "Analysis and Exploitation of Dynamic Pricing in the Public Cloud for ML Training", which is to appear at DISPA 2020

unsloth

2-5X faster, 70% less memory QLoRA & LoRA finetuning

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
