Git Product home page Git Product logo

ggml's Introduction

ggml

Tensor library for machine learning

Note that this project is under active.
Some of the development is currently happening in the llama.cpp and whisper.cpp repos

Features

  • Written in C
  • 16-bit float support
  • Integer quantization support (4-bit, 5-bit, 8-bit, etc.)
  • Automatic differentiation
  • ADAM and L-BFGS optimizers
  • Optimized for Apple Silicon
  • On x86 architectures utilizes AVX / AVX2 intrinsics
  • No third-party dependencies
  • Zero memory allocations during runtime

Roadmap

Whisper inference (example)

With ggml you can efficiently run Whisper inference on the CPU.

Memory requirements:

Model Disk Mem
tiny 75 MB ~280 MB
base 142 MB ~430 MB
small 466 MB ~1.0 GB
medium 1.5 GB ~2.6 GB
large 2.9 GB ~4.7 GB

GPT inference (example)

With ggml you can efficiently run GPT-2 and GPT-J inference on the CPU.

Here is how to run the example programs:

# Build ggml + examples
git clone https://github.com/ggerganov/ggml
cd ggml
mkdir build && cd build
cmake ..
make -j4 gpt-2 gpt-j

# Run the GPT-2 small 117M model
../examples/gpt-2/download-ggml-model.sh 117M
./bin/gpt-2 -m models/gpt-2-117M/ggml-model.bin -p "This is an example"

# Run the GPT-J 6B model (requires 12GB disk space and 16GB CPU RAM)
../examples/gpt-j/download-ggml-model.sh 6B
./bin/gpt-j -m models/gpt-j-6B/ggml-model.bin -p "This is an example"

# Run the Cerebras-GPT 111M model
# Download from: https://huggingface.co/cerebras
python3 ../examples/gpt-2/convert-cerebras-to-ggml.py /path/to/Cerebras-GPT-111M/
./bin/gpt-2 -m /path/to/Cerebras-GPT-111M/ggml-model-f16.bin -p "This is an example"

The inference speeds that I get for the different models on my 32GB MacBook M1 Pro are as follows:

Model Size Time / Token
GPT-2 117M 5 ms
GPT-2 345M 12 ms
GPT-2 774M 23 ms
GPT-2 1558M 42 ms
--- --- ---
GPT-J 6B 125 ms

For more information, checkout the corresponding programs in the examples folder.

Using cuBLAS

# fix the path to point to your CUDA compiler
cmake -DGGML_CUBLAS=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.1/bin/nvcc ..

Using clBLAST

cmake -DGGML_CLBLAST=ON ..

Resources

ggml's People

Contributors

ggerganov avatar rgerganov avatar marella avatar danforbes avatar lostruins avatar mverrilli avatar nouamanetazi avatar jaeminson avatar katsu560 avatar klosax avatar kallisti5 avatar abetlen avatar asukaminato0721 avatar aristotaloss avatar ocordeiro avatar eyusupov avatar mudler avatar salahiguiliz avatar koogle avatar lukasmoellerch avatar maihd avatar nevinpuri avatar cromwellian avatar skeskinen avatar djinn avatar takuya-takeuchi avatar tanmaysachan avatar apcameron avatar appvoid avatar hidenorly avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.