bloom-jax-inference's Introduction

BLOOM 🌸 Inference in JAX

Structure

CPU Host: as defined in TPU manager

TPU Host: as defined in Host worker

ray: distributes load from CPU host -> TPU hosts

Example usage: run.py

Setting Up a TPU-Manager

The TPU hosts are managed by a single TPU manager. This TPU manager takes the form of a single CPU device.

First, create a CPU VM in the same region as that of the TPU pod. This is important to enable the TPU manager to communicate with the TPU hosts. A suitable device config is as follows:

  1. Region & Zone: TO MATCH TPU ZONE
  2. Machine type: c2-standard-8
  3. CPU platform: Intel Cascade Lake
  4. Boot disk: 256GB balanced persistent disk
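The configuration above might be expressed as a single `gcloud compute instances create` call. The following is a minimal sketch that only assembles the argument list and prints it; it does not create anything. The helper name `build_create_vm_command` and the instance name `tpu-manager` are hypothetical placeholders, and the zone is borrowed from the SSH example later in this guide, so substitute the zone of your own TPU pod.

```python
import shlex


def build_create_vm_command(name, zone):
    """Assemble (but do not run) a gcloud command for the CPU VM config above.

    `name` and `zone` are supplied by the caller; the remaining values mirror
    the suggested device config (machine type, CPU platform, boot disk).
    """
    return [
        "gcloud", "compute", "instances", "create", name,
        "--zone", zone,                              # must match the TPU zone
        "--machine-type", "c2-standard-8",
        "--min-cpu-platform", "Intel Cascade Lake",
        "--boot-disk-size", "256GB",
        "--boot-disk-type", "pd-balanced",           # balanced persistent disk
    ]


# Hypothetical instance name; zone assumed to be that of the TPU pod.
cmd = build_create_vm_command("tpu-manager", "europe-west4-a")
print(shlex.join(cmd))
```

Printing the command first makes it easy to review the zone and disk settings before pasting it into a terminal.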

SSH into the CPU VM and set up a Python environment with the same Python version as the TPUs. The default TPU Python version is 3.8.10, so make sure the Python version on the CPU VM matches it.

python3.8 -m venv /path/to/venv

If the above does not work, run the following and then repeat:

sudo apt-get update
sudo apt-get install python3-venv

Activate Python env:

source /path/to/venv/bin/activate

Check Python version is 3.8.10:

python --version
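The same check can be scripted from inside the activated environment. This is a small sketch, assuming 3.8.10 is still the version on your TPU hosts; the helper name `version_matches` is illustrative and not part of the repository.

```python
import sys

# Default TPU VM Python version stated in this guide (adjust if yours differs).
EXPECTED = (3, 8, 10)


def version_matches(actual, expected=EXPECTED):
    """Return True if the (major, minor, micro) components are identical."""
    return tuple(actual[:3]) == tuple(expected)


current = sys.version_info[:3]
if version_matches(current):
    print("Python version matches the TPU hosts")
else:
    print("Mismatch: running %d.%d.%d, expected %d.%d.%d" % (current + EXPECTED))
```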

Clone the repository and install requirements:

git clone https://github.com/huggingface/bloom-jax-inference.git
cd bloom-jax-inference
pip install -r requirements.txt

Authenticate gcloud. This requires copying and pasting a command into a terminal window on a machine with a browser installed:

gcloud auth login

Now SSH into one of the workers. This will generate an SSH key:

gcloud alpha compute tpus tpu-vm ssh patrick-tpu-v3-32 --zone europe-west4-a --worker 0

Log out of the TPU worker:

logout

You should now be back in the CPU host.
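If you need to repeat the SSH step for a different worker index, the command can be assembled programmatically. The sketch below only builds and prints the command string; the helper name `build_tpu_ssh_command` is hypothetical, and the TPU name and zone are the ones from the example above.

```python
import shlex


def build_tpu_ssh_command(tpu_name, zone, worker=0):
    """Assemble (but do not run) the gcloud SSH command for one TPU worker."""
    return [
        "gcloud", "alpha", "compute", "tpus", "tpu-vm", "ssh", tpu_name,
        "--zone", zone,
        "--worker", str(worker),
    ]


# Values taken from the example above; replace with your own TPU name and zone.
print(shlex.join(build_tpu_ssh_command("patrick-tpu-v3-32", "europe-west4-a", worker=0)))
# prints: gcloud alpha compute tpus tpu-vm ssh patrick-tpu-v3-32 --zone europe-west4-a --worker 0
```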

bloom-jax-inference's People

Contributors

patil-suraj, patrickvonplaten, sanchit-gandhi

bloom-jax-inference's Issues

Can this run on NVIDIA GPUs?

As far as I know, JAX is designed for Google TPUs, but it can also run on NVIDIA GPUs.
I wonder whether JAX would be faster or slower than PyTorch on GPU for BLOOM LLM inference.

TPU v4 version

Perhaps I overlooked it, but I couldn't find which TPU v4 image you are using for the v4-64 pod. It would be a big waste of resources to set up the wrong version ;)

I'm guessing either v2-alpha-tpuv4 or tpu-vm-v4-base? Would love to know for sure.

Awesome effort again from HuggingFace and BigScience. Truly amazing.
