MosaicML Examples

This repo contains reference examples for training ML models quickly and to high accuracy. It's designed to be easily forked and modified.

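It currently features examples for ResNet-50 + ImageNet, DeepLabV3 + ADE20k, Large Language Models (LLMs), and BERT, each described under Examples below.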

Installation

To get started, either clone or fork this repo and install whichever example[s] you're interested in. E.g., to get started training GPT-style Large Language Models, just:

git clone https://github.com/mosaicml/examples.git
cd examples # cd into the repo
pip install -e ".[llm]"  # or pip install -e ".[llm-cpu]" if no NVIDIA GPU
cd examples/llm # cd into the specific example's folder

Available examples include llm, stable-diffusion, resnet-imagenet, resnet-cifar, bert, deeplab, nemo, gpt-neox, and trlx.

Extending an example

Each example provides a short main.py that constructs a Trainer object and all of the arguments to it. There are three easy ways to extend an example:

  1. Change the configuration. Each example has a yamls subdirectory, and each yaml file therein contains settings like the learning rate, where to load data from, and more. These settings are read within main.py and used to configure the Trainer.
  2. Modify the arguments to Trainer directly. For example, if you want to use a different optimizer, the simplest way is to construct this optimizer in main.py and pass it as the optimizer argument to the Trainer.
  3. Write an Algorithm and add it to the Trainer's algorithms argument. This lets you inject arbitrary code almost anywhere in your training loop and modify training as it happens. Composer includes a big list of Algorithms whose source code you can use as examples.
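As a rough illustration of option 3, the sketch below mimics the two-method shape of an algorithm in plain Python: `match` decides whether to run for a given event, and `apply` modifies the training state. Composer's own `Algorithm` base class follows this pattern (its `apply` also receives a logger), but everything here is a hypothetical stand-in so the snippet runs without Composer installed: the `Event` members, the `State` fields, `HalveLROnEpochEnd`, and the toy `train` loop are all assumptions made for illustration, not code from this repo.

```python
from dataclasses import dataclass, field
from enum import Enum


class Event(Enum):
    # Hypothetical subset of training-loop events, for illustration only.
    BATCH_START = "batch_start"
    BATCH_END = "batch_end"
    EPOCH_END = "epoch_end"


@dataclass
class State:
    # Hypothetical stand-in for the trainer's mutable state.
    epoch: int = 0
    learning_rate: float = 0.1
    log: list = field(default_factory=list)


class HalveLROnEpochEnd:
    """Toy algorithm: halve the learning rate at the end of every epoch."""

    def match(self, event, state):
        # Decide whether this algorithm should run for the given event.
        return event == Event.EPOCH_END

    def apply(self, event, state):
        # Modify the training state in place and record what happened.
        state.learning_rate /= 2
        state.log.append((state.epoch, state.learning_rate))


def run_event(event, state, algorithms):
    # At each point in the loop, offer the event to every algorithm.
    for algorithm in algorithms:
        if algorithm.match(event, state):
            algorithm.apply(event, state)


def train(num_epochs, batches_per_epoch, algorithms):
    # Toy training loop that only fires events; no model is trained.
    state = State()
    for epoch in range(num_epochs):
        state.epoch = epoch
        for _ in range(batches_per_epoch):
            run_event(Event.BATCH_START, state, algorithms)
            run_event(Event.BATCH_END, state, algorithms)
        run_event(Event.EPOCH_END, state, algorithms)
    return state


final_state = train(num_epochs=3, batches_per_epoch=2,
                    algorithms=[HalveLROnEpochEnd()])
print(final_state.learning_rate)
```

In a real example you would subclass Composer's `Algorithm` instead and pass an instance in the Trainer's algorithms argument; the stand-in loop above only shows why the match/apply split lets code run at chosen points without editing the loop itself.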

We also provide some convenient string-to-object mappings in common/builders.py. These let you avoid cluttering main.py with code like:

if cfg.optimizer == 'adam':
    ...
elif cfg.optimizer == 'sgd':
    ...

and instead write:

opt = builders.build_optimizer(cfg.optimizer, model)

with all the if-elif logic wrapped in this reusable function. You can easily extend these functions to include new options; e.g., to add an optimizer, you could extend build_optimizer to import and construct your optimizer.
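To make the extension step concrete, here is a minimal sketch of what such a builder could look like. It is not the actual contents of common/builders.py: the `Adam`, `SGD`, and `MyOptimizer` classes and the `ToyModel` are hypothetical stand-ins (a real builder would import optimizer classes from a library such as PyTorch), and the default learning rates are arbitrary.

```python
class Adam:
    """Hypothetical stand-in for a real Adam optimizer class."""

    def __init__(self, params, lr=1e-3):
        self.params, self.lr = list(params), lr


class SGD:
    """Hypothetical stand-in for a real SGD optimizer class."""

    def __init__(self, params, lr=0.1):
        self.params, self.lr = list(params), lr


class MyOptimizer:
    """Hypothetical new optimizer added by someone extending the example."""

    def __init__(self, params, lr=0.01):
        self.params, self.lr = list(params), lr


class ToyModel:
    """Hypothetical model exposing parameters() the way a PyTorch module does."""

    def parameters(self):
        return [0.0, 1.0]


def build_optimizer(name, model):
    # All of the string-to-object branching lives here, not in main.py.
    if name == 'adam':
        return Adam(model.parameters())
    elif name == 'sgd':
        return SGD(model.parameters())
    elif name == 'my_optimizer':
        # The new branch: the only change needed to support another option.
        return MyOptimizer(model.parameters())
    else:
        raise ValueError(f'Unknown optimizer: {name!r}')


opt = build_optimizer('my_optimizer', ToyModel())
print(type(opt).__name__)
```

Raising on an unknown name is a design choice of this sketch; it surfaces a typo in the config immediately instead of silently falling back to a default.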

If you run into any issues extending the code, or just want to discuss an idea you have, please open an issue or join our community Slack!

Tests and Linting

If you already have the dependencies for a given example installed, you can just run:

pre-commit run --all-files  # autoformatting for whole repo
cd examples/llm  # or bert, resnet_imagenet, etc
pyright .  # type checking
pytest tests/  # run tests

from the example's directory.

To run the full suite of tests for all examples, invoke make test in the project's root directory. Similarly, invoke make lint to autoformat your code and detect type issues throughout the whole codebase. This is much slower than linting or testing just one example because it installs all the dependencies for each example from scratch in a fresh virtual environment.

Overriding Arguments

These examples use OmegaConf to manage configuration. OmegaConf allows us to make configuration explicit via separate YAML files while also allowing rapid experimentation and easy command line override. There's no special language or format for these configurations; they're just a convenient way of writing out a dictionary that gets used in each main.py file.

Here's a simple example. Let's say you have this YAML config file:

a: 1
nested:
    foo: bar

and you run:

python script.py b=baz nested.foo=different

The script will end up with this configuration:

{'a': 1, 'b': 'baz', 'nested': {'foo': 'different'}}
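The merge behavior in this example can be sketched in plain Python. This is not OmegaConf's implementation, only an illustration of the semantics: each `key=value` override either adds a new key or replaces an existing one, and dots in the key walk into nested dictionaries. The helper name `apply_overrides` is hypothetical, and unlike OmegaConf this sketch keeps every override value as a string rather than parsing it as YAML. With OmegaConf itself, the usual approach is to combine `OmegaConf.load`, `OmegaConf.from_cli`, and `OmegaConf.merge`.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with 'dotted.key=value' overrides applied."""
    merged = copy.deepcopy(config)  # leave the original config untouched
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = merged
        for key in parents:
            # Walk into nested dictionaries, creating them if missing.
            node = node.setdefault(key, {})
        node[leaf] = value  # values stay strings in this sketch
    return merged


# The YAML file from the example above, written as a dictionary.
base = {"a": 1, "nested": {"foo": "bar"}}

# Equivalent of the command-line arguments b=baz nested.foo=different.
result = apply_overrides(base, ["b=baz", "nested.foo=different"])
print(result)
```

The resulting dictionary has the same contents as the one shown above, although the key order may differ.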

Examples

This repo features the following examples, each in its own subdirectory:

ResNet-50 + ImageNet

Figure 1: Comparison of MosaicML recipes against other results, all measured on 8x A100s on the MosaicML platform.

Train the MosaicML ResNet, which is currently the fastest ResNet50 implementation there is and yields a ✨ 7x ✨ faster time-to-train than a strong baseline.

🚀 Get started with the code here.

DeepLabV3 + ADE20k

Train the MosaicML DeepLabV3, which yields a ✨ 5x ✨ faster time-to-train than a strong baseline.

🚀 Get started with the code here.

Large Language Models (LLMs)

Training curves for various LLM sizes.

A simple yet feature-complete implementation of GPT that scales to 70B parameters while maintaining high performance on GPU clusters. The code is flexible, written in vanilla PyTorch, and uses PyTorch FSDP along with some recent efficiency improvements.

🚀 Get started with the code here.

BERT

This benchmark covers both pre-training and fine-tuning a BERT model. With this starter code, you'll be able to do Masked Language Modeling (MLM) pre-training on the C4 dataset and classification fine-tuning on GLUE benchmark tasks.

We also provide the source code and recipe behind our Mosaic BERT model, which you can train yourself using this repo.

🚀 Get started with the code here.
