Git Product home page Git Product logo

herd's Introduction

Herd

A group of Llamas.

Leverage Mixture of Expert (MoE) models for large language models (LLMs) and enhance their performance with advanced PEFT methods.

Overview

This project is part of an MSc thesis focused on exploring MoE models for LLMs. By leveraging PEFT methods like LoRA and QLoRA, it seeks to offer an effective solution to use a base model in combination with an extensive set of adapters and a method to identify the appropriate adapter.

The basic idea behind this is to come up with a solution that allows for using a base model + a large set of adapters + a method to find the right adapter.

Key Features

  • Prompt-Expert Mapping: Determines the right expert based on the input prompt's distance from expert centroids.
  • Combination of Adapters: Allows merging multiple adapters based on input prompt proximity to each expert's centroid.

Inspiration & Credits

The first iteration of this project is heavily based on airoboros/lmoe. Additionally, I have also included the option to combine multiple adapters according to the distance between the input prompt and the centroid for each expert (and it seems like @aicrum had a similar idea).

Experts & Segmentation

The experts are fine-tuned using QLoRA on the jondurbin/airoboros-2.1 dataset using the same segmentation as in the original project.

Expert Name Categories
qa quiz, multiple_choice, contextual, counterfactual_contextual
creative card, writing, experience, song, roleplay, gtkm, rp, detailed_writing, joke
code coding
reasoning cot, theory_of_mind, riddle, orca
function agent, plan
general wordgame, trivia, general

Fine-tuning Experts

To fine-tune:

python scripts/main.py finetune

Computing Expert-Prompt Distance

To compute the distance between the input prompt and each expert we:

  1. Sample a few instructions from each expert and compute the average embedding for each expert.
  2. Save the average embedding for each expert as a numpy array in experts/

When a new prompt is received, we:

  1. Load the average embedding for each expert.
  2. Use faiss to compute the distance between the input prompt and each expert.

API Documentation

Herd provides a simple REST API. It is based on OpenAI's API.

python scripts/app.py

The following options can be used:

  • --port (-p): The port to run the server on. Default: 8000
  • --host (-i): The host to run the server on. Default: 127.0.0.1
  • --config-file: The config file to use. Default: config_experts.ini
  • --only-base: Only use the base model. Default: False

Querying the Model

To query the model we can run:

curl -s -XPOST http://127.0.0.1:8000/v1/chat/completions -H 'content-type: application/json' -d '{
    "model": "herd",
    "messages": [
        {
          "role": "system",
          "content": "A chat."
        },
        {
          "role": "user",
          "content": "Lorem ipsum dolor sit amet"
        }
      ]
    }'

The following options can be passed:

  • model (str): The name of the model.
  • messages (List[Dict[str, str]]): The list of messages in the chat.
  • temperature (float, optional): The temperature for generating responses. Defaults to 0.5.
  • top_k (int, optional): The number of top-k tokens to consider. Defaults to 50.
  • top_p (float, optional): The cumulative probability for generating responses. Defaults to 1.0.
  • repetition_penalty (float, optional): The repetition penalty for generating responses. Defaults to 1.0.
  • stop (List[str], optional): The list of stop words. Defaults to DEFAULT_STOPS.
  • max_tokens (int, optional): The maximum number of tokens in the response. Defaults to None.
  • top_experts (int, optional): The number of top experts to consider. Defaults to 1.

herd's People

Contributors

alexrs avatar freakanth avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

freakanth

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.