
Fine-tuning Google's Gemma (7B) LLM on a synthetic dataset built from Kaggle documentation

Introduction

This exploratory project is the work for the group Context Crafters in the Georgia Tech Deep Learning graduate class (Spring 2024). Please visit this page for details of the CS 7643 class.

Disclaimer: The views and opinions expressed in this project are those of the authors and do not necessarily reflect the views or positions of Georgia Tech.

Project Overview

In February 2024, Google released Gemma, a family of lightweight open generative AI models designed primarily for developers and researchers. Using Gemma as the pre-trained model, we fine-tune this foundation model to improve its question-answering performance on content from Kaggle's documentation.
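As an illustration of the fine-tuning setup, a supervised training example from a Q&A dataset can be serialized using Gemma's user/model turn markers. This is a minimal sketch; the field names (`question`, `answer`) are assumed for illustration and are not necessarily the project's actual dataset schema.

```python
# Sketch: format a (question, answer) pair into Gemma's turn format for
# supervised fine-tuning. The <start_of_turn>/<end_of_turn> markers are
# Gemma's control tokens; the field names here are illustrative.

def format_example(question: str, answer: str) -> str:
    """Wrap a Q&A pair in Gemma's user/model turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{question}<end_of_turn>\n"
        "<start_of_turn>model\n"
        f"{answer}<end_of_turn>"
    )

example = format_example(
    "How do I submit a notebook to a Kaggle competition?",
    "Open the notebook editor and use the competition submission option.",
)
print(example)
```

During fine-tuning, each formatted string would be tokenized and fed to the model as a single sequence, with the loss typically computed only on the model-turn tokens.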

Motivation

The use of large language models (LLMs) for question-answering tasks has increased significantly in recent years, thanks to the impressive capabilities of models such as ChatGPT. However, foundation LLMs are not inherently adapted to domain-specific tasks and must be further trained on a representative domain-specific dataset. This work demonstrates one possible approach to fine-tuning open-source language models for a specific task.

The problem

Kaggle is an online community for data science and machine learning (ML) that serves as a learning platform for both novices and seasoned professionals, offering realistic practice problems to sharpen data science skills. Currently, navigating its extensive documentation can be daunting for many users, potentially limiting their ability to make the most of the platform. This project illustrates potential methods for developing an artificial intelligence (AI) assistant tailored specifically to Kaggle's documentation. Such a question-answering assistant could significantly improve the user experience and reduce the workload of platform staff by providing immediate responses to user inquiries.

The Challenge

The two primary challenges that small businesses and individuals face when fine-tuning a large language model are:

  • For effective fine-tuning, the training data must be of high quality, adequately large, and representative of the specific domain and task at hand.
  • Fine-tuning large language models incurs additional expenses related to training and maintaining the specialized model.

Fine-tuning can be a powerful technique for adapting LLMs to specific domains and tasks; we address these and other challenges in this project.

Implementation Summary

How to use

Please refer to

Authors

  • James Mungai
  • Kevin Kori
  • Krittaprot Tangkittikun

License

Please refer to the terms of use, licenses, copyrights, and proprietary notices of each of the tools and datasets above.
