llama-langchain-rag's Introduction

Llama Langchain RAG Project

  • Course: CSCI-GA.2565
  • Institution: New York University
  • Term: Spring 2024

Overview

The Llama Langchain RAG project is a just-for-fun application for fans of the sitcom Friends. It combines Retrieval-Augmented Generation (RAG) with a large language model (LLM): LLaMA 2, fine-tuned with LoRA on Replicate, uses retrieved context to answer complex questions about the show's content, plot, and characters. The app is deployed with Streamlit, keeps a per-session chat history, and lets you choose among several LLaMA 2 API endpoints hosted on Replicate.
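To make the retrieval-augmented flow concrete, here is a minimal sketch in plain Python. It is not the project's code: the word-overlap scoring is a toy stand-in for the embedding search a vector database performs, and the names `retrieve` and `build_prompt`, the prompt wording, and the placeholder passages are all assumptions for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages sharing the most words with the query.

    Toy stand-in for embedding similarity search in a vector database.
    """
    query_words = tokenize(query)
    ranked = sorted(
        passages,
        key=lambda p: len(query_words & tokenize(p)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the context."""
    joined = "\n---\n".join(context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}\n"
    )


# Hypothetical placeholder passages; the real project stores its own data.
PASSAGES = [
    "Placeholder passage about the coffee shop where the group meets.",
    "Placeholder passage about the apartment with the purple door.",
    "Placeholder passage about a pet monkey.",
]

if __name__ == "__main__":
    question = "Where does the group meet for coffee?"
    print(build_prompt(question, retrieve(question, PASSAGES, k=1)))
```

In a real pipeline the assembled prompt would then be sent to the selected model; that call is left out here because it depends on the hosting provider's client.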

Try our app: friends-rag.streamlit.app/

Sample queries you can use: evaluation.txt

Note on Model Initialization: After a period of inactivity, the first prediction request to a fine-tuned model such as "Finetuned LLaMA2" or "Finetuned LLaMA2 with RAG" takes longer (expect 3 to 5 minutes) because of a "cold boot," during which the model must be fetched and loaded. Subsequent requests respond much more quickly. See Replicate's documentation on cold boots for details.
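One way to observe the cold-boot effect is to time consecutive requests. The sketch below is a generic timing wrapper and not part of the project; `make_fake_predict` is a hypothetical stand-in for a real model call that merely simulates a slow first request with a short sleep.

```python
import time
from typing import Any, Callable


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> tuple[Any, float]:
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def make_fake_predict(cold_delay: float = 0.2) -> Callable[[str], str]:
    """Build a hypothetical model call that is slow only on its first use."""
    state = {"warm": False}

    def predict(prompt: str) -> str:
        if not state["warm"]:
            time.sleep(cold_delay)  # simulated cold boot; real ones take minutes
            state["warm"] = True
        return f"echo: {prompt}"

    return predict


if __name__ == "__main__":
    predict = make_fake_predict()
    for attempt in (1, 2):
        _, seconds = timed_call(predict, "hello")
        print(f"request {attempt}: {seconds:.2f}s")
```

Swapping `predict` for a real client call would show the same pattern at a much larger scale: a slow first request followed by faster ones.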

Note: This is the production version of the application and is optimized for deployment. Running it locally may require modifications to suit the development environment.

Getting Started

Prerequisites

  • Relevant API key(s) (optional; e.g. for the embedding model)
  • Python 3.11 or higher
  • Git Large File Storage (LFS) for handling large datasets and model files

Installation

  1. Install dependencies.

    • [Optional but recommended]
      • Create a Python virtual environment with
           python -m venv .venv
        
      • Activate it with
           source .venv/bin/activate
        
    • Install dependencies with
         pip install -r requirements.txt
      
  2. Create the Chroma DB:
       python populate_database.py

  3. Set up inference:

    • Case 1: To run the base Llama 2 model locally, install Ollama and run ollama serve in a separate terminal.

    • Case 2: To run inference locally against our models on Replicate, set REPLICATE_API_TOKEN as an environment variable.

    • Case 3: To skip local setup, try our deployed project on Streamlit: friends-rag.streamlit.app.

  4. Test a query against the Chroma DB. The command below returns an answer based on RAG and the selected model:
       python query_data.py "Which role does Adam Goldberg play?"

  5. Start the app locally:
       streamlit run app.py
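A quick preflight check can catch common problems with the setup steps above. The script below is a sketch rather than part of the repository: it assumes the vector store lives in a `chroma` directory under the current working directory, and the function name `preflight` is hypothetical.

```python
import os
import sys
from pathlib import Path


def preflight(
    env: dict[str, str],
    version: tuple[int, int],
    chroma_dir: Path,
) -> list[str]:
    """Return a list of human-readable setup problems (empty if none)."""
    problems = []
    if version < (3, 11):
        problems.append(f"Python 3.11+ required, found {version[0]}.{version[1]}")
    if not env.get("REPLICATE_API_TOKEN"):
        # Only needed when running inference through Replicate (Case 2).
        problems.append("REPLICATE_API_TOKEN is not set")
    if not chroma_dir.is_dir():
        problems.append(f"{chroma_dir} not found; run populate_database.py first")
    return problems


if __name__ == "__main__":
    issues = preflight(dict(os.environ), sys.version_info[:2], Path("chroma"))
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks ready.")
```

Passing the environment, version, and path as arguments keeps the check easy to exercise without touching the real environment.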

If any file exceeds GitHub's recommended maximum file size of 50 MB, you may need to use Git Large File Storage.
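To see which files would cross that threshold before committing, a short script can list them. This is a sketch using only the standard library; the function name `find_large_files` and the choice to skip the `.git` directory are assumptions, and the 50 MB limit from the note above is interpreted here as 50 × 1024 × 1024 bytes.

```python
import os
from pathlib import Path

LIMIT_BYTES = 50 * 1024 * 1024  # 50 MB threshold from the note above


def find_large_files(root: Path, limit: int = LIMIT_BYTES) -> list[tuple[Path, int]]:
    """Return (path, size) for every file under root larger than limit bytes."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune Git's internal object store from the walk.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_file():
                continue  # skip broken symlinks and other non-regular entries
            size = path.stat().st_size
            if size > limit:
                large.append((path, size))
    return sorted(large)


if __name__ == "__main__":
    for path, size in find_large_files(Path(".")):
        print(f"{size / 1024 / 1024:8.1f} MB  {path}")
```

Any file the script reports is a candidate for tracking with Git LFS.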

Configuration & Features:

  1. Finetuning usually requires a domain-specific dataset. For this project, we curated our own dataset of question-answer pairs for finetuning and RAG.
  2. Domain-related files (txt and jsonl), such as trivia.txt and s1_s2.jsonl, are stored in the data folder. Using Langchain, we built a vector database from this data in the chroma folder for RAG. More content can be added as needed.
  3. The front end and deployment are implemented with Streamlit.
  4. You can select among different LLaMA2 chat API endpoints (base LLaMA2, finetuned LLaMA2, base with RAG, finetuned with RAG).
  5. Each model (base LLaMA2, finetuned LLaMA2, base with RAG, finetuned with RAG) runs on Replicate.
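The four options listed above can be thought of as two independent switches: which model to call and whether to retrieve context first. The sketch below illustrates that idea only; the option labels, the `ChatOption` type, and the `retriever` and `generate` callables are hypothetical placeholders rather than the app's actual identifiers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ChatOption:
    finetuned: bool  # call the finetuned model instead of the base one
    use_rag: bool    # retrieve context before generating


# Hypothetical labels for the four selectable options.
OPTIONS = {
    "LLaMA2": ChatOption(finetuned=False, use_rag=False),
    "Finetuned LLaMA2": ChatOption(finetuned=True, use_rag=False),
    "LLaMA2 with RAG": ChatOption(finetuned=False, use_rag=True),
    "Finetuned LLaMA2 with RAG": ChatOption(finetuned=True, use_rag=True),
}


def answer(
    label: str,
    question: str,
    retriever: Callable[[str], str],
    generate: Callable[[str, bool], str],
) -> str:
    """Route a question through the pipeline chosen by label."""
    option = OPTIONS[label]
    prompt = question
    if option.use_rag:
        # Prepend retrieved context so the model can ground its answer.
        prompt = f"Context:\n{retriever(question)}\n\nQuestion: {question}"
    return generate(prompt, option.finetuned)
```

Here `retriever` would wrap a vector-database query and `generate` would wrap the hosted model call; both are injected so the routing logic can be tried with simple stubs.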

The frontend was refactored from a16z's implementation of their LLaMA2 chatbot.

Resources:

Contributors

guochenmeinian, godness645, jy2575
