course-intro-to-qa-systems-with-llms's Introduction

Intro to Q&A Systems with Large Language Models

Setting Up Dependencies

source setup.sh

And when done,

deactivate

Setting up environment variables

Create a .env file in this repo. Add your keys, secrets, and the URL for downloading your data there:

OPENAI_API_KEY=
ANTHROPIC_API_KEY=
MLOPS_DATA_URL=
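Before running any of the apps, it can help to confirm that the .env file is complete. The following is a minimal sketch using only the standard library; the repo itself may load the file differently (for example with a dotenv package), and the helper names parse_env_text, missing_keys, and load_env_file are hypothetical.

```python
import os
from pathlib import Path

# Variable names taken from the .env template above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "MLOPS_DATA_URL")


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(values: dict, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def load_env_file(path: str = ".env") -> list:
    """Copy values into os.environ (without overwriting) and report gaps."""
    env_path = Path(path)
    values = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return missing_keys(values)


if __name__ == "__main__":
    missing = load_env_file()
    if missing:
        print("Missing or empty in .env:", ", ".join(missing))
```

Running the script from the repo root prints the names of any keys that still need a value.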

Hello Milo

streamlit run introduction/hello_milo.py

Course Proof-of-concept Prototype

The proof of concept prototype for the course is in the folder poc/:

  1. First run the notebook here to understand the code.
  2. Then, run the PoC:
     a. First download the data with python poc/download_chats.py.
     b. Then, build the index with the data pre-processing pipeline in python poc/build_index.py.
     c. Run the Milo assistant with streamlit run poc/milo.py.
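The PoC steps depend on each other: the index needs the downloaded data, and the assistant needs the index. As a rough sketch, they could be chained so that a failure stops the sequence early. The run_steps helper is hypothetical and not part of the repo, and using sys.executable in place of python is an assumption that the active virtual environment's interpreter is the right one.

```python
import subprocess
import sys

# Commands from the steps above, in the order the README lists them.
POC_STEPS = [
    [sys.executable, "poc/download_chats.py"],
    [sys.executable, "poc/build_index.py"],
    ["streamlit", "run", "poc/milo.py"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; return the first command that fails."""
    for command in steps:
        result = runner(command)
        if result.returncode != 0:
            return command  # stop here: later steps depend on this one
    return None


if __name__ == "__main__":
    failed = run_steps(POC_STEPS)
    if failed is not None:
        sys.exit("Step failed: " + " ".join(failed))
```

The last step starts the Streamlit server and keeps running until it is stopped.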

Optional Labs

Hello Milo

A simple MLOps Q&A bot using OpenAI directly. Note: DOES NOT USE RETRIEVAL-AUGMENTED GENERATION.

streamlit run introduction/hello_milo.py

Q&A on Video

A Q&A bot that answers questions based on a video transcript.

This is one simple example of retrieval-augmented generation (RAG), where the entire transcript is the retrieved context and nothing is looked up in an index. Since transcripts are long, we need an LLM with a large context window. For this we use Anthropic's Claude.
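Using the whole transcript as context amounts to placing it in the prompt ahead of the question. The sketch below illustrates that idea only; it is not the code in video/video_milo.py. The prompt wording, the 4-characters-per-token estimate, and the window_tokens parameter are all assumptions.

```python
def build_transcript_prompt(transcript: str, question: str) -> str:
    """Place the full transcript before the question in a single prompt."""
    return (
        "Answer the question using only the transcript below.\n\n"
        "<transcript>\n" + transcript.strip() + "\n</transcript>\n\n"
        "Question: " + question.strip()
    )


def rough_token_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough size check; a real tokenizer will give different counts."""
    return int(len(text) / chars_per_token)


def fits_in_window(prompt: str, window_tokens: int) -> bool:
    """Check the estimated prompt size against an assumed context window."""
    return rough_token_estimate(prompt) <= window_tokens


if __name__ == "__main__":
    prompt = build_transcript_prompt(
        "Speaker 1: Welcome to the panel.", "Who opened the panel?"
    )
    print(prompt)
    print("Estimated tokens:", rough_token_estimate(prompt))
```

If the estimate exceeds the model's window, the transcript has to be shortened or split, which is where an index-based approach becomes useful.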

Make sure you have your ANTHROPIC_API_KEY set in your .env file.

streamlit run video/video_milo.py

For example, use https://www.youtube.com/watch?v=0e5q4zCBtBs and ask questions about the panel discussion.

Q&A from blog articles

Another example of RAG, this time on blog data: we answer questions based on the content of publicly available blogs.

  a. First download the data with python blog/download_blogs.py.
  b. Then, build the index with the data pre-processing pipeline in python blog/build_index.py.
  c. Run the Milo assistant with streamlit run blog/blog_milo.py.
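To make the "build the index, then answer" flow concrete, here is a toy sketch of retrieval over text chunks. It scores chunks by word overlap with the question, which is a deliberately simple stand-in; the actual blog/build_index.py may use embeddings or a different library altogether. All function names here are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and keep alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_into_chunks(text: str, max_words: int = 50) -> list:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def build_index(chunks: list) -> list:
    """Pair each chunk with its word counts (the toy 'index')."""
    return [(chunk, Counter(tokenize(chunk))) for chunk in chunks]


def retrieve(index: list, question: str, top_k: int = 2) -> list:
    """Return up to top_k chunks with the highest word overlap."""
    query_terms = set(tokenize(question))
    scored = []
    for chunk, counts in index:
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


if __name__ == "__main__":
    index = build_index([
        "LLMOps has deployment challenges",
        "Cats sleep most of the day",
    ])
    print(retrieve(index, "What are the deployment challenges?", top_k=1))
```

The retrieved chunks would then be placed in the prompt as context, in the same way the video lab uses the full transcript.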

You can also change the blog in download_blogs.py:

PAGES = [
    "https://mlops.community/building-the-future-with-llmops-the-main-challenges/",
]

NOTE: The HTML page contains a lot of data besides the article itself. This is where data cleanup comes in. Feel free to clean up the data manually or with a script to see improved performance.
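As one possible starting point for a cleanup script, the sketch below strips tags and skips content inside elements that rarely hold article text. It uses only the standard library's html.parser. The list of skipped tags is an assumption and would need tuning for the actual pages; TextExtractor and html_to_text are hypothetical names, not part of the repo.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring content inside skipped elements."""

    # Assumed set of elements to drop; adjust for the pages you download.
    SKIPPED_TAGS = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text only when outside every skipped element.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self) -> str:
        return "\n".join(self._parts)


def html_to_text(html: str) -> str:
    """Return the text of an HTML document without markup or skipped parts."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = "<body><nav>Menu</nav><p>Hello <b>world</b></p></body>"
    print(html_to_text(page))
```

Cleaner text going into the index generally means less noise in the retrieved context.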

course-intro-to-qa-systems-with-llms's People

Contributors

rparundekar
