Ask The Site

Using LLMs and Vector Databases for information retrieval over a website

A LangChain-powered app to query the contents of your website.

Sample Usage

Head over to the home page of the web app and follow the instructions.

Preloaded Vector DB Demo

video1.mp4

Create your own Vector DB

video2.mp4

Query your own vector db

video3.mp4

Features

  • Crawls the specified URL to generate a sitemap, which is then used to extract text from the relevant pages of that site
  • Filters URL patterns using regex
  • Creates a vector DB (ChromaDB) of the crawled pages using OpenAI embeddings
  • Uses the vector DB to:
    • Perform similarity search and fetch relevant links whose contents are similar to the query
    • Simulate chatbot behaviour over the website's data using GPT-3.5 Turbo
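The crawl-and-filter steps in the list above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the app's actual code: the names `LinkCollector`, `extract_links` and `filter_urls`, and the `example.com` URLs, are assumptions made for the example. It parses an in-memory HTML string rather than fetching pages over the network.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute, same-host links found in `html`, de-duplicated in order."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(base_url).netloc
    seen, links = set(), []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href).split("#")[0]  # drop fragments
        if urlparse(absolute).netloc == host and absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


def filter_urls(urls, pattern):
    """Keep only the URLs that match the regular expression `pattern`."""
    regex = re.compile(pattern)
    return [url for url in urls if regex.search(url)]


# Hypothetical page content used only for illustration.
page = """
<a href="/docs/intro">Intro</a>
<a href="/docs/api#auth">API</a>
<a href="/blog/2023/news">News</a>
<a href="https://other.example.org/docs/x">External</a>
"""
links = extract_links(page, "https://example.com/")
docs_only = filter_urls(links, r"/docs/")
print(docs_only)
```

A real crawler would repeat `extract_links` on each discovered page until no new URLs appear, then apply the regex filter before extracting text.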

Running Locally:

>>> git clone https://github.com/JayantTaneja/Ask-The_Site.git
>>> pip install streamlit
>>> cd Ask-The_Site
>>> pip install -r requirements.txt
>>> streamlit run Home.py

How is it different from ChatGPT?

ChatGPT is a fine-tuned GPT model that generates dialogue ('chat') from the knowledge stored in its weights. It cannot query an existing external knowledge base.

With a vector DB, however, we can retrieve the passages relevant to a query and combine them with the general knowledge of the LLM (GPT-3.5 Turbo, in this case) to produce answers grounded in the site's content.
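To make the idea concrete, here is a toy sketch of the retrieve-then-prompt flow. It is only an illustration under loud assumptions: word counts stand in for OpenAI embeddings, a plain dictionary stands in for ChromaDB, and the helper names (`embed`, `cosine`, `top_k`, `build_prompt`) and sample pages are hypothetical rather than taken from the app.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for real embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, pages, k=2):
    """Return the k (url, text) pairs most similar to the query."""
    q = embed(query)
    ranked = sorted(
        pages.items(), key=lambda item: cosine(q, embed(item[1])), reverse=True
    )
    return ranked[:k]


def build_prompt(query, hits):
    """Place the retrieved passages in front of the question for the LLM."""
    context = "\n".join(f"[{url}] {text}" for url, text in hits)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical crawled pages (url -> extracted text).
pages = {
    "https://example.com/pricing": "Pricing plans start at ten dollars per month",
    "https://example.com/contact": "Contact the support team by email",
    "https://example.com/about": "The company was founded in 2015",
}
query = "How much does the pricing plan cost per month"
hits = top_k(query, pages, k=1)
prompt = build_prompt(query, hits)
print(prompt)
```

The resulting prompt is what would be sent to the chat model; the URLs of the retrieved pages can also be shown to the user as the relevant links.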

But newer GPT-4 interfaces allow you to search the web?

True, but they rely on publicly available information. Suppose your company or organization has private data that it does not want to expose, in order to prevent data leaks. In that case, embedding it in a vector DB is a good alternative.

Secondly, with the LangChain framework, you have the option of using a locally hosted LLM such as LLaMA (assuming you have the necessary compute power). This web app aims to showcase the potential of such setups.
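One way to picture the swap between a hosted and a local model is to have the answering step depend only on a "prompt in, text out" callable. The sketch below is a plain-Python illustration of that design and does not use LangChain's API; `LLM`, `answer` and `echo_llm` are hypothetical names, and `echo_llm` is a dummy stand-in rather than a real model.

```python
from typing import Callable

# Any function that maps a prompt string to a reply string can act as the model.
LLM = Callable[[str], str]


def answer(question: str, context: str, llm: LLM) -> str:
    """Build a prompt from retrieved context and delegate to the supplied model."""
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return llm(prompt)


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in model: returns the last line of the prompt."""
    return prompt.splitlines()[-1]


reply = answer(
    "What are the opening hours?", "Open 9am to 5pm on weekdays.", echo_llm
)
print(reply)
```

Replacing `echo_llm` with a function that calls a hosted API or a locally hosted model leaves the rest of the pipeline unchanged.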

Contributors

  • jayanttaneja
