
gcp_rag_chatbot's Introduction

Chat Application with RAG Feature Toggle and Backend Server

This project consists of a frontend chat application built using React and Material UI, and a backend server built with Express.js, MongoDB, and Google Cloud AI Platform. The frontend allows users to send and receive messages in real-time, automatically scrolls to the latest message, and includes a toggle switch to enable or disable the RAG (Retrieval-Augmented Generation) feature. The backend handles HTTP requests, interacts with a MongoDB database for persisting data, and uses Google Cloud's AI Platform for generating chatbot responses and text embeddings.

"Screenshot"

Features

  • Real-Time Messaging: Users can send and receive messages instantly.
  • Smooth Scrolling: Automatically scrolls to the latest message for a seamless chat experience.
  • RAG Feature Toggle: A switch to enable or disable the RAG feature, which can alter the chatbot's response behavior.
  • Responsive Design: Built with Material UI for a responsive layout that follows Material Design.
  • Backend Server: Handles HTTP requests, interacts with MongoDB for data persistence, and uses Google Cloud AI Platform for generating chatbot responses.

"Overview"

Refer to the step-by-step guide for deployment details: "StepByStep".

Environment Setup

  • MongoDB: Ensure your MongoDB instance is accessible and you have access to the connection string.
  • Google Cloud AI Platform: Set up your Google Cloud project.
  • Frontend: Change the parameters in the frontend/config.js file to suit your environment.
  • Backend: Change the parameters in the backend/config.json file to suit your environment.

MongoDB

Vector Search Index definition

{
  "fields": [
    {
      "numDimensions": 768,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    }
  ]
}
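
As a rough illustration of how this index might be queried, the sketch below builds a $vectorSearch aggregation pipeline against the "embedding" path. The index name ("vector_index"), the default limits, and the projected "sentence" field are assumptions made for the example and are not taken from the project.

```javascript
// Sketch: build a $vectorSearch aggregation pipeline for the index above.
// The index name, defaults, and projected field name are assumptions.
const EMBEDDING_DIMENSIONS = 768; // matches numDimensions in the index definition

function buildVectorSearchPipeline(queryVector, options = {}) {
  const { indexName = "vector_index", limit = 5, numCandidates = 100 } = options;
  if (!Array.isArray(queryVector) || queryVector.length !== EMBEDDING_DIMENSIONS) {
    throw new Error(`queryVector must be an array of ${EMBEDDING_DIMENSIONS} numbers`);
  }
  return [
    {
      $vectorSearch: {
        index: indexName,
        path: "embedding", // matches "path" in the index definition
        queryVector,
        numCandidates, // nearest-neighbor candidates considered before limiting
        limit,
      },
    },
    {
      $project: {
        _id: 0,
        sentence: 1, // hypothetical field holding the stored text
        score: { $meta: "vectorSearchScore" },
      },
    },
  ];
}
```

The returned array could be passed to a collection's aggregate() call. Rejecting vectors of the wrong length up front keeps a mismatched embedding model from reaching the database.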

Google Cloud AI Platform

To set up and enable the required APIs and services, refer to: https://cloud.google.com/vertex-ai/docs/start/cloud-environment

Installation

Frontend

Clone the repository and install the dependencies:

git clone <repository-url>
cd <project-directory>/frontend
npm install

Backend

Ensure you have MongoDB and Google Cloud credentials configured. Navigate to the backend directory and install the dependencies:

cd <project-directory>/backend
npm install

PDF Processing and Embedding Storage

This section focuses on processing PDF files to extract text, generate embeddings for each sentence, and store these embeddings along with the text in a MongoDB database. This functionality supports the RAG feature by allowing the chatbot to retrieve relevant information from a collection of PDF documents.

Implementation Details

  • PDF Parsing: Utilizes the pdf-parse library to read PDF files and extract text data. Each sentence in the PDF is identified and processed individually.
  • Embedding Generation: Sends each extracted sentence to the /embedding endpoint of the backend server, which generates embeddings using the Google Cloud AI Platform.
  • MongoDB Storage: Each sentence, along with its embedding and metadata (e.g., the file name and page number), is stored in a MongoDB database. This setup facilitates efficient retrieval of relevant information based on query embeddings.
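
The call to the /embedding endpoint described above could look roughly like the following. This is a minimal sketch: the POST method, the JSON body shape ({ text }), and the base URL are assumptions, since the endpoint's actual contract is not documented here.

```javascript
// Sketch: build the HTTP request for the backend's /embedding endpoint.
// The HTTP method, body shape ({ text }), and base URL are assumptions.
function buildEmbeddingRequest(sentence, baseUrl = "http://localhost:5050") {
  const text = sentence.trim();
  if (text.length === 0) {
    throw new Error("sentence must not be empty");
  }
  return {
    url: new URL("/embedding", baseUrl).toString(),
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text }),
    },
  };
}

// Usage with the fetch API built into Node 18+ (not executed here):
//   const { url, options } = buildEmbeddingRequest("MongoDB stores documents.");
//   const response = await fetch(url, options);
//   const data = await response.json();
```

Keeping request construction separate from the network call makes the request shape easy to check without a running server.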

Process Flow

  1. PDF File Reading: Iterates over PDF files stored in a specified directory, reading each file and extracting its text content.
  2. Sentence Extraction and Embedding Generation: Splits the text content into sentences, generates embeddings for each sentence via a REST API call to the backend server, and then stores these embeddings along with the sentence text in MongoDB.
  3. Metadata Handling: Keeps track of the PDF file name and the page number for each sentence to provide context for the chatbot's responses.
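
Steps 2 and 3 can be sketched as follows, assuming the text of each page has already been extracted into an array of strings (how processPdf.js obtains per-page text is not shown here). The function names, the record field names, and the punctuation-based splitting rule are hypothetical; a real splitter would need to handle abbreviations and similar edge cases.

```javascript
// Sketch: split page text into sentences and attach file/page metadata.
// Function names, field names, and the splitting rule are assumptions.
function splitIntoSentences(text) {
  return text
    .replace(/\s+/g, " ") // collapse line breaks left over from PDF layout
    .split(/(?<=[.!?])\s+/) // split after sentence-ending punctuation
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

function toSentenceRecords(fileName, pages) {
  const records = [];
  pages.forEach((pageText, index) => {
    for (const sentence of splitIntoSentences(pageText)) {
      // pageNumber is 1-based so it matches what a reader sees in the PDF
      records.push({ sentence, fileName, pageNumber: index + 1 });
    }
  });
  return records;
}
```

Each record would then be paired with its embedding before being inserted into MongoDB.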

Usage

To run this process, ensure your MongoDB instance is accessible and that the backend server is running with the /embedding endpoint configured to accept text and return embeddings. Execute the script to process all PDF files in the specified directory, extracting text, generating embeddings, and storing the data in MongoDB for use by the chatbot.

node processPdf.js

This additional capability enriches the chatbot's responses with information extracted from a predefined set of documents, making it more useful for answering queries with specific, document-based knowledge.
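
One way the retrieved sentences might reach the model when the RAG toggle is on is sketched below. The function name, the prompt wording, and the fallback to the bare question are illustrative assumptions and do not reflect the project's actual prompt.

```javascript
// Sketch: combine a user question with retrieved sentences when RAG is enabled.
// The prompt template is hypothetical.
function buildPrompt(question, contextSentences, ragEnabled) {
  if (!ragEnabled || contextSentences.length === 0) {
    return question; // RAG off, or nothing retrieved: send the question as-is
  }
  const context = contextSentences.map((s, i) => `${i + 1}. ${s}`).join("\n");
  return `Answer using the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}
```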

Usage

Start the Frontend

Run the frontend application locally:

npm start

Navigate to http://localhost:3000 to view the application.

Start the Backend Server

Run the backend server:

node server.js

The server starts on port 5050 by default, or on the port specified in the backend configuration file (backend/config.json).
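
The port fallback could be implemented along these lines. This is a sketch only: the "port" key and the validation rule are assumptions about the backend configuration, not something confirmed by the project.

```javascript
// Sketch: pick the server port from a config object, falling back to 5050.
// The "port" key name is an assumption.
const DEFAULT_PORT = 5050;

function resolvePort(config = {}) {
  const port = Number(config.port);
  // Accept only whole numbers in the valid TCP port range
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : DEFAULT_PORT;
}
```

With Express, the result could be passed straight to app.listen().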

Contributing

Contributions are welcome! Please feel free to submit a pull request with any improvements or bug fixes.

License

This project is open-sourced under the MIT License.

Credit

Written by: Emil Nildersen
Senior Solutions Architect - MongoDB
