
llm-graph-builder's Introduction

Knowledge Graph Builder App

This application converts PDF documents into a knowledge graph stored in Neo4j. It uses an LLM (large language model), either OpenAI's GPT or Diffbot, to extract nodes, relationships, and properties from the text of the PDF, and organizes them into a structured knowledge graph using the LangChain framework. Files can be uploaded from the local machine or read from an S3 bucket, and the user then chooses which LLM to use to create the knowledge graph.

Getting started

  1. Run Docker Compose to build and start all components:

    docker-compose up --build
  2. Alternatively, you can run specific directories separately:

    • For the frontend:

      cd frontend
      yarn
      yarn run dev
    • For the backend:

      cd backend
      python -m venv envName
      source envName/bin/activate 
      pip install -r requirements.txt
      uvicorn score:app --reload

To deploy the app and packages on Google Cloud Platform, run the following commands with Google Cloud Run:

# Frontend deploy
gcloud run deploy
# Answers to the prompts:
#   source location: current directory > Frontend
#   region: 32 [us-central1]
#   Allow unauthenticated requests: Yes

# Backend deploy
gcloud run deploy --set-env-vars "OPENAI_API_KEY=" --set-env-vars "DIFFBOT_API_KEY=" --set-env-vars "NEO4J_URI=" --set-env-vars "NEO4J_PASSWORD=" --set-env-vars "NEO4J_USERNAME="
# Answers to the prompts:
#   source location: current directory > Backend
#   region: 32 [us-central1]
#   Allow unauthenticated requests: Yes

Features

  • PDF Upload: Users can upload PDF documents using the Drop Zone.
  • S3 Bucket Integration: Users can also specify PDF documents stored in an S3 bucket for processing.
  • Knowledge Graph Generation: The application employs OpenAI/Diffbot's LLM to extract relevant information from the PDFs and construct a knowledge graph.
  • Neo4j Integration: The extracted nodes and relationships are stored in a Neo4j database for easy visualization and querying.
  • Grid View: Source files are listed in a grid with the columns Name, Type, Size, Nodes, Relations, Duration, Status, Source, and Model.

Setting up Environment Variables

Create a .env file and set the following environment variables:
OPENAI_API_KEY = ""
DIFFBOT_API_KEY = ""
NEO4J_URI = ""
NEO4J_USERNAME = ""
NEO4J_PASSWORD = ""
AWS_ACCESS_KEY_ID = ""
AWS_SECRET_ACCESS_KEY = ""
EMBEDDING_MODEL = ""
IS_EMBEDDING = "TRUE"
KNN_MIN_SCORE = ""
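
As a rough illustration of how a backend could consume these settings, the sketch below reads them from a mapping such as os.environ and fails early when a required value is missing. The load_settings helper, the choice to treat only the Neo4j variables as required, and the 0.0 fallback for KNN_MIN_SCORE are assumptions made for this example; they are not taken from the project's code.

```python
import os
from typing import Mapping

# Assumption: only the Neo4j connection settings are mandatory in this sketch.
REQUIRED = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")
OPTIONAL = (
    "OPENAI_API_KEY",
    "DIFFBOT_API_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "EMBEDDING_MODEL",
)


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the variables listed above into a plain dict (hypothetical helper)."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))

    settings = {name: env[name] for name in REQUIRED}
    settings.update({name: env.get(name, "") for name in OPTIONAL})
    # IS_EMBEDDING is written as "TRUE" in the .env file; compare case-insensitively.
    settings["IS_EMBEDDING"] = env.get("IS_EMBEDDING", "").strip().upper() == "TRUE"
    # Assumption: an empty or absent KNN_MIN_SCORE falls back to 0.0.
    settings["KNN_MIN_SCORE"] = float(env.get("KNN_MIN_SCORE") or 0.0)
    return settings
```

Passing a plain dict instead of os.environ keeps the helper easy to exercise in tests without touching the real environment.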

Functions/Modules

extract_graph_from_file(uri, userName, password, file_path, model):

Extracts nodes, relationships, and properties from a PDF file using an LLM.

Args:
 uri: URI of the graph to extract
 userName: Username to use for graph creation (if None, the username from the config file is used)
 password: Password to use for graph creation (if None, the password from the config file is used)
 file_path: File object containing the PDF file path to be used
 model: Type of model to use ('Gemini Pro' or 'Diffbot')

Returns:
 JSON response to the API with fileName, nodeCount, relationshipCount, processingTime,
 status, and model as attributes.
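
To make the response shape concrete, here is a minimal sketch that assembles a dict with the six attributes named above. The build_extract_response helper, the unit of processingTime (seconds, rounded to two decimals), and the example status string are hypothetical; only the attribute names come from the description.

```python
import json


def build_extract_response(file_name, node_count, relationship_count,
                           processing_time, status, model):
    """Assemble the response attributes listed above (hypothetical helper)."""
    return {
        "fileName": file_name,
        "nodeCount": node_count,
        "relationshipCount": relationship_count,
        # Assumption: processingTime is reported in seconds, rounded to two decimals.
        "processingTime": round(processing_time, 2),
        "status": status,
        "model": model,
    }


if __name__ == "__main__":
    # "Completed" is a placeholder status value chosen for this example.
    response = build_extract_response("report.pdf", 12, 7, 3.14159, "Completed", "Diffbot")
    print(json.dumps(response, indent=2))
```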

create_source_node_graph(uri, userName, password, file):

Creates a source node in Neo4jGraph and sets properties.

Args:
 uri: URI of Graph Service to connect to
 userName: Username to connect to Graph Service with (default: None)
 password: Password to connect to Graph Service with (default: None)
 file: File object with information about file to be added

Returns: 
 Success or Failure message of node creation
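
One way such a source node could be written is with a parameterized Cypher MERGE, sketched below. This only builds the statement and its parameters; the Document label, the property names, and the initial "New" status are placeholders invented for this example, and the project's actual schema may differ.

```python
def build_source_node_query(file_name: str, file_size: int, file_type: str):
    """Return a Cypher statement and parameters for a source node (illustrative only)."""
    # Assumption: the `Document` label and all property names are placeholders.
    cypher = (
        "MERGE (d:Document {fileName: $fileName}) "
        "SET d.fileSize = $fileSize, d.fileType = $fileType, "
        "d.status = $status, d.updatedAt = datetime() "
        "RETURN d.fileName AS fileName"
    )
    params = {
        "fileName": file_name,
        "fileSize": file_size,
        "fileType": file_type,
        "status": "New",  # Assumption: placeholder initial status.
    }
    return cypher, params
```

With the official Neo4j Python driver, the pair could then be passed to a session's run method; using MERGE rather than CREATE keeps repeated uploads of the same file from creating duplicate nodes.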

get_source_list_from_graph():

 Returns a list of file sources in the database by querying the graph and 
 sorting the list by the last updated date. 
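
The ordering step can be sketched in plain Python as follows. The record layout (dicts with an ISO 8601 updatedAt string) and the newest-first direction are assumptions; the description above only says the list is sorted by last updated date.

```python
from datetime import datetime


def sort_sources_by_last_updated(sources):
    """Return sources ordered by their 'updatedAt' timestamp, newest first."""
    # Assumption: each source is a dict with an ISO 8601 'updatedAt' string.
    return sorted(
        sources,
        key=lambda source: datetime.fromisoformat(source["updatedAt"]),
        reverse=True,
    )


if __name__ == "__main__":
    example = [
        {"fileName": "old.pdf", "updatedAt": "2024-01-01T09:00:00"},
        {"fileName": "new.pdf", "updatedAt": "2024-02-15T12:30:00"},
    ]
    for source in sort_sources_by_last_updated(example):
        print(source["fileName"], source["updatedAt"])
```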

Chunk nodes and embeddings creation in Neo4j

[Figure: chunking]
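
As a simplified illustration of the chunking step, the sketch below splits text into fixed-size, overlapping character windows and records each chunk's position. The character-based strategy, the default sizes, and the split_into_chunks name are assumptions for this example; the project may split text differently, and embedding generation is left out.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 20):
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for position, start in enumerate(range(0, len(text), step)):
        chunks.append({"position": position, "text": text[start:start + chunk_size]})
        # Stop once a window reaches the end of the text, to avoid a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each returned dict could then become a chunk node linked to its source document, with an embedding computed from its text property when IS_EMBEDDING is enabled.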

Application Walkthrough

[Video: KGB.mp4]

Links

  • The public Google Cloud Run URL
  • Workspace URL

llm-graph-builder's People

Contributors

kartikpersistent, prakriti-solankey, praveshkumar1988, aashipandya, rakshita-arora, karanchellani, jexp, vasanthasaikalluri, nielsdejong, tomasonjo
