KG-GNN-IR

Complex questions that require multi-hop reasoning pose distinct challenges for information retrieval. This capstone project explores whether Graph Neural Networks (GNNs) can improve retrieval for such queries. Central to our approach is a knowledge graph that links each passage to the entities extracted from it and to the title of its source article. This structure lets the GNN leverage the relations between entities during retrieval. The project focuses on building and testing this framework to examine how GNNs can be integrated with knowledge graphs to handle complex queries.
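As a rough illustration of the graph structure described above, the following sketch links passages to entity and title nodes and then finds passages that share a node with a given passage. The input schema (`id`, `title`, `entities`), the relation names `mentions` and `from_article`, and all function names are assumptions made for this example; they are not taken from the project's code.

```python
from collections import defaultdict


def build_triplets(passages):
    """Turn passages into (head, relation, tail) triplets.

    Assumed (hypothetical) schema: each passage is a dict with
    'id', 'title', and 'entities' keys.
    """
    triplets = []
    for p in passages:
        pid = f"passage:{p['id']}"
        # Link the passage to the title of its source article.
        triplets.append((pid, "from_article", f"title:{p['title']}"))
        # Link the passage to every entity extracted from it.
        for ent in p["entities"]:
            triplets.append((pid, "mentions", f"entity:{ent}"))
    return triplets


def build_adjacency(triplets):
    """Undirected adjacency sets built from the triplets."""
    adj = defaultdict(set)
    for head, _, tail in triplets:
        adj[head].add(tail)
        adj[tail].add(head)
    return adj


def linked_passages(adj, start):
    """Passages that share at least one entity or title node with `start`."""
    found = set()
    for mid in adj[start]:
        for other in adj[mid]:
            if other != start and other.startswith("passage:"):
                found.add(other)
    return found


passages = [
    {"id": 0, "title": "Ada Lovelace",
     "entities": ["Charles Babbage", "Analytical Engine"]},
    {"id": 1, "title": "Charles Babbage",
     "entities": ["Analytical Engine", "London"]},
    {"id": 2, "title": "London", "entities": ["River Thames"]},
]

triplets = build_triplets(passages)
adj = build_adjacency(triplets)
print(linked_passages(adj, "passage:0"))
```

In this toy data, passages 0 and 1 are connected through the shared "Analytical Engine" entity node, which is the kind of relational path a GNN could use when a question spans more than one passage.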

Getting Started

These instructions will get a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Before running the scripts, install the required Python packages with the following command:

pip install -r requirements.txt

The default embedding model is bge-small-en-v1.5.

Installing and Running

Follow these steps to get a development environment running:

Download HotpotQA Dataset

Run the following script to download the required HotpotQA dataset:

./dataset/download_datasets.sh

Run Baseline Model

To execute the baseline model with default settings, use:

python hotpotqa_baseline.py

You can customize the script's execution with command-line parameters, replacing each placeholder with your own value:

python hotpotqa_baseline.py --model_name "YourModelName" --file_name "your_file.json" --top_k K --retriever_mode MODE

Build the Knowledge Graph

To construct the knowledge graph from the HotpotQA training file, execute:

python GraphBuilder.py

This writes the extracted triplets to a JSON file, kgs.json.
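The layout of kgs.json is not documented here. The sketch below assumes, purely for illustration, that the file holds a list of [head, relation, tail] triplets, and shows one way to map them to integer node ids and a two-row edge list of the kind graph libraries typically expect. The sample data and the function name `index_triplets` are hypothetical.

```python
import json


def index_triplets(triplets):
    """Map string triplets to integer ids.

    Returns (node_ids, rel_ids, edges, edge_types), where `edges` is a
    two-row list [sources, targets] and `edge_types` holds one relation
    id per edge.
    """
    node_ids, rel_ids = {}, {}
    edges = [[], []]
    edge_types = []
    for head, rel, tail in triplets:
        # setdefault assigns the next free id the first time a name is seen.
        h = node_ids.setdefault(head, len(node_ids))
        t = node_ids.setdefault(tail, len(node_ids))
        r = rel_ids.setdefault(rel, len(rel_ids))
        edges[0].append(h)
        edges[1].append(t)
        edge_types.append(r)
    return node_ids, rel_ids, edges, edge_types


# Stand-in for the file contents; the real kgs.json format may differ.
sample = '[["p0", "mentions", "e0"], ["p1", "mentions", "e0"], ["p0", "from_article", "t0"]]'
triplets = json.loads(sample)

node_ids, rel_ids, edges, edge_types = index_triplets(triplets)
print(node_ids)
print(edges, edge_types)
```

If the real file uses a different layout (for example, a dict keyed by passage), only the loading step would need to change; the id-mapping idea stays the same.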

Training the Model

To train the model, run:

python train.py

Hyperparameters are stored at the top of the file. The model that achieves the highest hit rate is automatically saved to the output folder.
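For readers unfamiliar with the metric, here is a minimal sketch of a hit rate computation and of keeping the best-scoring epoch. It assumes the common definition of hit rate at k (the fraction of queries with at least one relevant passage among the top k results); train.py may define or compute it differently, and the names and toy rankings below are hypothetical.

```python
def hit_rate_at_k(ranked_ids, relevant_ids, k):
    """Fraction of queries with at least one relevant id in the top k.

    ranked_ids: one ranked list of passage ids per query.
    relevant_ids: one set of relevant passage ids per query.
    """
    if not ranked_ids:
        return 0.0
    hits = 0
    for ranked, relevant in zip(ranked_ids, relevant_ids):
        if any(doc in relevant for doc in ranked[:k]):
            hits += 1
    return hits / len(ranked_ids)


# Toy data: two queries, rankings produced after two training epochs.
relevant = [{"a"}, {"b"}]
epoch_rankings = [
    [["x", "y", "a"], ["b", "z", "w"]],  # epoch 0
    [["a", "x", "y"], ["z", "b", "w"]],  # epoch 1
]

best = {"epoch": None, "hit_rate": -1.0}
for epoch, ranked in enumerate(epoch_rankings):
    score = hit_rate_at_k(ranked, relevant, k=2)
    # Keep only the epoch with the highest hit rate seen so far;
    # a real training loop would save the model weights at this point.
    if score > best["hit_rate"]:
        best = {"epoch": epoch, "hit_rate": score}

print(best)
```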

Built With

PyTorch - An open source machine learning framework.
PyTorch Geometric - A library for deep learning on graphs and other irregular structures.
LlamaIndex
HuggingFace
