- Install Neo4j
- Install Python
- Create and configure a `.env` file in the root directory
- Adjust options in `config/config.yaml` if necessary
- Install dependencies: `pip install -r requirements.txt`
- Run the application: `python main.py`
⚠️ Current cache limitations:
- Notion API cache: designed for session-scoped caching. Using the filesystem (FS) cache with a long TTL prevents updated pages from being fetched.
- Processed pages and links cache: designed for rapid testing and development. It prevents removals from being synced, so pages and links that are already processed and cached are not removed from the graph.
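The first limitation can be illustrated with a minimal sketch. It assumes a hypothetical `TTLCache` class with an injectable clock; this is not the project's actual cache, only a stand-in that shows why a long TTL keeps serving an outdated page.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Hypothetical time-to-live cache; not the project's actual implementation."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value while it is fresh, otherwise call fetch()."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # served from cache, even if the source has changed
        value = fetch()
        self._store[key] = (now, value)
        return value


# Simulated remote page whose content changes between two reads.
remote = {"page-1": "v1"}
fake_now = [0.0]
cache = TTLCache(ttl_seconds=3600, clock=lambda: fake_now[0])

first = cache.get_or_fetch("page-1", lambda: remote["page-1"])
remote["page-1"] = "v2"   # the page is edited at the source
fake_now[0] = 600.0       # ten minutes later, still inside the TTL
stale = cache.get_or_fetch("page-1", lambda: remote["page-1"])
fake_now[0] = 4000.0      # past the TTL, so the page is fetched again
fresh = cache.get_or_fetch("page-1", lambda: remote["page-1"])
```

Here `stale` still holds the old content because the entry has not expired, while `fresh` picks up the edit. A short TTL, or a cache scoped to a single run, avoids the stale read at the cost of more API calls.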
Knowledge Nexus is an advanced personal knowledge management system that transforms the way individuals organize, process, and discover insights from their digital content. By leveraging the power of AI and graph databases, this project addresses the challenge of information overload and disconnected data silos that many knowledge workers face in today's digital landscape.
Unlike traditional note-taking or knowledge management tools that rely heavily on manual organization, Knowledge Nexus automates the process of extracting key concepts, generating insights, and creating meaningful connections across your personal knowledge base.
- Information Overload: Knowledge Nexus cuts through the noise by automatically extracting key entities and insights from various content sources, helping you focus on what's important.
- Manual Processing Overhead: Traditional tools require significant manual effort to organize and connect information. Knowledge Nexus automates this process, saving you time and cognitive effort.
- Limited Contextual Understanding: While tools like Obsidian or Roam Research rely on explicit links, Knowledge Nexus uses AI to understand semantic and topical relationships, creating a richer, more nuanced knowledge graph.
- Disconnected Data Silos: By importing and processing data from various sources into a single, interconnected knowledge graph, Knowledge Nexus bridges the gaps between your different information repositories.
- Difficulty in Discovering New Connections: The AI-powered system can uncover non-obvious relationships between different pieces of information, potentially leading to new insights or ideas that you might have missed.
- Multi-Source Data Integration: Import content from Notion, Pocket, web pages, and more (extensible architecture for adding new sources).
- AI-Powered Entity and Topic Extraction: Automatically identify and extract key entities and topics from processed content.
- Intelligent Insight Generation: Leverage AI to generate concise insights from your personal knowledge base.
- Semantic Knowledge Graph Construction: Build a comprehensive, interconnected graph of entities, topics, and content using Neo4j, reflecting not just explicit links but semantic relationships.
- Contextual Querying and Exploration: Easily retrieve relevant content and explore connections within your knowledge graph.
- Personalized Knowledge Assistant: Tailored to your specific needs and preferences, helping you find tools, frameworks, and best practices aligned with your views.
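As a rough illustration of the graph construction feature, the sketch below links one content item to its extracted entities with Cypher `MERGE` statements. The labels `Content` and `Entity`, the `MENTIONS` relationship, and the `link_entities` helper are assumptions made for this example, not the project's actual schema. The function accepts any object with a `run(query, **params)` method, so it can be tried without a database.

```python
from typing import Any, Dict, List, Protocol


class Runner(Protocol):
    """Anything with a run(query, **params) method, e.g. a database session."""

    def run(self, query: str, **params: Any) -> Any: ...


# MERGE creates a node or relationship only if it does not exist yet,
# so re-processing the same content does not duplicate graph elements.
# Labels and relationship type are hypothetical.
LINK_QUERY = (
    "MERGE (c:Content {id: $content_id}) "
    "MERGE (e:Entity {name: $entity}) "
    "MERGE (c)-[:MENTIONS]->(e)"
)


def link_entities(runner: Runner, content_id: str, entities: List[str]) -> int:
    """Link one content item to each distinct entity; return the number of links."""
    unique = sorted(set(entities))
    for entity in unique:
        runner.run(LINK_QUERY, content_id=content_id, entity=entity)
    return len(unique)


class RecordingRunner:
    """Stand-in for a database session that only records what would be sent."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def run(self, query: str, **params: Any) -> None:
        self.calls.append({"query": query, "params": params})


recorder = RecordingRunner()
linked = link_entities(recorder, "page-1", ["Neo4j", "GraphRAG", "Neo4j"])
```

With the official Neo4j Python driver, a session opened from `GraphDatabase.driver(uri, auth=(user, password))` could be passed in place of `RecordingRunner`, since its `run` method also accepts query parameters as keyword arguments.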
Knowledge Nexus is primarily designed for individual users who:
- Deal with large amounts of information from various sources
- Seek to uncover new insights and connections within their knowledge base
- Want to reduce the cognitive overhead of manual knowledge management
- Are looking for a personal research assistant to aid in complex tasks or decision-making
Type | Parse Markdown Text | Parse References | Recursive Parsing |
---|---|---|---|
Page Properties | |||
Title | โ | โ | โ |
Rich Text | โ | โ | โ |
Select | โ | N/A | N/A |
Status | โ | N/A | N/A |
Multi-select | โ | N/A | N/A |
Number | โ | N/A | N/A |
Date | โ | N/A | N/A |
People | โ | N/A | N/A |
Files | โ | โ | N/A |
Checkbox | โ | N/A | N/A |
URL | โ | โ | โ |
Email | โ | N/A | N/A |
Phone Number | โ | N/A | N/A |
Formula | โ | N/A | N/A |
Relation | โ | โ | โ |
Rollup | โ | N/A | N/A |
Created Time | โ | N/A | N/A |
Created By | โ | N/A | N/A |
Last Edited Time | โ | N/A | N/A |
Last Edited By | โ | N/A | N/A |
Unique ID | โ | N/A | N/A |
Verification | โ | N/A | N/A |
Database Properties | |||
Title | โ | โ | โ |
Rich Text | N/A | N/A | N/A |
Select | โ | N/A | N/A |
Multi-select | โ | N/A | N/A |
Date | N/A | N/A | N/A |
People | N/A | N/A | N/A |
Files | N/A | N/A | N/A |
Checkbox | N/A | N/A | N/A |
URL | N/A | N/A | N/A |
Email | N/A | N/A | N/A |
Phone Number | N/A | N/A | N/A |
Formula | N/A | N/A | N/A |
Relation | โ | โ | โ |
Rollup | N/A | N/A | N/A |
Created Time | โ | N/A | N/A |
Created By | โ | N/A | N/A |
Last Edited Time | โ | N/A | N/A |
Last Edited By | โ | N/A | N/A |
Blocks | |||
Paragraph | โ | โ | โ |
Heading 1 | โ | โ | โ |
Heading 2 | โ | โ | โ |
Heading 3 | โ | โ | โ |
Bulleted List Item | โ | โ | โ |
Numbered List Item | โ | โ | โ |
To-do | โ | โ | โ |
Toggle | โ | โ | โ |
Code | โ | โ | N/A |
Quote | โ | โ | โ |
Callout | โ | โ | โ |
Mention (except mentions of page blocks) | โ | โ | N/A |
Equation | โ | N/A | N/A |
Bookmark | โ | โ | N/A |
Image | โ | โ | N/A |
Video | โ | โ | N/A |
Audio | โ | โ | N/A |
File | โ | โ | N/A |
PDF | โ | โ | N/A |
Embed | โ | โ | N/A |
Link Preview | โ | โ | N/A |
Divider | โ | N/A | N/A |
Table of Contents | โ | N/A | N/A |
Breadcrumb | โ | N/A | N/A |
Column List | โ | N/A | N/A |
Column | โ | N/A | N/A |
Synced Block | โ | โ | โ |
Template | โ | โ | โ |
Link to Page | โ | โ | โ |
Table | โ | N/A | N/A |
Table Row | โ | N/A | N/A |
Child Page | โ | โ | โ |
Child Database (except linked and views) | โ | โ | โ |
Comments | โ | โ | โ |
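The three columns of the table (Markdown text, references, recursion) can be sketched on a simplified block structure. The field names `type`, `has_children`, `rich_text`, `plain_text`, and `href` follow Notion block objects as commonly documented; the `fetch_children` callable, the `PREFIXES` mapping, and the helper names are hypothetical and stand in for the project's real parser and API client.

```python
from typing import Any, Callable, Dict, List, Optional

Block = Dict[str, Any]

# Markdown prefixes for a few block types; a real parser covers many more.
PREFIXES = {
    "heading_1": "# ",
    "heading_2": "## ",
    "heading_3": "### ",
    "bulleted_list_item": "- ",
    "paragraph": "",
}


def block_text(block: Block) -> str:
    """Join the plain_text of every rich-text fragment in the block."""
    payload = block.get(block["type"], {})
    return "".join(part.get("plain_text", "") for part in payload.get("rich_text", []))


def block_links(block: Block) -> List[str]:
    """Collect the href of every rich-text fragment that has one (references)."""
    payload = block.get(block["type"], {})
    return [part["href"] for part in payload.get("rich_text", []) if part.get("href")]


def render(
    blocks: List[Block],
    fetch_children: Callable[[str], List[Block]],
    depth: int = 0,
) -> List[str]:
    """Render blocks to Markdown lines, descending into children recursively."""
    lines: List[str] = []
    for block in blocks:
        prefix = PREFIXES.get(block["type"])
        if prefix is not None:
            lines.append("  " * depth + prefix + block_text(block))
        if block.get("has_children"):
            lines.extend(render(fetch_children(block["id"]), fetch_children, depth + 1))
    return lines


def rich(text: str, href: Optional[str] = None) -> Dict[str, Any]:
    """Build a minimal rich-text fragment for the example data."""
    return {"plain_text": text, "href": href}


children = {
    "b2": [{"id": "b3", "type": "bulleted_list_item", "has_children": False,
            "bulleted_list_item": {"rich_text": [rich("nested item")]}}],
}
page = [
    {"id": "b1", "type": "heading_1", "has_children": False,
     "heading_1": {"rich_text": [rich("Notes")]}},
    {"id": "b2", "type": "bulleted_list_item", "has_children": True,
     "bulleted_list_item": {"rich_text": [rich("see "), rich("docs", "https://example.com")]}},
]
markdown = render(page, lambda block_id: children.get(block_id, []))
links = block_links(page[1])
```

In this sketch, block types marked N/A for recursive parsing would simply never report children, and types without rich text would return no links.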
- Enhanced personalization through adaptive learning of user preferences and interests
- Integration with additional productivity tools and data sources
- Advanced visualization options for exploring your knowledge graph
- Data Source Integration
  - Implement Notion API client
  - Notion: process only pages whose `last_edited_time` is later than the `last_edited_time` stored in the graph
  - Develop Pocket API integration
  - Create a web scraper for processing URLs
  - Create unified data models
  - Design a unified interface for data source processors
- Content Processing
  - Implement entity extraction using NLP techniques
  - Develop an insight generation module using LLMs
  - Create a content summarization feature
  - Add file caching of raw and processed content
  - Create content embeddings for semantic search
- Knowledge Graph Management
  - Set up Neo4j database integration
  - Implement node and relationship creation logic
  - Develop query methods for retrieving related content
  - Create visualizations for the knowledge graph
- AI Agents
  - Design a flexible AI agent architecture
  - Implement specific agents for entity extraction, insight generation, and summarization
  - Develop a system for managing different LLM models and configurations
- Pipeline Orchestration
  - Create a modular pipeline for processing content from ingestion to storage
  - Implement error handling and logging throughout the pipeline
  - Develop a system for incremental updates and change detection
- User Interface
  - Add Streamlit for interacting with the system
  - Implement natural language querying of the knowledge graph
  - Create a web-based dashboard for visualizing insights and connections (InfraNodus-like)
- Testing and Quality Assurance
  - Develop unit tests for each module
  - Implement integration tests for the entire pipeline
  - Create a suite of sample data for testing and demonstration
  - Evaluate entity extraction using different models, different contexts, and different prompts
  - Evaluate relation and weights creation using different models, different contexts, and different prompts
  - Evaluate GraphRAG using different embeddings and prompts
- Advanced Features
  - Implement token-cost estimation
  - Implement langfuse for agents and flow evaluation
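The roadmap item about processing only Notion pages edited after the time stored in the graph could look roughly like the sketch below. It assumes ISO 8601 timestamps with a trailing `Z`, and the helper names `parse_ts` and `pages_to_process` as well as the sample data are hypothetical.

```python
from datetime import datetime
from typing import Dict, List


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def pages_to_process(pages: List[Dict[str, str]], graph_times: Dict[str, str]) -> List[str]:
    """Return ids of pages that are new or edited after the time stored in the graph."""
    selected = []
    for page in pages:
        stored = graph_times.get(page["id"])
        if stored is None or parse_ts(page["last_edited_time"]) > parse_ts(stored):
            selected.append(page["id"])
    return selected


pages = [
    {"id": "a", "last_edited_time": "2024-05-02T10:00:00.000Z"},  # edited since last sync
    {"id": "b", "last_edited_time": "2024-05-01T09:00:00.000Z"},  # unchanged
    {"id": "c", "last_edited_time": "2024-05-03T08:30:00.000Z"},  # not in the graph yet
]
graph_times = {
    "a": "2024-05-01T10:00:00.000Z",
    "b": "2024-05-01T09:00:00.000Z",
}
changed = pages_to_process(pages, graph_times)
```

Comparing parsed, timezone-aware datetimes rather than raw strings keeps the check correct even if the two sources format their timestamps slightly differently.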
- Awesome-LLM-KG - A collection of papers and resources about unifying large language models (LLMs) and knowledge graphs (KGs).
- GraphRAG - Microsoft's GraphRAG research paper and implementation.
Knowledge Nexus is currently a personal project, but ideas and suggestions are welcome! Feel free to open an issue for discussion or submit a pull request with proposed changes.
Knowledge Nexus is designed with your privacy in mind. All data is stored locally on your machine. The only external service used is OpenAI's API for AI processing, which is subject to their privacy policy and data handling practices.
Empower your mind, uncover hidden insights, and navigate your personal sea of knowledge with unprecedented ease. Welcome to Knowledge Nexus, where your information comes to life!