Comments (1)
To resolve the RuntimeError about the duplicated primary key value "Summer founders program" when creating a KnowledgeGraphIndex from documents with the llama_index package, you need to handle duplicates in the document store. The upsert_triplet_and_node method of the KnowledgeGraphIndex class can be used to handle duplicates when adding nodes and triplets.
Here is an example of how to use this method:
from llama_index.legacy.indices.knowledge_graph.base import KnowledgeGraphIndex
from llama_index.legacy.schema import TextNode
from llama_index.legacy.service_context import ServiceContext
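# Create an empty KnowledgeGraphIndex. from_defaults() builds a ServiceContext
# with default settings; the bare ServiceContext() constructor requires arguments.
index = KnowledgeGraphIndex([], service_context=ServiceContext.from_defaults())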
# Define your triplets and nodes
triplets = [
    ("foo", "is", "bar"),
    ("hello", "is not", "world"),
    ("Jane", "is mother of", "Bob"),
]
nodes = [TextNode(text=str(tup)) for tup in triplets]
# Add triplets and nodes to the index
for triplet, node in zip(triplets, nodes):
    index.upsert_triplet_and_node(triplet, node)
This method ensures that duplicates are handled appropriately when adding triplets and nodes to the KnowledgeGraphIndex.
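The difference between a strict insert and an upsert can be shown with a toy in-memory store. This is only a sketch of the general idea, not llama_index code: `ToyNodeStore` and its methods are hypothetical names invented for illustration, and the real behavior depends on the graph store and docstore in use.

```python
class ToyNodeStore:
    """Hypothetical in-memory store keyed by a primary key (not a llama_index class)."""

    def __init__(self):
        self._rows = {}

    def insert(self, key, value):
        # Strict insert: refuse to overwrite an existing primary key.
        if key in self._rows:
            raise RuntimeError(f"duplicated primary key value {key!r}")
        self._rows[key] = value

    def upsert(self, key, value):
        # Upsert: create the row if it is missing, otherwise replace it.
        self._rows[key] = value

    def get(self, key):
        return self._rows[key]

    def __len__(self):
        return len(self._rows)


store = ToyNodeStore()
store.insert("Summer founders program", {"mentions": 1})

# A second strict insert with the same key fails.
try:
    store.insert("Summer founders program", {"mentions": 2})
    duplicate_rejected = False
except RuntimeError:
    duplicate_rejected = True

# An upsert with the same key replaces the row instead of failing.
store.upsert("Summer founders program", {"mentions": 2})
```

After this runs, the store still holds a single row for the key, now carrying the updated value.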
In the provided test cases, the upsert_triplet_and_node method is used to handle duplicates effectively. For example:
from unittest.mock import patch
from llama_index.legacy.indices.knowledge_graph.base import KnowledgeGraphIndex
from llama_index.legacy.schema import TextNode
from llama_index.legacy.service_context import ServiceContext
# mock_extract_triplets and the mock_service_context fixture are defined
# elsewhere in the library's test suite and are not shown here.
@patch.object(KnowledgeGraphIndex, "_extract_triplets", side_effect=mock_extract_triplets)
def test_build_kg_manual(_patch_extract_triplets, mock_service_context):
    index = KnowledgeGraphIndex([], service_context=mock_service_context)
    tuples = [
        ("foo", "is", "bar"),
        ("hello", "is not", "world"),
        ("Jane", "is mother of", "Bob"),
    ]
    nodes = [TextNode(text=str(tup)) for tup in tuples]
    for tup, node in zip(tuples, nodes):
        index.upsert_triplet_and_node(tup, node)
    # Verify the nodes and triplets
    docstore_nodes = index.docstore.get_nodes(list(index.index_struct.node_ids))
    table_chunks = {n.get_content() for n in docstore_nodes}
    assert len(table_chunks) == 3
    assert "('foo', 'is', 'bar')" in table_chunks
    assert "('hello', 'is not', 'world')" in table_chunks
    assert "('Jane', 'is mother of', 'Bob')" in table_chunks
    # Test inserting the same node twice
    node = TextNode(text=str(("foo", "is", "bar")), id_="test_node")
    index.upsert_triplet_and_node(tup, node)
    index.upsert_triplet_and_node(tup, node)
This test case demonstrates how the upsert_triplet_and_node method can be used to handle duplicates effectively, ensuring that the RuntimeError related to duplicated primary key values is resolved [1][2][3][4].
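If the triplets are assembled by hand before calling upsert_triplet_and_node, exact repeats can also be dropped up front. The helper below is a plain-Python sketch; `dedupe_triplets` is a hypothetical name, not part of llama_index. It only removes identical triplets, so an entity that appears in several different triplets is still passed through more than once.

```python
def dedupe_triplets(triplets):
    """Drop exact repeats while keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(triplets))


raw_triplets = [
    ("foo", "is", "bar"),
    ("hello", "is not", "world"),
    ("foo", "is", "bar"),  # exact repeat of the first triplet
]
unique_triplets = dedupe_triplets(raw_triplets)
```

Whether this is enough to avoid the error depends on how the underlying graph store treats repeated entity names.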