Comments (2)
To optimize the extract_titles() function and avoid redundant LLM calls when there is only one node per document, you can modify the function as follows:
# Module-level imports this method relies on
from typing import Dict, cast

from llama_index.core.prompts import PromptTemplate
from llama_index.core.schema import TextNode

async def extract_titles(self, nodes_by_doc_id: Dict) -> Dict:
    titles_by_doc_id = {}
    for key, nodes in nodes_by_doc_id.items():
        if len(nodes) == 1:
            # Single node: one LLM call with node_template, no combine step
            titles_by_doc_id[key] = await self.llm.apredict(
                PromptTemplate(template=self.node_template),
                context_str=cast(TextNode, nodes[0]).text,
            )
        else:
            # Multiple nodes: one candidate title per node, then combine them
            title_candidates = await self.get_title_candidates(nodes)
            combined_titles = ", ".join(title_candidates)
            titles_by_doc_id[key] = await self.llm.apredict(
                PromptTemplate(template=self.combine_template),
                context_str=combined_titles,
            )
    return titles_by_doc_id
This modification checks whether a document has only one node. If so, it generates the title with a single node_template call and skips the combine_template call, since there are no candidate titles to combine. Documents with multiple nodes still go through get_title_candidates() and the combine_template, as described in the TitleExtractor class [1][2].
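To see how many calls the single-node shortcut saves, here is a minimal, self-contained sketch that does not use llama_index at all. CountingLLM, NODE_TEMPLATE, COMBINE_TEMPLATE, and run_demo are hypothetical stand-ins invented for illustration; nodes are plain strings, and the fake LLM just fills in a template and counts how often it is called.

```python
import asyncio
from typing import Dict, List, Tuple

# Hypothetical templates standing in for node_template / combine_template.
NODE_TEMPLATE = "title({context_str})"
COMBINE_TEMPLATE = "combined({context_str})"


class CountingLLM:
    """Hypothetical fake LLM: fills in the template and counts calls."""

    def __init__(self) -> None:
        self.calls = 0

    async def apredict(self, template: str, context_str: str) -> str:
        self.calls += 1
        return template.format(context_str=context_str)


async def extract_titles(
    llm: CountingLLM, nodes_by_doc_id: Dict[str, List[str]]
) -> Dict[str, str]:
    titles: Dict[str, str] = {}
    for doc_id, texts in nodes_by_doc_id.items():
        if len(texts) == 1:
            # Single node: one call, nothing to combine.
            titles[doc_id] = await llm.apredict(NODE_TEMPLATE, texts[0])
        else:
            # Multiple nodes: one candidate per node, then one combine call.
            candidates = await asyncio.gather(
                *(llm.apredict(NODE_TEMPLATE, text) for text in texts)
            )
            titles[doc_id] = await llm.apredict(
                COMBINE_TEMPLATE, ", ".join(candidates)
            )
    return titles


def run_demo() -> Tuple[Dict[str, str], int]:
    llm = CountingLLM()
    docs = {"a": ["only chunk"], "b": ["chunk 1", "chunk 2"]}
    titles = asyncio.run(extract_titles(llm, docs))
    return titles, llm.calls


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the single-node document costs one call and the two-node document costs three. Without the len(texts) == 1 branch, the single-node document would cost two calls (one candidate plus one combine).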
from llama_index.
My comment was meant for the maintainers of the repo.