Comments (2)
Hey there, @syedfahimabrar! I'm here to help you with any bugs, questions, or contribution-related matters. Let me know how I can assist you!
To resolve the issue of the Mistral-7b model adding a duplicate BOS token, which produces a warning and exceeds the context window when using `TreeSummarize` with llama.cpp, you need to ensure that the BOS token is not added multiple times when text chunks are repacked. This can be managed by modifying the `messages_to_prompt` function to handle the BOS token correctly.

Here is the relevant part of the `messages_to_prompt` function that ensures the prompt starts with a system prompt and avoids duplicate BOS tokens:
```python
def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == "system":
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == "user":
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == "assistant":
            prompt += f"<|assistant|>\n{message.content}</s>\n"

    # ensure we start with a system prompt, insert blank if needed
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt

    # add final assistant prompt
    prompt = prompt + "<|assistant|>\n"

    return prompt
```
This function ensures that the prompt starts with a system prompt and avoids adding a duplicate BOS token by checking whether the prompt already starts with `"<|system|>\n"`. If it does not, it prepends `"<|system|>\n</s>\n"` to the prompt [1].
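As a quick sanity check, here is a minimal, self-contained run of the function above (using `SimpleNamespace` as a stand-in for llama_index's `ChatMessage`, so it runs without any dependencies):

```python
from types import SimpleNamespace

def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == "system":
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == "user":
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == "assistant":
            prompt += f"<|assistant|>\n{message.content}</s>\n"

    # ensure we start with a system prompt, insert blank if needed
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt

    # add final assistant prompt
    prompt = prompt + "<|assistant|>\n"
    return prompt

messages = [SimpleNamespace(role="user", content="Summarize the document.")]
prompt = messages_to_prompt(messages)
print(prompt)
```

Note that the blank system prompt is prepended exactly once, so repeated calls on already-formatted text would be the thing to guard against.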
Additionally, you can modify the `repack` method in the `PromptHelper` class to handle the BOS token correctly:
```python
class PromptHelper:
    def repack(self, summary_template, text_chunks):
        # Ensure the BOS token is not duplicated
        processed_chunks = []
        for chunk in text_chunks:
            if chunk.startswith(BOS_TOKEN):
                chunk = chunk[len(BOS_TOKEN):]
            processed_chunks.append(chunk)

        # Repack the chunks to fit the context window
        repacked_chunks = self._repack_logic(summary_template, processed_chunks)
        return repacked_chunks

    def _repack_logic(self, summary_template, text_chunks):
        # Implement the logic to repack text chunks to fit the context window
        # This is a placeholder for the actual repacking logic
        pass
```
Make sure to replace `BOS_TOKEN` with the actual BOS token used by the Mistral-7b model (typically `<s>`). This modification ensures that the BOS token is not duplicated during the repacking process, preventing both the warning and the context window overflow [2]. By implementing these changes, you should be able to avoid duplicate BOS tokens and the resulting context window overflow.
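The BOS-stripping step can be exercised on its own. A minimal sketch, assuming `<s>` as Mistral-7B's BOS token (the `strip_bos` helper is illustrative, not part of llama_index's API):

```python
# Assumed BOS token for Mistral-7B; verify against your model/tokenizer.
BOS_TOKEN = "<s>"

def strip_bos(chunks):
    # Remove a leading BOS token from each chunk so that only the
    # final assembled prompt carries it once.
    return [c[len(BOS_TOKEN):] if c.startswith(BOS_TOKEN) else c for c in chunks]

chunks = ["<s>first chunk", "second chunk"]
print(strip_bos(chunks))  # → ['first chunk', 'second chunk']
```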
To continue talking to Dosu, mention @dosu.
@syedfahimabrar set the tokenizer so that the token counting is correct:

```python
from transformers import AutoTokenizer
from llama_index.core import Settings

Settings.tokenizer = AutoTokenizer.from_pretrained(
    # I used llama2 here, use whatever model matches for you
    "NousResearch/Llama-2-7b-chat-hf"
)
```
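The reason this matters: llama_index counts tokens with whatever callable-like tokenizer is configured, and an inaccurate count is what lets repacked chunks overflow the real context window. A toy illustration with a hypothetical whitespace tokenizer (standing in for the HuggingFace one, which requires a model download):

```python
# Hypothetical tokenizer: splits on whitespace. A mismatched tokenizer
# undercounts or overcounts tokens, so chunk packing overshoots the
# model's actual context window.
def toy_tokenizer(text):
    return text.split()

def token_count(text, tokenizer=toy_tokenizer):
    # llama_index measures budget as len(tokenizer(text))
    return len(tokenizer(text))

print(token_count("count these four tokens"))  # → 4
```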