Comments (6)
To address the "Input too long for selected model" error you're encountering, even after implementing the recent fix and adjusting your chunk size to 400 with an overlap of 80, consider the following steps:
-
Check Model Capabilities: Ensure the model (
cohere.embed-english-v3
) supports the chunk sizes you're using. Different models have varying input size limitations. -
Adjust Chunk Parameters: If some chunks are still too large, reduce the chunk size further or adjust the overlap. This might help keep the chunks within the model's acceptable input range.
-
Explore Other Models: If adjusting sizes doesn't work, look into using a different model that can handle larger inputs. The
list_supported_models
method can help identify suitable alternatives. -
Customize Request Body: For the Cohere provider, you might need to customize how the request body is constructed, particularly how text is split and sent. This could involve adjusting settings like
truncate
. -
Review API and SDK Settings: Ensure your API requests are optimally configured, which might help with processing large inputs more efficiently.
-
Consult Documentation and Support: If issues persist, the official documentation or support channels for the models you're using may offer additional insights or solutions.
By following these steps, you should be better equipped to resolve the issue with input size when making embeddings of a PDF file.
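As a rough, untested sketch of the first two steps: the snippet below lists the Bedrock embedding models the integration knows about and splits a PDF into smaller chunks. The file name is hypothetical, and it assumes list_supported_models is exposed as a static method on BedrockEmbedding, as referenced above.

from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.bedrock import BedrockEmbedding

# Inspect which Bedrock embedding models the integration knows about.
print(BedrockEmbedding.list_supported_models())

# Load the PDF and split it into smaller chunks so each one stays well
# under the model's input limit.
documents = SimpleDirectoryReader(input_files=["report.pdf"]).load_data()  # hypothetical file
splitter = SentenceSplitter(chunk_size=400, chunk_overlap=80)
nodes = splitter.get_nodes_from_documents(documents)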
Details
Sources
- [Question]: ModelError: Your input is too long. Max input length is 4096 tokens, but you supplied 5441 tokens.
- Chunk size sometimes exceeds max model size
- llama-index-legacy/llama_index/legacy/embeddings/utils.py
- llama-index-legacy/llama_index/legacy/embeddings/bedrock.py
- llama-index-core/llama_index/core/service_context.py
- llama-index-legacy/llama_index/legacy/service_context.py
- docs/docs/module_guides/indexing/vector_store_index.md
- llama-index-integrations/indices/llama-index-indices-managed-vectara/llama_index/indices/managed/vectara/base.py
- llama-index-core/llama_index/core/indices/base.py
- llama-index-core/llama_index/core/indices/vector_store/base.py
- llama-index-integrations/embeddings/llama-index-embeddings-bedrock/llama_index/embeddings/bedrock/base.py
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
To continue the conversation, mention @dosu.
I suggest setting the chunk_size and similar parameters with Settings rather than with ServiceContext.
Here is the documentation about the migration: https://docs.llamaindex.ai/en/stable/module_guides/supporting_modules/service_context_migration/
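As a rough illustration of that suggestion (attribute names per current llama-index versions), the global Settings object can carry the chunking configuration:

from llama_index.core import Settings

# Global defaults picked up during index construction instead of ServiceContext.
Settings.chunk_size = 400
Settings.chunk_overlap = 80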
@gsuriano, Thanks for the suggestion. I will migrate to the Settings approach for sure; however, I am still encountering the same error.
Perhaps this limit is a separate constraint: from my testing, it begins to fail when the total character count of "texts" gets close to 8192 characters.
To effectively change embed_batch_size you should use the Settings object. That's why I also suggested using Settings for the chunk_size: I think the split is being done with llama_index's default values rather than with the chunk_size you set on the ServiceContext.
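A hedged sketch of that configuration, assuming the Bedrock integration accepts these constructor arguments (the model argument may be named model or model_name depending on the installed version):

from llama_index.core import Settings
from llama_index.embeddings.bedrock import BedrockEmbedding

Settings.embed_model = BedrockEmbedding(
    model_name="cohere.embed-english-v3",  # argument name may vary by package version
    embed_batch_size=1,                    # send one text per InvokeModel call
)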
@gsuriano, Thank you for the advice. I was able to change the embed_batch_size via Settings. However, the error still persists even when I set embed_batch_size=1, so apparently, this is not the problem.
Upon debugging, I found that the error starts to occur at this node:
{"texts": ["page_label: 2\ngenai_document_id: 1681b55d-a16b-4fe7-9aa5-a8edf60501b6\ngenai_tenant_id: 5853\ngenai_created_date_utc: 2024-05-09T16:06:43.000Z\ngenai_key1: value 1\ngenai_key2: value 2\ngenai_application: CE\ngenai_entitytype: Program\ngenai_entityid: 51137\ngenai_llmmodel: anthropic.claude-3-haiku-20240307-v1:0\ngenai_embeddingmodel: cohere.embed-english-v3\ngenai_embeddingdimension: 1024\n\nIT Change Management \nUTRGV 2 \n \nTable of Contents \nIntroduction ............................................................................................................................................................. 3 \nDefining Change .................................................................................................................................................... 3 \nRoles and Responsibilities ..............."], "input_type": "search_document", "truncate": "NONE"}
The "texts" field has only 848 characters. However, the error I encountered is:
File "C:\work\LambdaModules10\Lib\site-packages\llama_index\embeddings\bedrock\base.py", line 345, in _get_embedding
response = self._client.invoke_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\work\LambdaModules10\Lib\site-packages\botocore\client.py", line 565, in _api_call
return self._make_api_call(operation_name, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\work\LambdaModules10\Lib\site-packages\botocore\client.py", line 1021, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Input is too long for requested model.
If "texts" has fewer characters, it works without issues.
I would appreciate any suggestions.
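To narrow down whether the error comes from Bedrock itself rather than from llama_index's chunking, one option is a standalone reproduction with boto3. This is only a hypothetical sketch (region, credentials, and the placeholder text are assumptions) that sends the same kind of request body shown above:

import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")  # region is an assumption

body = json.dumps({
    "texts": ["<paste the ~848-character node text from above>"],  # placeholder
    "input_type": "search_document",
    "truncate": "NONE",  # the same field the llama_index integration sends
})

response = client.invoke_model(
    modelId="cohere.embed-english-v3",
    body=body,
    accept="application/json",
    contentType="application/json",
)
print(json.loads(response["body"].read()).keys())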
I have finally found the issue.
This is caused by the "truncate": "NONE" parameter in this code (llama_index\embeddings\bedrock\base.py):
request_body = json.dumps(
    {
        "texts": payload,
        "input_type": input_types[input_type],
        "truncate": "NONE",
    }
)
If you remove this truncate parameter, it works! The Amazon documentation says the default value is NONE, but explicitly specifying "NONE" still causes the "input too long" error. It appears to be an internal bug in Amazon Bedrock.
Anyway, since the default value is NONE, removing the truncate parameter should in theory give the same behavior. Source: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed.html
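For reference, a minimal sketch of the corrected construction, reusing the payload, input_types, and input_type variables from the snippet above and simply omitting the truncate field so Bedrock falls back to its documented default:

request_body = json.dumps(
    {
        "texts": payload,
        "input_type": input_types[input_type],
    }
)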
I have created a pull request with this fix.