Comments (7)
Embeddings cost $0.0004 / 1K tokens, so they are very cost-effective. (1 token is approx. 3/4 of a word.)
Using OpenAI's [tokenizer](https://platform.openai.com/tokenizer?view=bpe), you can see how tokens are calculated.
For example, a 50-page PDF is approx. 25,000 words, which is approx. 33,000 tokens, or about $0.013.
For context, embedding 1.2 million tokens costs approx. 0.48 USD.
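The arithmetic above can be checked with a short Python sketch. It assumes the $0.0004 / 1K tokens price and the 3/4-word-per-token rule of thumb quoted in this thread; the function names are made up for illustration, and real token counts depend on the tokenizer and the text.

```python
# Rough embedding cost estimate, using only the figures quoted in this thread.
PRICE_PER_1K_TOKENS_USD = 0.0004  # embedding price quoted above (may change)
WORDS_PER_TOKEN = 0.75            # rule of thumb: 1 token is about 3/4 of a word


def estimate_tokens(word_count: int) -> int:
    """Approximate the token count from a word count (heuristic, not exact)."""
    return round(word_count / WORDS_PER_TOKEN)


def embedding_cost_usd(token_count: int,
                       price_per_1k: float = PRICE_PER_1K_TOKENS_USD) -> float:
    """Cost in USD of embedding token_count tokens at the given price."""
    return token_count / 1000 * price_per_1k


# A 50-page PDF of roughly 25,000 words.
tokens = estimate_tokens(25_000)
cost = embedding_cost_usd(tokens)
print(f"{tokens} tokens -> ${cost:.4f}")
```

At these rates a single document costs on the order of a cent to embed, so the embedding step is unlikely to dominate the bill.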
With respect to Pinecone pricing, the free tier is very generous; for paid/production-level usage, see Pinecone's pricing page.
As for the gpt-4 calls, I will continue to review the intermediate steps and get back on that shortly.
from gpt4-pdf-chatbot-langchain.
Hello,
Could you give us an idea of the total costs for the 56-page document, given a single query:
- creating the embedding (a one time step)
- storing the embeddings in Pinecone
- matching a query of 250 tokens vs. the embedding: costs of ADA, and costs of the query to Pinecone
- the first query to gpt4: chat history + the query
- the second query to gpt4: standalone question + relevant documents
It seems like a lot of queries, it would be very helpful to have an idea about these costs.
Btw, thank you for this tutorial!
Let me look into this and get back to you shortly.
This is a fantastic idea!
Maybe adding a small counter of dollars spent in the front-end can save you from a heart attack when the credit card bill rolls in.
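A minimal sketch of the bookkeeping behind such a counter, in Python for brevity (the project's front-end would need its own port). `SpendCounter` and the `"embedding"` label are hypothetical names; only the embedding price comes from this thread, and GPT-4 calls would have to be recorded with their own per-token prices, which are not given here.

```python
from dataclasses import dataclass, field


@dataclass
class SpendCounter:
    """Accumulates estimated API spend in USD, grouped by a caller-chosen label."""
    totals: dict[str, float] = field(default_factory=dict)

    def record(self, label: str, tokens: int, price_per_1k_usd: float) -> float:
        """Add the cost of one API call and return that call's cost."""
        cost = tokens / 1000 * price_per_1k_usd
        self.totals[label] = self.totals.get(label, 0.0) + cost
        return cost

    @property
    def total_usd(self) -> float:
        """Total estimated spend across all labels."""
        return sum(self.totals.values())


counter = SpendCounter()
# One-time ingestion of a ~33,000-token document, then one 250-token query,
# both at the embedding price quoted earlier in this thread.
counter.record("embedding", tokens=33_000, price_per_1k_usd=0.0004)
counter.record("embedding", tokens=250, price_per_1k_usd=0.0004)
print(f"Estimated spend so far: ${counter.total_usd:.4f}")
```

The token counts would come from the usage data returned with each API response; the figure is an estimate and will not replace the provider's billing dashboard.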
Regarding Pinecone pricing, it would be possible to switch to pgvector for self-hosting.
Also curious about this! Not sure how much money I'd burn through if I used this.
> About Pinecone pricing, it would be possible to switch to pgvector for a self-hosting.

I think there are a good number of vector database alternatives referenced by OpenAI in the chatgpt-retrieval-plugin repository. They didn't mention pgvector, but I wonder if it's possible to plug Weaviate or Redis in here.
Hi, @databill86! I'm Dosu, and I'm here to help the gpt4-pdf-chatbot-langchain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you are requesting information on the total costs associated with processing a 56-page PDF document with one query. There have been discussions about the cost-effectiveness of embeddings and the pricing of Pinecone, as well as a suggestion to switch to pgvector for self-hosting. However, the issue remains unresolved.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the gpt4-pdf-chatbot-langchain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the project!