Comments (10)
I'm using your embedbase library in Python. I wonder how I can use the createContext API, like in the embedbase.xyz playground menu. By the way, embedbase is very helpful to me. Thank you very much.
@doanhieu9797 thanks a lot!
Sure! Here's the "createContext" function from the JS SDK translated to Python:
import json

import requests


def create_context(embedbase_api_url, embedbase_key, dataset, query, options=None):
    # Return the stored text of the entries most similar to `query`.
    if options is None:
        options = {}
    limit = options.get("limit", 5)
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {embedbase_key}",
    }
    search_url = f"{embedbase_api_url}/{dataset}/search"
    response = requests.post(search_url, headers=headers, data=json.dumps({"query": query, "top_k": limit}))
    # Fail loudly on HTTP errors instead of raising a confusing KeyError below.
    response.raise_for_status()
    data = response.json()
    return [similarity["data"] for similarity in data["similarities"]]


# Usage
embedbase_api_url = "https://your_embedbase_url/v1"
embedbase_key = "your_embedbase_key"
dataset = "your_dataset_name"
query = "your_query"
context_data = create_context(embedbase_api_url, embedbase_key, dataset, query, options={"limit": 5})
print(context_data)
Just replace "your_embedbase_url", "your_embedbase_key", "your_dataset_name", and "your_query" with the appropriate values.
FYI, we plan to open source the playground soon so you can take a look at the code, and to turn it into a one-line installable React component.
Can you tell me how this code helps you, please?
If you run into prompt size issues, don't hesitate to follow up; I can help with that part too.
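For the prompt size point, here is a minimal sketch of one way to cap how much retrieved context goes into a prompt. It assumes a plain character budget as a rough stand-in for a token limit; `fit_context` and `build_prompt` are hypothetical helper names and are not part of the Embedbase SDK.

```python
def fit_context(chunks, max_chars=4000, separator="\n---\n"):
    """Keep chunks in order until adding the next one would exceed max_chars."""
    kept = []
    used = 0
    for chunk in chunks:
        # The separator is only paid for from the second chunk onward.
        extra = len(chunk) + (len(separator) if kept else 0)
        if used + extra > max_chars:
            break
        kept.append(chunk)
        used += extra
    return separator.join(kept)


def build_prompt(question, chunks, max_chars=4000):
    """Assemble a prompt from the question and the chunks that fit the budget."""
    context = fit_context(chunks, max_chars=max_chars)
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The list returned by `create_context` can be passed as `chunks`. Since search results are typically ordered by similarity, truncating from the end drops the least relevant entries first.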
from embedbase.
Thanks. It would be great if you open source the playground soon. I think I'll build an AI app to read and analyze my own docs.
By the way, can you please tell me whether using splitText to shorten the context will affect ChatGPT's answer?
@doanhieu9797 cool! We actually built documentation that is connected to GPT-4.
This is the script that sends all documentation files to Embedbase:
https://github.com/different-ai/embedbase-docs/blob/main/scripts/sync.ts
It runs on every git push to the main branch: https://github.com/different-ai/embedbase-docs/blob/main/.github/workflows/index.yaml
We are experimenting with different ways of splitting the text. The biggest benefit is that it avoids going over the prompt size limit. How you split the text depends on the experience you want to create.
For example, if a user asks a documentation assistant "how can I run an eth node on raspberry pi?", it will search for similar information in the chosen dataset(s) and feed it to GPT. I recommend experimenting with different split sizes to see what works best for you. In the future, we want to create an easy numerical way to evaluate different strategies.
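To make "experimenting with different split sizes" concrete, here is a minimal sketch of a fixed-size splitter with overlap. It is not the SDK's splitText; `split_by_chars` is a hypothetical helper that counts characters rather than tokens, which is an assumption made to keep the example dependency-free.

```python
def split_by_chars(text, max_chars=500, overlap=50):
    """Split text into chunks of at most max_chars, each sharing `overlap`
    characters with the previous chunk."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in [0, max_chars)")
    chunks = []
    step = max_chars - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        # Stop once a chunk has reached the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks


# Compare how many chunks different sizes produce for the same document.
document = "word " * 400
for size in (200, 500, 1000):
    print(size, len(split_by_chars(document, max_chars=size, overlap=20)))
```

Smaller chunks give more precise matches but less surrounding context per result; larger chunks do the opposite and use up the prompt budget faster.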
@louis030195 cool! Many thanks. I'm back after some busy days. I'm wondering how to update data for documents already imported into dataset_ids. Can you please let me know?
we don't support updating data at the moment - we're append-only.
our recommendation is to create a new dataset with the updated data and then query only that dataset. under the hood, we make sure that re-creating datasets is performant and efficient.
PS: this is because embeddings are rarely retrieved by ID, unlike rows in SQL and NoSQL DBs, so we believe it doesn't make sense to retrieve a single embedding by ID and update it.
PPS: we understand this can be a bit of a hassle, and we are open to implementing this in the future if we hear a compelling use case.
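As an illustration of the "create a new dataset" workaround, here is a minimal sketch that picks the next versioned dataset name. The `-vN` naming scheme and the `next_dataset_name` helper are assumptions made for this example; they are not an Embedbase convention or API.

```python
import re


def next_dataset_name(base, existing):
    """Return the next '<base>-vN' name, given the dataset names already in use."""
    pattern = re.compile(rf"^{re.escape(base)}-v(\d+)$")
    versions = [int(m.group(1)) for name in existing if (m := pattern.match(name))]
    return f"{base}-v{max(versions, default=0) + 1}"


# Re-import everything (including the edited entry) into the new dataset,
# then point searches at the new name only.
current = next_dataset_name("docs", ["docs-v1", "docs-v2"])
print(current)  # docs-v3
```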
@hotkartoffel I know, but this is really necessary: when I import a lot of data into a dataset and only want to edit one entry, I have to delete the whole dataset and re-import it. That's not reasonable at all.
@doanhieu9797 we're ready to update our beliefs there.
just a few questions:
a) could you expand a bit about your use of the api (what are you storing, how do you use it, would you mind sharing a sample entry that you store?)
b) how would you like to retrieve & update data, in pseudocode?
you can also reach me on discord (hotkartoffel.eth#2160) or schedule a call if that makes it any easier
@doanhieu9797 hey, just added an update endpoint, hope that helps :)
https://docs.embedbase.xyz/interface#updating-data