
Comments (10)

louis030195 avatar louis030195 commented on September 16, 2024 3

I'm using your embedbase library in Python. I wonder how I can use the createContext API like in the embedbase.xyz playground menu. By the way, embedbase is very helpful to me. Thank you very much.

@doanhieu9797 thanks a lot!

Sure! Here's the "createContext" function from the JS SDK translated to Python:

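import requests

def create_context(embedbase_api_url, embedbase_key, dataset, query, options=None):
    """Search a dataset and return the text of the most similar entries."""
    if options is None:
        options = {}
    # Number of results to return; defaults to 5
    limit = options.get("limit", 5)
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {embedbase_key}"
    }
    search_url = f"{embedbase_api_url}/{dataset}/search"

    # json= serializes the body, so json.dumps is not needed
    response = requests.post(search_url, headers=headers, json={"query": query, "top_k": limit})
    # Fail loudly on HTTP errors (e.g. a wrong key) instead of on a missing key below
    response.raise_for_status()
    data = response.json()

    # Keep only the text of each match
    return [similarity["data"] for similarity in data["similarities"]]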

# Usage
embedbase_api_url = "https://your_embedbase_url/v1"
embedbase_key = "your_embedbase_key"
dataset = "your_dataset_name"
query = "your_query"

context_data = create_context(embedbase_api_url, embedbase_key, dataset, query, options={"limit": 5})
print(context_data)

Just replace "your_embedbase_url", "your_embedbase_key", "your_dataset_name", and "your_query" with the appropriate values.

FYI, we plan to open source the playground soon so you can take a look at the code, and to turn it into a one-line installable React component.
Can you tell me how this code helps you, please?

If you have prompt size issues, don't hesitate to follow up; I can help with that part too.
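
On the prompt size point, here is a minimal sketch of one way to keep the retrieved context under a budget before building a prompt. The names fit_context and build_prompt, the separator, and the 4000-character default are all hypothetical and not part of Embedbase; character count is only a rough stand-in for tokens.

```python
def fit_context(chunks, max_chars=4000, separator="\n---\n"):
    """Join chunks in ranked order, stopping before max_chars is exceeded.

    Assumes `chunks` is a list of strings sorted by relevance, such as the
    list returned by create_context above. Characters are used as a rough
    proxy for tokens.
    """
    kept = []
    used = 0
    for chunk in chunks:
        # The separator only costs something once there is a previous chunk
        cost = len(chunk) + (len(separator) if kept else 0)
        if used + cost > max_chars:
            break
        kept.append(chunk)
        used += cost
    return separator.join(kept)


def build_prompt(question, chunks, max_chars=4000):
    """Build a question-answering prompt from the trimmed context (hypothetical template)."""
    context = fit_context(chunks, max_chars=max_chars)
    return (
        "Answer the question based on the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

You could call build_prompt(query, context_data) with the values from the usage example above and send the result to the model of your choice.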

from embedbase.

doanhieu9797 avatar doanhieu9797 commented on September 16, 2024

Thanks. It would be great if you open source the playground soon. I think I'll make an AI app to read and analyze my own docs.


doanhieu9797 avatar doanhieu9797 commented on September 16, 2024

By the way can you please tell me if using splitText to shorten the context will affect chatgpt's answer?


louis030195 avatar louis030195 commented on September 16, 2024

@doanhieu9797 cool! We actually created documentation connected to GPT-4.
This is the script that sends all documentation files to Embedbase:
https://github.com/different-ai/embedbase-docs/blob/main/scripts/sync.ts
It runs on every git push to the main branch: https://github.com/different-ai/embedbase-docs/blob/main/.github/workflows/index.yaml


https://docs.embedbase.xyz/


louis030195 avatar louis030195 commented on September 16, 2024

By the way can you please tell me if using splitText to shorten the context will affect chatgpt's answer?

We are experimenting with different ways of splitting the text. The biggest value is avoiding going over the prompt size limit. How you split the text depends on the experience you want to create.

For example, if the user asks a documentation dataset "how can I run an eth node on raspberry pi?", it will search for similar information in the chosen dataset(s) and feed it to GPT. I recommend experimenting with different split sizes to see what works best for you. We want to create an easy numerical way to evaluate different strategies in the future.
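
To make "experimenting with different split sizes" concrete, here is a rough sketch of a fixed-size splitter with overlap. It is not the SDK's splitText implementation; the function name split_text and its max_chars and overlap parameters are assumptions for illustration, and it counts characters rather than tokens.

```python
def split_text(text, max_chars=500, overlap=50):
    """Split text into chunks of at most max_chars characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at
    a boundary still appears whole in at least one chunk. This is a simple
    character-based sketch, not the Embedbase splitText implementation.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be >= 0 and smaller than max_chars")
    if not text:
        return []

    chunks = []
    start = 0
    step = max_chars - overlap
    while True:
        chunks.append(text[start:start + max_chars])
        # Stop once the current chunk reaches the end of the text
        if start + max_chars >= len(text):
            break
        start += step
    return chunks
```

Trying a few values of max_chars on the same documents and comparing the answers you get back is a cheap way to see which size suits your data.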


doanhieu9797 avatar doanhieu9797 commented on September 16, 2024

@louis030195 cool! Many thanks. I'm back after some busy days. I am wondering how to update documents that were already imported into dataset_ids. Can you please let me know?


ashgansh avatar ashgansh commented on September 16, 2024

@doanhieu9797

we don't support updating data at the moment - we're append only.

our recommendation is to create a new dataset and then query only the new dataset with the updated data. under the hood, we make sure that re-creating datasets is performant and efficient.

PS: this is because embeddings are rarely retrieved by ID, contrary to SQL and NoSQL DBs; we believe it doesn't make sense to retrieve a single embedding through an ID and update it.

PPS: we understand this can be a bit of a hassle, and are open to implementing this in the future if we hear a compelling use case.
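
a rough sketch of the "new dataset, then query only the new one" workflow, on the client side. everything here is hypothetical: the DatasetPointer class, the "-v<N>" naming scheme, and the upload callable (which stands in for whatever code you use to add documents to a dataset) are illustration only, not part of the Embedbase API.

```python
class DatasetPointer:
    """Track which versioned dataset name queries should use.

    Hypothetical helper: dataset names follow the pattern "<base>-v<N>".
    """

    def __init__(self, base, version=1):
        self.base = base
        self.version = version

    @property
    def active(self):
        # Name of the dataset that searches should target right now
        return f"{self.base}-v{self.version}"

    def rebuild(self, documents, upload):
        """Upload all documents into a fresh dataset, then switch to it.

        `upload(dataset_name, documents)` is a caller-supplied function.
        The pointer only moves after the upload returns, so a failed upload
        leaves queries on the previous dataset.
        """
        new_name = f"{self.base}-v{self.version + 1}"
        upload(new_name, documents)
        self.version += 1
        return new_name
```

searches would then always use pointer.active as the dataset name, so readers never see a half-rebuilt dataset.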


doanhieu9797 avatar doanhieu9797 commented on September 16, 2024


@hotkartoffel I know, but this is really necessary: when I import a lot of data into a dataset and only want to edit one entry, I have to delete the whole dataset and re-import it. That's not reasonable at all.


ashgansh avatar ashgansh commented on September 16, 2024

@doanhieu9797 we're ready to update our beliefs there.

just a few questions:
a) could you expand a bit about your use of the api (what are you storing, how do you use it, would you mind sharing a sample entry that you store?)
b) how would you like to retrieve & update data in pseudo code

you can also reach me on discord (hotkartoffel.eth#2160) or schedule a call if that makes it any easier


louis030195 avatar louis030195 commented on September 16, 2024

@doanhieu9797 hey, just added an update endpoint, hope that helps :)

https://docs.embedbase.xyz/interface#updating-data

