
qdrant-client's Introduction

Qdrant

Python Client library for the Qdrant vector search engine.

PyPI version OpenAPI Docs Apache 2.0 License Discord Roadmap 2023

Python Qdrant Client

Client library and SDK for the Qdrant vector search engine. Python Client API Documentation is available here.

The library contains type definitions for the entire Qdrant API and allows making both sync and async requests.

The client allows you to call all Qdrant API methods directly. It also provides helper methods for frequently used operations, e.g. initial collection upload.

See QuickStart for more details!

Installation

pip install qdrant-client

Features

  • Type hints for all API methods
  • Local mode - use the same API without running a server
  • REST and gRPC support
  • Minimal dependencies
  • Extensive Test Coverage

Local mode


The Python client allows you to run the same code in local mode without running a Qdrant server.

Simply initialize client like this:

from qdrant_client import QdrantClient

client = QdrantClient(":memory:")
# or
client = QdrantClient(path="path/to/db")  # Persists changes to disk

Local mode is useful for development, prototyping and testing.

  • You can use it to run tests in your CI/CD pipeline.
  • Run it in Colab or Jupyter Notebook, no extra dependencies required. See an example
  • When you need to scale, simply switch to server mode.

Fast Embeddings + Simpler API

pip install qdrant-client[fastembed]

FastEmbed is a library for creating fast vector embeddings on CPU. It is based on ONNX Runtime and allows running inference on CPU with GPU-like performance.

Qdrant Client can use FastEmbed to create embeddings and upload them to Qdrant. This simplifies the API and makes it more intuitive.

from qdrant_client import QdrantClient

# Initialize the client
client = QdrantClient(":memory:")  # or QdrantClient(path="path/to/db")

# Prepare your documents, metadata, and IDs
docs = ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"]
metadata = [
    {"source": "Langchain-docs"},
    {"source": "Llama-index-docs"},
]
ids = [42, 2]

# Use the new add method
client.add(
    collection_name="demo_collection",
    documents=docs,
    metadata=metadata,
    ids=ids
)

search_result = client.query(
    collection_name="demo_collection",
    query_text="This is a query document"
)
print(search_result)

Connect to Qdrant server

To connect to a Qdrant server, simply specify the host and port:

from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
# or
client = QdrantClient(url="http://localhost:6333")

You can run Qdrant server locally with docker:

docker run -p 6333:6333 qdrant/qdrant:latest

See more launch options in Qdrant repository.

Connect to Qdrant cloud

You can register and use Qdrant Cloud to get a free tier account with 1GB RAM.

Once you have your cluster and API key, you can connect to it like this:

from qdrant_client import QdrantClient

qdrant_client = QdrantClient(
    url="https://xxxxxx-xxxxx-xxxxx-xxxx-xxxxxxxxx.us-east.aws.cloud.qdrant.io:6333",
    api_key="<your-api-key>",
)

Examples

Create a new collection

from qdrant_client.models import Distance, VectorParams

client.recreate_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(size=100, distance=Distance.COSINE),
)

Insert vectors into a collection

import numpy as np
from qdrant_client.models import PointStruct

vectors = np.random.rand(100, 100)
# NOTE: consider splitting the data into chunks to avoid hitting the server's payload size limit
# or use `upload_collection` or `upload_points` methods which handle this for you
# WARNING: uploading points one-by-one is not recommended due to requests overhead
client.upsert(
    collection_name="my_collection",
    points=[
        PointStruct(
            id=idx,
            vector=vector.tolist(),
            payload={"color": "red", "rand_number": idx % 10}
        )
        for idx, vector in enumerate(vectors)
    ]
)

Search for similar vectors

query_vector = np.random.rand(100)
hits = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    limit=5  # Return 5 closest points
)

Search for similar vectors with filtering condition

from qdrant_client.models import Filter, FieldCondition, Range

hits = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    query_filter=Filter(
        must=[  # These conditions are required for search results
            FieldCondition(
                key='rand_number',  # Condition based on values of `rand_number` field.
                range=Range(
                    gte=3  # Select only those results where `rand_number` >= 3
                )
            )
        ]
    ),
    limit=5  # Return 5 closest points
)

See more examples in our Documentation!

gRPC

To enable (typically, much faster) collection uploading with gRPC, use the following initialization:

from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", grpc_port=6334, prefer_grpc=True)

Async client

Starting from version 1.6.1, all Python client methods are available in async versions.

To use it, just import AsyncQdrantClient instead of QdrantClient:

from qdrant_client import AsyncQdrantClient, models
import numpy as np
import asyncio

async def main():
    # Your async code using AsyncQdrantClient goes here
    client = AsyncQdrantClient(url="http://localhost:6333")

    await client.create_collection(
        collection_name="my_collection",
        vectors_config=models.VectorParams(size=10, distance=models.Distance.COSINE),
    )

    await client.upsert(
        collection_name="my_collection",
        points=[
            models.PointStruct(
                id=i,
                vector=np.random.rand(10).tolist(),
            )
            for i in range(100)
        ],
    )

    res = await client.search(
        collection_name="my_collection",
        query_vector=np.random.rand(10).tolist(),  # type: ignore
        limit=10,
    )

    print(res)

asyncio.run(main())

Both gRPC and REST API are supported in async mode. More examples can be found here.

Development

This project uses git hooks to run code formatters.

Install pre-commit with pip3 install pre-commit and set up hooks with pre-commit install.

pre-commit requires python>=3.8


qdrant-client's Issues

Container shuts down automatically

Hi, I installed Qdrant on a fresh Ubuntu 18.04 install, and when I ran my previous search code, it started giving me an error.

As soon as I click execute in the FastAPI docs, the container automatically shuts down. The overall output of the container is:

(base) ahmad@ahmad:~$ sudo docker run -p 6333:6333   -v $(pwd)/qdrant_storage:/qdrant/storage   generall/qdrant
[sudo] password for ahmad: 
[2021-10-05T17:48:27Z INFO  wal::segment] Segment { path: "./storage/collections/reddit_videos/wal/open-2", entries: 0, space: (8/33554432) }: opened
[2021-10-05T17:48:27Z INFO  wal::segment] Segment { path: "./storage/collections/reddit_videos/wal/open-1", entries: 5, space: (824832/33554432) }: opened
[2021-10-05T17:48:27Z INFO  wal] Wal { path: "./storage/collections/reddit_videos/wal", segment-count: 1, entries: [0, 5)  }: opened
[2021-10-05T17:48:28Z INFO  qdrant] loaded collection: reddit_videos
[2021-10-05T17:48:28Z INFO  actix_server::builder] Starting 8 workers
[2021-10-05T17:48:28Z INFO  actix_server::builder] Starting "actix-web-service-0.0.0.0:6333" service on 0.0.0.0:6333

The error message is:

500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/uvicorn/protocols/http/h11_impl.py", line 373, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
    return await self.app(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/fastapi/applications.py", line 208, in __call__
    await super().__call__(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/routing.py", line 580, in __call__
    await route.handle(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/routing.py", line 241, in handle
    await self.app(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/routing.py", line 52, in app
    response = await func(request)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/fastapi/routing.py", line 227, in app
    dependant=dependant, values=values, is_coroutine=is_coroutine
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/fastapi/routing.py", line 161, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/concurrency.py", line 40, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/ahmad/Desktop/WortelAI/ElasticSearch/scripts/Qdrant_organized/QdrantServer/api.py", line 15, in search
    res = searcher.search(pos=text, neg=neg_text, subreddit=subreddit)
  File "/home/ahmad/Desktop/WortelAI/ElasticSearch/scripts/Qdrant_organized/QdrantServer/search.py", line 29, in search
    top=5,
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/qdrant_client/qdrant_client.py", line 224, in search
    params=search_params
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/qdrant_openapi_client/api/points_api.py", line 395, in search_points
    search_request=search_request,
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/qdrant_openapi_client/api/points_api.py", line 248, in _build_for_search_points
    json=body,
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py", line 59, in request
    return self.send(request, type_)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py", line 76, in send
    response = self.middleware(request, self.send_inner)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py", line 179, in __call__
    return call_next(request)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py", line 88, in send_inner
    raise ResponseHandlingException(e)
qdrant_openapi_client.exceptions.ResponseHandlingException: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE

It does not even work for a simple search or anything else. What could be the potential issue and fix?

Connection Error: Cannot assign requested address

I am using langchain to test out qdrant. Langchain has a from_texts method shown here which makes the qdrant client connection and then tries to recreate a collection with client.recreate_collection. I stepped through the code and the initial connection works fine, but for some reason recreate_collection gets a "Cannot assign requested address" error. It doesn't appear the langchain code is wrong based on the qdrant examples I've seen.

I am running qdrant via docker compose, exposing port 6333.

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/usr/local/lib/python3.9/site-packages/httpcore/backends/sync.py", line 94, in connect_tcp
    sock = socket.create_connection(
  File "/usr/local/lib/python3.9/socket.py", line 844, in create_connection
    raise err
  File "/usr/local/lib/python3.9/socket.py", line 832, in create_connection
    sock.connect(sa)
OSError: [Errno 99] Cannot assign requested address

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 218, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_sync/connection_pool.py", line 253, in handle_request
    raise exc
  File "/usr/local/lib/python3.9/site-packages/httpcore/_sync/connection_pool.py", line 237, in handle_request
    response = connection.handle_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_sync/connection.py", line 86, in handle_request
    raise exc
  File "/usr/local/lib/python3.9/site-packages/httpcore/_sync/connection.py", line 63, in handle_request
    stream = self._connect(request)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_sync/connection.py", line 111, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/httpcore/backends/sync.py", line 94, in connect_tcp
    sock = socket.create_connection(
  File "/usr/local/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc)
httpcore.ConnectError: [Errno 99] Cannot assign requested address

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 95, in send_inner
    response = self._client.send(request)
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 908, in send
    response = self._send_handling_auth(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 936, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 973, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1009, in _send_single_request
    response = transport.handle_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 218, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/local/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 99] Cannot assign requested address

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/bhl_nlp/embeddings.py", line 51, in <module>
    qdrant = Qdrant.from_texts(
  File "/usr/local/lib/python3.9/site-packages/langchain/vectorstores/qdrant.py", line 190, in from_texts
    client.recreate_collection(
  File "/usr/local/lib/python3.9/site-packages/qdrant_client/qdrant_client.py", line 1178, in recreate_collection
    self.delete_collection(collection_name)
  File "/usr/local/lib/python3.9/site-packages/qdrant_client/qdrant_client.py", line 1116, in delete_collection
    return self.http.collections_api.delete_collection(
  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api/collections_api.py", line 658, in delete_collection
    return self._build_for_delete_collection(
  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api/collections_api.py", line 264, in _build_for_delete_collection
    return self.api_client.request(
  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 68, in request
    return self.send(request, type_)
  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 85, in send
    response = self.middleware(request, self.send_inner)
  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 188, in __call__
    return call_next(request)
  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 97, in send_inner
    raise ResponseHandlingException(e)
qdrant_client.http.exceptions.ResponseHandlingException: [Errno 99] Cannot assign requested address

Cannot reference a Qdrant instance on Azure Web App - [Errno -2] Name or service not known

I've deployed the Qdrant service in an Azure Web App mapping the port 80 to the port 3666 the docker container expects.

If I query the service using simple get and post requests, all works fine. For example, running this:

import requests
r = requests.get('https://<myservice>.azurewebsites.net/collections')
r.json()

I get the following result:

{'result': {'collections': [{'name': 'test_collection'}]},
'status': 'ok',
'time': 3.6e-06}

Now, I've installed the Qdrant client (version 1.0.1). I tried the following:

from qdrant_client import QdrantClient
client = QdrantClient(host="https://<myservice>.azurewebsites.net/", port=80)
collection_info = client.get_collections()
collection_info

and I'm getting the following error:


gaierror Traceback (most recent call last)
File /anaconda/envs/pdfparser/lib/python3.9/site-packages/httpcore/_exceptions.py:10, in map_exceptions(map)
9 try:
---> 10 yield
11 except Exception as exc: # noqa: PIE786

File /anaconda/envs/pdfparser/lib/python3.9/site-packages/httpcore/backends/sync.py:94, in SyncBackend.connect_tcp(self, host, port, timeout, local_address)
93 with map_exceptions(exc_map):
---> 94 sock = socket.create_connection(
95 address, timeout, source_address=source_address
96 )
97 return SyncStream(sock)

File /anaconda/envs/pdfparser/lib/python3.9/socket.py:823, in create_connection(address, timeout, source_address)
822 err = None
--> 823 for res in getaddrinfo(host, port, 0, SOCK_STREAM):
824 af, socktype, proto, canonname, sa = res

File /anaconda/envs/pdfparser/lib/python3.9/socket.py:954, in getaddrinfo(host, port, family, type, proto, flags)
953 addrlist = []
--> 954 for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
955 af, socktype, proto, canonname, sa = res

gaierror: [Errno -2] Name or service not known

Implement a retry mechanism for initial upload

When the upload of a batch fails, say due to a network issue during the initial upload to a remote Qdrant server, it effectively cancels the rest of the whole upload task. In this case, the user needs to either start the task from scratch, or figure out the last uploaded point and resume from there. Both options are time-consuming and not developer-friendly. Instead, the initial upload methods should be able to retry for a configurable number of attempts before exiting with an exception.
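Until such a mechanism is built in, a retry wrapper is a few lines of standard-library Python. This is a sketch, not the client's API: `upload_batch`, the batch format, and the backoff parameters are all illustrative.

```python
import time

def upload_with_retry(upload_batch, batches, max_retries=3, backoff=1.0):
    """Upload each batch, retrying transient failures with exponential backoff.

    `upload_batch` is a placeholder callable; in practice it would wrap one
    chunked call such as client.upsert(collection_name=..., points=batch).
    """
    for batch in batches:
        for attempt in range(max_retries):
            try:
                upload_batch(batch)
                break  # batch uploaded, move on to the next one
            except Exception:
                if attempt == max_retries - 1:
                    raise  # give up after the configured number of attempts
                time.sleep(backoff * (2 ** attempt))  # wait before retrying
```

A resumable variant would additionally record the index of the last successfully uploaded batch.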

Create github actions, utils sub-module

Hello,

A couple of comments.

Through GitHub Actions,
we can launch a Docker container with a Qdrant server and test the client end-to-end (i.e. integration tests).
I can provide it through a PR if you are interested.

  1. It would be useful to create a sub-module:
    qdrant_client.contrib

    So the community can PR into it and add utilities (like a faiss --> Qdrant conversion).

    I can provide the faiss conversion util, if you are interested.

sys:1: RuntimeWarning: coroutine 'PointsStub.upsert' was never awaited

I tried to run upsert in an async function.

async def upsert(client, path):
    gen_filename(path)
    kind = filetype.guess(path)
    debug(kind.extension)
    debug(kind.mime)
    np = face_recognition.load_image_file(path)
    vector = face_recognition.face_encodings(np)[0]
    client.upsert(
        collection_name="pig",
        points=[
            models.PointStruct(
                id=1,
                payload={
                    "filename": path,
                },
                vector=vector.tolist(),
            ),
        ]
    )


async def main():
    load_dotenv()

    client = QdrantClient(host="localhost",
                          grpc_port=6334, prefer_grpc=True)

    await upsert(client, "images/1.jpg")

if __name__ == "__main__":
    asyncio.run(main())

it raise Exception

Traceback (most recent call last):
  File "/srv/project/face/main.py", line 61, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/srv/project/face/main.py", line 57, in main
    await upsert(client, "images/1.jpg")
  File "/srv/project/face/main.py", line 37, in upsert
    client.upsert(
  File "/home/kula/.cache/pypoetry/virtualenvs/face-G3AFUNPz-py3.10/lib/python3.10/site-packages/qdrant_client/qdrant_client.py", line 566, in upsert
    asyncio.get_event_loop().run_until_complete(self._grpc_points_client.upsert(
  File "/usr/lib/python3.10/asyncio/base_events.py", line 622, in run_until_complete
    self._check_running()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 582, in _check_running
    raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running
sys:1: RuntimeWarning: coroutine 'PointsStub.upsert' was never awaited

Is it compatible with async functions?

qdrant-client = "^0.9.0"

Timeout in python client working but still returning timeout error

Versions

qdrant_client version: 11.1
qdrant version: 0.11.3
python: 3.9

Summary

I have been getting timeouts when creating collections. I passed timeout=60 to the recreate_collection method and it still appears to time out after about 10 seconds. See the stack trace below. However, when I look at the logs in the cluster, I see it did successfully pass the timeout parameter to the PUT request: [2022-11-21T20:01:05.375Z INFO actix_web::middleware::logger] 10.2.137.204 "PUT /collections/test_v01_32_16_10?timeout=60 HTTP/1.1" 200 72 "-" "python-httpx/0.23.0" 60.814669

Timeout error from python client

Traceback (most recent call last):
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
    yield
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/backends/sync.py", line 26, in read
    return self._sock.recv(max_bytes)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpx/_transports/default.py", line 218, in handle_request
    resp = self._pool.handle_request(req)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/_sync/connection_pool.py", line 253, in handle_request
    raise exc
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/_sync/connection_pool.py", line 237, in handle_request
    response = connection.handle_request(request)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/_sync/connection.py", line 90, in handle_request
    return self._connection.handle_request(request)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 105, in handle_request
    raise exc
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 84, in handle_request
    ) = self._receive_response_headers(**kwargs)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 148, in _receive_response_headers
    event = self._receive_event(timeout=timeout)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 177, in _receive_event
    data = self._network_stream.read(
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/backends/sync.py", line 26, in read
    return self._sock.recv(max_bytes)
  File "/opt/homebrew/Cellar/[email protected]/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpcore/_exceptions.py", line 12, in map_exceptions
    raise to_exc(exc)
httpcore.ReadTimeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 95, in send_inner
    response = self._client.send(request)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpx/_client.py", line 902, in send
    response = self._send_handling_auth(
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpx/_client.py", line 930, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpx/_client.py", line 967, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpx/_client.py", line 1003, in _send_single_request
    response = transport.handle_request(request)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpx/_transports/default.py", line 218, in handle_request
    resp = self._pool.handle_request(req)
  File "/opt/homebrew/Cellar/[email protected]/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ReadTimeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/[email protected]/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/homebrew/Cellar/[email protected]/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/zoestatman-weil/code/earth-index-ml/indexing/qdrant/tiles/create_collections_for_tiles.py", line 65, in <module>
    main(host=host,
  File "/Users/zoestatman-weil/code/earth-index-ml/indexing/qdrant/tiles/create_collections_for_tiles.py", line 25, in main
    client.recreate_collection(collection_name=qdrant_collection,
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/qdrant_client/qdrant_client.py", line 1191, in recreate_collection
    self.http.collections_api.create_collection(
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/qdrant_client/http/api/collections_api.py", line 618, in create_collection
    return self._build_for_create_collection(
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/qdrant_client/http/api/collections_api.py", line 193, in _build_for_create_collection
    return self.api_client.request(
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 68, in request
    return self.send(request, type_)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 85, in send
    response = self.middleware(request, self.send_inner)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 188, in __call__
    return call_next(request)
  File "/Users/zoestatman-weil/code/earth-index-ml/venv/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 97, in send_inner
    raise ResponseHandlingException(e)
qdrant_client.http.exceptions.ResponseHandlingException: timed out

Add new points to an existing collection

I created a collection and added some points to it, and I was able to search through that collection. But when I try to add more points to the existing collection, it does not work.

My code:

qd_client.upload_collection(collection_name='startups', vectors=vectors1, payload=payload1, ids=ids1, batch_size=256)

Now, when I try to add more points:

qd_client.upload_collection(collection_name='startups', vectors=vectors2, payload=payload2, ids=ids2, batch_size=256)

It is not working. Can you please guide here quickly? @generall @monatis

ModuleNotFoundError

from qdrant_client import QdrantClient

ModuleNotFoundError: No module named 'qdrant_client'

Installed via pip3 in a Python 3.7.3 venv.
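A common cause of this error is that `pip3` installed the package into a different interpreter or virtualenv than the one running the script. A quick standard-library check (no assumptions about the qdrant client itself):

```python
import sys

# Print the interpreter actually running this script; if `pip3` belongs to a
# different interpreter or venv, qdrant-client gets installed somewhere else.
print(sys.executable)
```

Installing with `python -m pip install qdrant-client`, using the same interpreter that runs the script, avoids the mismatch.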

Make data upload parallel

Currently, the bottleneck of the data initialization process is on the client side.
Python can't serialize index requests as fast as Qdrant can process them.
The first thing to try here is to utilize multiple cores in the client.

Data validation and `upload_collection` method

I think the upload_collection method should validate data before uploading it.

For example, it will silently upload 1 million records with vector_size=256 to a collection where vector_size=512.
As a result, you get no errors, and no data is added to the collection.

Comparing the vector size of a collection to the input array size is very cheap.

Cannot filter using has_id, HasIdCondition model is not serialized correctly

According to https://qdrant.github.io/qdrant/redoc/index.html#operation/search_points, the filter field should be one of FieldCondition or HasIdCondition or Filter (recursively).

I'm unable to use HasIdCondition (other filters work fine). HasIdCondition does not seem to serialize correctly.

Example code:

from qdrant_openapi_client.models.models import (
    FieldCondition,
    Filter,
    HasIdCondition,
    Match,
)

Filter(
    must=[
        HasIdCondition(has_id=[42, 43]),
        FieldCondition(key="field_name", match=Match(keyword="field_value_42")),
    ]
).json()

Result:

{
  "must": [
    {
      "must": null,
      "must_not": null,
      "should": null
    },
    {
      "geo_bounding_box": null,
      "geo_radius": null,
      "key": "field_name",
      "match": {
        "integer": null,
        "keyword": "field_value_42"
      },
      "range": null
    }
  ],
  "must_not": null,
  "should": null
}

Note that the has_id is completely missing.

I would expect an output like this:

{
  "must": [
    {
      "has_id": [
        42,
        43
      ]
    },
    {
      "geo_bounding_box": null,
      "geo_radius": null,
      "key": "field_name",
      "match": {
        "integer": null,
        "keyword": "field_value_42"
      },
      "range": null
    }
  ],
  "must_not": null,
  "should": null
}

I use qdrant-client==0.3.11 and Python 3.9. I tried upgrading Pydantic, but the output was the same.

validation error for SearchRequest vector -> 0

I followed the neural search tutorial given on the website. I used CLIP by OpenAI for text embeddings and uploaded them to Qdrant via

self.qdrant_client.upload_collection(
           collection_name=self.collection_name,
           vectors=self.vectors,
           payload=self.payload,
           ids=ids,  # Vector ids will be assigned automatically if None
           batch_size=batch_size,
       )

vectors are my embeddings and their shape is (1, 768). Now, when I want to make a search API following the tutorial, this is my code.

def search(self, text: str):
        self.model.descriptions = text
        # Convert text query into vector
        self.model.get_text_tokens() # tokenized the text
        vector = self.model.get_text_embeddings().detach().cpu().numpy() # torch tensor to numpy

        assert type(vector) == np.ndarray, "Embeddings should be in numpy"

        assert vector.shape == (1, 768), "wrong shape"

        # Use `vector` for search for closest vectors in the collection
        search_result = self.qdrant_client.search(
            collection_name=self.collection_name,
            query_vector=vector,
            query_filter=None,  # We don't want any filters for now
            top=5,  # 5 the most closest results is enough
        )

        # `search_result` contains found vector ids with similarity scores along with the stored payload
        # In this function we are interested in payload only
        payloads = [payload for point, payload in search_result]
        return payloads

Now If I run this function, or call it via FastAPI, the code for FastAPI is

from fastapi import FastAPI

# That is the file where NeuralSearcher is stored
from search import NeuralSearcher

app = FastAPI()

collection_name = "my_collection"

searcher = NeuralSearcher(collection_name=collection_name)


@app.get("/api/search")
def search(text: str):
    res = searcher.search(text=text)
    return {"result": res}


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)

Now If I run the search function, or call it via FastAPI, I get the following error.

INFO:     127.0.0.1:34974 - "GET /api/search?text=messi HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/uvicorn/protocols/http/h11_impl.py", line 373, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
    return await self.app(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/fastapi/applications.py", line 208, in __call__
    await super().__call__(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/routing.py", line 580, in __call__
    await route.handle(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/routing.py", line 241, in handle
    await self.app(scope, receive, send)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/routing.py", line 52, in app
    response = await func(request)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/fastapi/routing.py", line 227, in app
    dependant=dependant, values=values, is_coroutine=is_coroutine
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/fastapi/routing.py", line 161, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/starlette/concurrency.py", line 40, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/ahmad/Desktop/WortelAI/ElasticSearch/scripts/VC_FINAL/QdrantServer/api.py", line 15, in search
    res = searcher.search(text=text)
  File "/home/ahmad/Desktop/WortelAI/ElasticSearch/scripts/VC_FINAL/QdrantServer/search.py", line 31, in search
    top=5,  # 5 the most closest results is enough
  File "/home/ahmad/anaconda3/envs/new_youtube/lib/python3.7/site-packages/qdrant_client/qdrant_client.py", line 224, in search
    params=search_params
  File "pydantic/main.py", line 406, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for SearchRequest
vector -> 0
  value is not a valid float (type=type_error.float)

I have no idea why this error occurs or what it means. Please help.
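
The validation error (`vector -> 0: value is not a valid float`) suggests the client received a 2-D array where `SearchRequest` expects a flat sequence of floats: element 0 of the query vector is itself an array, not a float. A sketch of the conversion, using numpy only:

```python
import numpy as np

vector = np.random.rand(1, 768)  # shape coming out of the model, as in the report
assert vector.shape == (1, 768)

# query_vector must be a flat sequence of floats, so drop the batch
# dimension and convert the numpy array to a plain Python list
query_vector = vector[0].tolist()  # equivalently: vector.flatten().tolist()

assert isinstance(query_vector, list)
assert len(query_vector) == 768
assert all(isinstance(x, float) for x in query_vector)
```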

How to load 100 million records into one collection

Versions

qdrant_client version: 1.0.0
python: 3.9

Summary

Hello, I got a timeout error while loading data into the collection. I'm going to load about 100 million records per collection. I want to know which qdrant-client function should be used to load 100 million records. Also, can 100 million records be loaded into one collection? And how much minimum storage capacity is needed to load 100 million records into Qdrant?

recommend method

Sorry if I am missing it, but is there a recommend method implemented in the library?

client.upload_collection() API change suggestion

One of the latest and most useful changes to this method was the possibility to use an Iterable for the vectors argument.

But there is a scenario where it is not possible to take full advantage of this. For example, suppose our vectors and payload are stored in mongodb. We can create a generator based on a mongodb cursor to feed the vectors, but as the API is designed today, we have to use another generator (cursor) for the payload, and yet another one (cursor) if we want a custom id.

So we would be increasing mongodb server roundtrips 3x. If we could call upload_collection() with just one parameter instead of three (vectors, payload and ids), for example called 'data', one generator (cursor) would be enough.

def data_generator():
    for doc in mongo_cursor:
        yield {
            'vector': doc['vector'],
            'payload': {
                'age': doc['age'],
                'name': doc['name'],
            },
            'id': doc['id']
        }


client.upload_collection(
    collection_name=COLLECTION_NAME,
    data=data_generator(),
    parallel=4
)

Generate better names for anyOf definitions

Right now, auto-generated OpenAPI structures look like this:

StorageOperations = Union[
    StorageOperationsAnyOf,
    StorageOperationsAnyOf1,
    StorageOperationsAnyOf2,
    StorageOperationsAnyOf3,
]

or

    def get_collections(
        self,
    ) -> m.InlineResponse200:

Better names should be used.

Timeout error, mounted volume

System
Windows 11

The problem
Once I mount the volume to qdrant/storage, from time to time I get a timeout error from the recreate_collection function. The error occurs randomly, independently of whether I set the timeout parameter to None or 30.

The error occurs from time to time, but not always. Interestingly, it occurs much faster than after the 30 seconds I pass as a parameter. Furthermore, the collection is initialized, and I can add new entries to it, even if I got a timeout error while creating it (in a previous run of the script).

Code

from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams

client = QdrantClient(host="localhost", port=6333)

name = "my_collection"

client.recreate_collection(
    collection_name=name,
    vectors_config=VectorParams(size=3, distance=Distance.COSINE),
    timeout=None
)

The docker-compose file:
version: "3.9"
services:
  qdrant:
    image: qdrant/qdrant:v0.11.7
    ports:
      - "6333:6333"
      - "6334:6334"
    restart: always
    volumes:
      - ./qdrant_storage:/qdrant/storage

How to disable and re-enable indexing for a collection?

I have a lot of vectors (around 5 million) that I need to upload and since I will be uploading them all in one go, I was thinking disabling the index would make the process faster (according to the docs).

However, I am a bit confused as to how to use the python client to disable the index. In the official docs, the following example is given for Python:

client.update_collection(
    collection_name="{collection_name}",
    optimizer_config=models.OptimizersConfigDiff(
        max_segment_size=10000
    )
)

Would the value of max_segment_size need to be set to 0 in order to disable the index? Also, weirdly, the HTTP version of the same example sets indexing_threshold to 10000 instead of max_segment_size. So what's the correct way to disable the index?

Not able to install qdrant client

Command : pip install qdrant-client

ERROR: Could not find a version that satisfies the requirement qdrant-client (from versions: none)
ERROR: No matching distribution found for qdrant-client

can't parallelize upload_records() on airflow dag

Hi, with qdrant_client v0.9.5 I am able to use the upload_records() method without problems with the parallel=4 argument. It even works inside Jupyter, but in Airflow, when I run a DAG with a PythonVirtualenvOperator task using client.upload_records(), setting parallel=4 produces the error below. The only way to run it inside Airflow is to set parallel=1, but that of course takes a lot of time.

[2022-08-27 01:23:52,077] {process_utils.py:143} INFO - Traceback (most recent call last):
[2022-08-27 01:23:52,078] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/forkserver.py", line 280, in main
[2022-08-27 01:23:52,078] {process_utils.py:143} INFO -     code = _serve_one(child_r, fds,
[2022-08-27 01:23:52,078] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/forkserver.py", line 319, in _serve_one
[2022-08-27 01:23:52,078] {process_utils.py:143} INFO -     code = spawn._main(child_r, parent_sentinel)
[2022-08-27 01:23:52,078] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
[2022-08-27 01:23:52,078] {process_utils.py:143} INFO -     prepare(preparation_data)
[2022-08-27 01:23:52,079] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
[2022-08-27 01:23:52,079] {process_utils.py:143} INFO -     _fixup_main_from_path(data['init_main_from_path'])
[2022-08-27 01:23:52,079] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
[2022-08-27 01:23:52,079] {process_utils.py:143} INFO -     main_content = runpy.run_path(main_path,
[2022-08-27 01:23:52,079] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/runpy.py", line 265, in run_path
[2022-08-27 01:23:52,079] {process_utils.py:143} INFO -     return _run_module_code(code, init_globals, run_name,
[2022-08-27 01:23:52,079] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/runpy.py", line 97, in _run_module_code
[2022-08-27 01:23:52,079] {process_utils.py:143} INFO -     _run_code(code, mod_globals, init_globals,
[2022-08-27 01:23:52,080] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/runpy.py", line 87, in _run_code
[2022-08-27 01:23:52,080] {process_utils.py:143} INFO -     exec(code, run_globals)
[2022-08-27 01:23:52,080] {process_utils.py:143} INFO -   File "/tmp/venvvgl4j5dw/script.py", line 234, in <module>
[2022-08-27 01:23:52,080] {process_utils.py:143} INFO -     res = func_upload_records(*arg_dict["args"], **arg_dict["kwargs"])
[2022-08-27 01:23:52,080] {process_utils.py:143} INFO -   File "/tmp/venvvgl4j5dw/script.py", line 222, in func_upload_records
[2022-08-27 01:23:52,080] {process_utils.py:143} INFO -     qdrant_conn.upload_records(
[2022-08-27 01:23:52,080] {process_utils.py:143} INFO -   File "/tmp/venvvgl4j5dw/lib/python3.8/site-packages/qdrant_client/qdrant_client.py", line 1133, in upload_records
[2022-08-27 01:23:52,080] {process_utils.py:143} INFO -     self._upload_collection(batches_iterator, collection_name, parallel)
[2022-08-27 01:23:52,081] {process_utils.py:143} INFO -   File "/tmp/venvvgl4j5dw/lib/python3.8/site-packages/qdrant_client/qdrant_client.py", line 1110, in _upload_collection
[2022-08-27 01:23:52,081] {process_utils.py:143} INFO -     for _ in pool.unordered_map(batches_iterator, **updater_kwargs):
[2022-08-27 01:23:52,081] {process_utils.py:143} INFO -   File "/tmp/venvvgl4j5dw/lib/python3.8/site-packages/qdrant_client/parallel_processor.py", line 117, in unordered_map
[2022-08-27 01:23:52,081] {process_utils.py:143} INFO -     self.start(**kwargs)
[2022-08-27 01:23:52,081] {process_utils.py:143} INFO -   File "/tmp/venvvgl4j5dw/lib/python3.8/site-packages/qdrant_client/parallel_processor.py", line 112, in start
[2022-08-27 01:23:52,081] {process_utils.py:143} INFO -     process.start()
[2022-08-27 01:23:52,081] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/process.py", line 121, in start
[2022-08-27 01:23:52,081] {process_utils.py:143} INFO -     self._popen = self._Popen(self)
[2022-08-27 01:23:52,082] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/context.py", line 291, in _Popen
[2022-08-27 01:23:52,082] {process_utils.py:143} INFO -     return Popen(process_obj)
[2022-08-27 01:23:52,082] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/popen_forkserver.py", line 35, in __init__
[2022-08-27 01:23:52,082] {process_utils.py:143} INFO -     super().__init__(process_obj)
[2022-08-27 01:23:52,082] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
[2022-08-27 01:23:52,082] {process_utils.py:143} INFO -     self._launch(process_obj)
[2022-08-27 01:23:52,082] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/popen_forkserver.py", line 42, in _launch
[2022-08-27 01:23:52,082] {process_utils.py:143} INFO -     prep_data = spawn.get_preparation_data(process_obj._name)
[2022-08-27 01:23:52,083] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data
[2022-08-27 01:23:52,083] {process_utils.py:143} INFO -     _check_not_importing_main()
[2022-08-27 01:23:52,083] {process_utils.py:143} INFO -   File "/usr/local/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main
[2022-08-27 01:23:52,083] {process_utils.py:143} INFO -     raise RuntimeError('''
[2022-08-27 01:23:52,083] {process_utils.py:143} INFO - RuntimeError:
[2022-08-27 01:23:52,083] {process_utils.py:143} INFO -         An attempt has been made to start a new process before the
[2022-08-27 01:23:52,083] {process_utils.py:143} INFO -         current process has finished its bootstrapping phase.
[2022-08-27 01:23:52,083] {process_utils.py:143} INFO - 
[2022-08-27 01:23:52,083] {process_utils.py:143} INFO -         This probably means that you are not using fork to start your
[2022-08-27 01:23:52,084] {process_utils.py:143} INFO -         child processes and you have forgotten to use the proper idiom
[2022-08-27 01:23:52,084] {process_utils.py:143} INFO -         in the main module:
[2022-08-27 01:23:52,084] {process_utils.py:143} INFO - 
[2022-08-27 01:23:52,084] {process_utils.py:143} INFO -             if __name__ == '__main__':
[2022-08-27 01:23:52,084] {process_utils.py:143} INFO -                 freeze_support()
[2022-08-27 01:23:52,084] {process_utils.py:143} INFO -                 ...
[2022-08-27 01:23:52,084] {process_utils.py:143} INFO - 
[2022-08-27 01:23:52,084] {process_utils.py:143} INFO -         The "freeze_support()" line can be omitted if the program
[2022-08-27 01:23:52,084] {process_utils.py:143} INFO -         is not going to be frozen to produce an executable.


Needs timeout setting

I see some API calls have a timeout argument... but not all. This probably needs a more general pattern of timeout handling.

I'm getting timeouts on "create_snapshot". Listing the snapshots after handling the exception shows the request was successful.

related to #79

No module named 'numpy.typing'

I experience this issue when I call

from qdrant_client import QdrantClient

numpy version = 1.19.5
qdrant-client = master branch, rev:c57cebf
python version = 3.7.4

ResponseHandlingException: timed out error while creating a collection using the python client.

I am getting this error while trying to create a collection.
I used the same sample code from the Readme.

from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)

client.recreate_collection(
    collection_name="my_collection",
    vector_size=100
)
ReadTimeout                               Traceback (most recent call last)

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpx/_exceptions.py in map_exceptions(mapping, **kwargs)
    325     try:
--> 326         yield
    327     except Exception as exc:

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpx/_client.py in _send_single_request(self, request, timeout)
    865                 stream=request.stream,  # type: ignore
--> 866                 ext={"timeout": timeout.as_dict()},
    867             )

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpcore/_sync/connection_pool.py in request(self, method, url, headers, stream, ext)
    218                 response = connection.request(
--> 219                     method, url, headers=headers, stream=stream, ext=ext
    220                 )

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpcore/_sync/connection.py in request(self, method, url, headers, stream, ext)
    105         )
--> 106         return self.connection.request(method, url, headers, stream, ext)
    107 

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpcore/_sync/http11.py in request(self, method, url, headers, stream, ext)
     71             headers,
---> 72         ) = self._receive_response(timeout)
     73         response_stream = IteratorByteStream(

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpcore/_sync/http11.py in _receive_response(self, timeout)
    132         while True:
--> 133             event = self._receive_event(timeout)
    134             if isinstance(event, h11.Response):

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpcore/_sync/http11.py in _receive_event(self, timeout)
    171             if event is h11.NEED_DATA:
--> 172                 data = self.socket.read(self.READ_NUM_BYTES, timeout)
    173                 self.h11_state.receive_data(data)

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpcore/_backends/sync.py in read(self, n, timeout)
     61                 self.sock.settimeout(read_timeout)
---> 62                 return self.sock.recv(n)
     63 

/usr/lib/python3.7/contextlib.py in __exit__(self, type, value, traceback)
    129             try:
--> 130                 self.gen.throw(type, value, traceback)
    131             except StopIteration as exc:

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpcore/_exceptions.py in map_exceptions(map)
     11             if isinstance(exc, from_exc):
---> 12                 raise to_exc(exc) from None
     13         raise

ReadTimeout: timed out


The above exception was the direct cause of the following exception:

ReadTimeout                               Traceback (most recent call last)

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in send_inner(self, request)
     85         try:
---> 86             response = self._client.send(request)
     87         except Exception as e:

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpx/_client.py in send(self, request, stream, auth, allow_redirects, timeout)
    771             allow_redirects=allow_redirects,
--> 772             history=[],
    773         )

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpx/_client.py in _send_handling_auth(self, request, auth, timeout, allow_redirects, history)
    808                 allow_redirects=allow_redirects,
--> 809                 history=history,
    810             )

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpx/_client.py in _send_handling_redirects(self, request, timeout, allow_redirects, history)
    836 
--> 837             response = self._send_single_request(request, timeout)
    838             response.history = list(history)

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpx/_client.py in _send_single_request(self, request, timeout)
    865                 stream=request.stream,  # type: ignore
--> 866                 ext={"timeout": timeout.as_dict()},
    867             )

/usr/lib/python3.7/contextlib.py in __exit__(self, type, value, traceback)
    129             try:
--> 130                 self.gen.throw(type, value, traceback)
    131             except StopIteration as exc:

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/httpx/_exceptions.py in map_exceptions(mapping, **kwargs)
    342         message = str(exc)
--> 343         raise mapped_exc(message, **kwargs) from exc  # type: ignore
    344 

ReadTimeout: timed out


During handling of the above exception, another exception occurred:

ResponseHandlingException                 Traceback (most recent call last)

/tmp/ipykernel_35131/2248584403.py in <module>
      2 client.recreate_collection(
      3     collection_name="my_collection",
----> 4     vector_size=384
      5 )
      6 

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/qdrant_client/qdrant_client.py in recreate_collection(self, collection_name, vector_size, distance, hnsw_config, optimizers_config, wal_config)
    269                 hnsw_config=hnsw_config,
    270                 optimizers_config=optimizers_config,
--> 271                 wal_config=wal_config
    272             )
    273         )

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/qdrant_openapi_client/api/collections_api.py in create_collection(self, name, create_collection)
    429         return self._build_for_create_collection(
    430             name=name,
--> 431             create_collection=create_collection,
    432         )
    433 

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/qdrant_openapi_client/api/collections_api.py in _build_for_create_collection(self, name, create_collection)
    168 
    169         return self.api_client.request(
--> 170             type_=m.InlineResponse2001, method="PUT", url="/collections/{name}", path_params=path_params, json=body
    171         )
    172 

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in request(self, type_, method, url, path_params, **kwargs)
     57         url = (self.host or "") + url.format(**path_params)
     58         request = Request(method, url, **kwargs)
---> 59         return self.send(request, type_)
     60 
     61     @overload

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in send(self, request, type_)
     74 
     75     def send(self, request: Request, type_: Type[T]) -> T:
---> 76         response = self.middleware(request, self.send_inner)
     77         if response.status_code in [200, 201]:
     78             try:

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in __call__(self, request, call_next)
    177 class BaseMiddleware:
    178     def __call__(self, request: Request, call_next: Send) -> Response:
--> 179         return call_next(request)
    180 
    181 

/media/dingusagar/Data/python-envs/pytorch-env/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in send_inner(self, request)
     86             response = self._client.send(request)
     87         except Exception as e:
---> 88             raise ResponseHandlingException(e)
     89         return response
     90 

ResponseHandlingException: timed out

qdrant_client.http.exceptions.ResponseHandlingException: [Errno -2] Name or service not known

Hello,

I have a local setup as follows:

database = QdrantClient(host=settings().HOST, port=6333, api_key=settings().QDRANT_API)

where HOST and QDRANT_API are defined in a dotenv file, and HOST points to a Qdrant Cloud instance 'https://.*.aws.cloud.qdrant.io'. This works great when initializing the db locally using FastAPI with:

uvicorn main:app --reload 

As soon as I try deploying on fly.io I get the following error:

2023-02-21T05:19:03Z   [info]httpx.ConnectError: [Errno -2] Name or service not known
2023-02-21T05:19:03Z   [info]During handling of the above exception, another exception occurred:
2023-02-21T05:19:03Z   [info]Traceback (most recent call last):
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 671, in lifespan
2023-02-21T05:19:03Z   [info]    async with self.lifespan_context(app):
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
2023-02-21T05:19:03Z   [info]    await self._router.startup()
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 648, in startup
2023-02-21T05:19:03Z   [info]    await handler()
2023-02-21T05:19:03Z   [info]  File "/code/./app/main.py", line 120, in startup_event
2023-02-21T05:19:03Z   [info]    database.recreate_collection(
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/qdrant_client/qdrant_client.py", line 1609, in recreate_collection
2023-02-21T05:19:03Z   [info]    self.delete_collection(collection_name)
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/qdrant_client/qdrant_client.py", line 1543, in delete_collection
2023-02-21T05:19:03Z   [info]    result: Optional[bool] = self.http.collections_api.delete_collection(
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api/collections_api.py", line 788, in delete_collection
2023-02-21T05:19:03Z   [info]    return self._build_for_delete_collection(
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api/collections_api.py", line 268, in _build_for_delete_collection
2023-02-21T05:19:03Z   [info]    return self.api_client.request(
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 68, in request
2023-02-21T05:19:03Z   [info]    return self.send(request, type_)
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 85, in send
2023-02-21T05:19:03Z   [info]    response = self.middleware(request, self.send_inner)
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 188, in __call__
2023-02-21T05:19:03Z   [info]    return call_next(request)
2023-02-21T05:19:03Z   [info]  File "/usr/local/lib/python3.9/site-packages/qdrant_client/http/api_client.py", line 97, in send_inner
2023-02-21T05:19:03Z   [info]    raise ResponseHandlingException(e)
2023-02-21T05:19:03Z   [info]qdrant_client.http.exceptions.ResponseHandlingException: [Errno -2] Name or service not known
2023-02-21T05:19:03Z   [info]ERROR:    Application startup failed. Exiting.
2023-02-21T05:19:04Z   [info]Starting clean up.

Any help would be greatly appreciated!

Batch gives validation errors for `payloads`

from qdrant_client.http.models import Batch

qdrant_client.upsert(
    collection_name=collection_name,
    points=Batch(
        ids=[1],
        vectors=[dummy_embed(doc) for doc in docs],
        payloads=[{"good": "yes", "hello": "world"}]
    ),
)

See the above, when I ran the above, I get this error:

{\"result\":null,\"status\":{\"error\":\"Json deserialize error: invalid type: map, expected a sequence at line 1 column 7730\"},\"time\":0.0}

Not really sure what it means. I checked the source code; it seems like an array of maps is expected for the payloads field.

Weirdly enough, if I change payloads to a map with only one key-value pair, I get a different error from FastAPI.

payloads=[{"good": "yes"}]

Error:

 validation error for Batch\npayloads\n  value is not a valid dict (type=type_error.dict)

Qdrant throws 'NoneType' object has no attribute 'CLOSED' when program exits

Current Behavior

When the program exits, I see this error:

Exception ignored in: <function QdrantClient.__del__ at 0x7f6d7848e280>
Traceback (most recent call last):
  File "/anaconda3/lib/python3.8/site-packages/qdrant_client/qdrant_client.py", line 74, in __del__
  File "/anaconda3/lib/python3.8/site-packages/httpx/_client.py", line 1256, in close
  File "/anaconda3/lib/python3.8/site-packages/httpx/_transports/default.py", line 230, in close
  File "/anaconda3/lib/python3.8/site-packages/httpcore/_sync/connection_pool.py", line 307, in close
  File "/anaconda3/lib/python3.8/site-packages/httpcore/_sync/connection.py", line 159, in close
  File "/anaconda3/lib/python3.8/site-packages/httpcore/_sync/http11.py", line 216, in close
AttributeError: 'NoneType' object has no attribute 'CLOSED'

Steps to Reproduce

from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.get_collections())

Expected Behavior

Either the program should exit without any exceptions or I should be able to catch it.

Possible Solution

Context (Environment)

  • I am using qdrant-client==0.9.0 on python 3.8
  • The Qdrant server is run via Docker Compose:
version: "3.8"

services:
  qdrant:
    image: qdrant/qdrant:v0.9.1
    container_name: qdrant_embedding_search
    ports:
      - 6333:6333
      - 6334:6334
    env_file:
      - ./.env
    volumes:
      - ./data:/qdrant/storage
      - ./config/qdrant.yaml:/qdrant/config/production.yaml

Detailed Description

It seems the problem is with the __del__ implementation: when the program exits, it tries to close the gRPC and OpenAPI connections but ends up throwing an exception.

Possible Implementation

A workaround I found is to close the connections manually before exiting the program:

from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.get_collections())

# Close connections manually before the program exits
if client._grpc_channel is not None:
    client._grpc_channel.close()
client.openapi_client.client._client.close()

Please wrap the closing logic in try blocks and handle the exception, or provide a .close() method on QdrantClient so the user can close it manually.
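A hedged sketch of what such a defensive .close() could look like (the class and attribute names here are hypothetical, not the real client's internals):

```python
class ClosableClient:
    """Hypothetical sketch: close underlying connections defensively."""

    def __init__(self, grpc_channel=None, http_client=None):
        self._grpc_channel = grpc_channel
        self._http_client = http_client

    def close(self):
        # Swallow errors: at interpreter shutdown, module globals may
        # already be torn down, which is what triggers the reported error.
        for conn in (self._grpc_channel, self._http_client):
            try:
                if conn is not None:
                    conn.close()
            except Exception:
                pass

    def __del__(self):
        self.close()
```

With an explicit close() available, users could also wire it into a context manager or an atexit hook instead of relying on __del__.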

`.upload_collection()` does not work with named vectors

For the initial upload, QdrantClient.upload_collection() should also work with named vectors, just as .upload_records() does. A union over the current possible types and their dictionary counterparts should be implemented.
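For context, a named-vectors upload passes one list of vectors per vector name rather than a single flat list, roughly this shape (values and names are illustrative):

```python
# Illustrative shape only: one list of vectors per vector name,
# with one row per record in each.
named_vectors = {
    "image": [[0.1, 0.2], [0.3, 0.4]],
    "text": [[0.9, 0.8], [0.7, 0.6]],
}

# Every named field must cover the same number of records.
lengths = {len(rows) for rows in named_vectors.values()}
assert len(lengths) == 1
```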

[SUGGESTION] Apply grpc native options through QdrantClient

Hi team, I've recently been trying the Qdrant vector DB and using qdrant_client for it.

I'd like to suggest an improvement to QdrantClient for gRPC-preferred mode: how about adding a way to apply native gRPC options through QdrantClient (or an environment variable)? Currently, the gRPC connection seems to be created from just the host and port:
https://github.com/qdrant/qdrant_client/blob/f829ed9745a7c600c72df6239e6e3b0ed07e8356/qdrant_client/connection.py#L27-L29

Native gRPC options look like this:

MAX_MESSAGE_LENGTH = 100 * 1024 * 1024  # e.g. 100 MB

channel = grpc.insecure_channel(
    'localhost:50051',
    options=[
        ('grpc.max_send_message_length', MAX_MESSAGE_LENGTH),
        ('grpc.max_receive_message_length', MAX_MESSAGE_LENGTH),
    ],
)
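One possible shape for such a feature is to merge user-supplied options with the client's defaults; a pure-Python sketch (the function name and default values are assumptions, not the actual client API):

```python
# Hypothetical defaults; the real client may use different values.
DEFAULT_GRPC_OPTIONS = [("grpc.max_send_message_length", 4 * 1024 * 1024)]

def build_channel_options(user_options=None):
    """Return default channel options, overridden by user-supplied ones."""
    merged = dict(DEFAULT_GRPC_OPTIONS)
    merged.update(dict(user_options or []))
    return sorted(merged.items())
```

The merged list could then be passed straight through to grpc.insecure_channel(..., options=...).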

Can not import RecommendRequest

When I import

from qdrant_client.qdrant_client import SearchRequest

and Ctrl+click on it to see the source, it takes me to the models.py file. If we look at the imports in qdrant_client.py, RecommendRequest from models.py is missing. These are all the imports:

from qdrant_openapi_client.models.models import PointOperationsAnyOf, PointInsertOperationsAnyOfBatch, PayloadInterface, \
    GeoPoint, Distance, PointInsertOperationsAnyOf, PointRequest, SearchRequest, Filter, SearchParams, \
    StorageOperationsAnyOf, \
    StorageOperationsAnyOfCreateCollection, FieldIndexOperationsAnyOf, \
    FieldIndexOperationsAnyOf1, PayloadInterfaceStrictAnyOf, PayloadInterfaceStrictAnyOf1, PayloadInterfaceStrictAnyOf2, \
    PayloadInterfaceStrictAnyOf3, HnswConfigDiff, OptimizersConfigDiff, WalConfigDiff, StorageOperationsAnyOf2

I want to use the RecommendRequest as it has the negative vector option which I am looking for.

Have a batch upload to collection method

Feature request here.

I currently have a use case where I'm batching documents together for encoding so that I can saturate the GPU and speed up the encoding step. But then I'm having to loop over all the documents and upload them individually, one by one. It would be nice to just pass a list of documents into a method and have them all be indexed at once, similar to how recreate_collection() handles the initial creation of a collection.
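Until such a method exists, the batching can be sketched client-side; a minimal chunking helper (hypothetical, not part of the client):

```python
def chunked(items, batch_size):
    """Yield successive fixed-size batches from a list."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Usage idea: for batch in chunked(documents, 64): upload(batch)
```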

Cannot connect to Qdrant cloud instance

Overview

I'm unable to connect to a Qdrant instance, even though I'm following the instructions provided in the documentation.

Steps to reproduce

  1. Launch qdrant instance
  2. Create environment variables with host and API key
  3. Run the following code:
import os

from qdrant_client import QdrantClient
from qdrant_client.http import models

host_name = os.getenv("QDRANT_HOST")
api_key = os.getenv("QDRANT_API_KEY")

client = QdrantClient(
    host=host_name, api_key=api_key
)

client.recreate_collection(
    collection_name="test-collection",
    vectors_config=models.VectorParams(size=100, distance=models.Distance.COSINE),
)

Observed Behavior

The code fails to run and raises the following exception: ResponseHandlingException: 502 Bad Gateway. It also fails if I replace https with http or set prefer_grpc=True.

Expected Behavior

Connection is successful and I'm able to create a collection.

Notes

I installed qdrant-client using Poetry, and I'm using a Mac with an M1 processor and OS 13.2.

How to find bottom values

In search, we can do

qdrant_client.search(
    collection_name=self.collection_name,
    query_vector=vector[0],
    query_filter=None,
    top=5,  # the 5 closest results are enough
)

How can we do

qdrant_client.search(
    collection_name=self.collection_name,
    query_vector=vector[0],
    query_filter=None,  # We don't want any filters for now
    bot=5,  # the 5 farthest results (hypothetical parameter)
)

I want the 5 results that are least similar. I don't want to fetch all the results and take the last 5 from the list. Is there a better way?
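One workaround for dot-product (and, with normalized vectors, cosine) spaces is to search with the negated query vector: the points nearest to -v are exactly the points farthest from v. A pure-Python illustration of the idea:

```python
def farthest_indices(query, candidates, k=5):
    """Rank candidates by similarity to the negated query, so the
    least-similar points to the original query come first."""
    neg = [-x for x in query]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    order = sorted(range(len(candidates)),
                   key=lambda i: dot(neg, candidates[i]),
                   reverse=True)
    return order[:k]
```

In practice this means passing the negated embedding as query_vector to a normal nearest-neighbor search.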

gRPC - Support for search with gRPC in `QdrantClient`

Methods will be refactored individually to support API calls over REST and gRPC clients uniformly. This issue covers refactoring the QdrantClient.search method; please open separate issues if you plan to apply the same refactoring to other methods.

Tasks

  • Define a union over objects from pydantic models and gRPC request objects in the method signature.
  • If type of the object specified is not compatible with the preferred client, handle conversion internally.
  • If it is not possible to get a dict from pydantic model and initialize a gRPC request object from it, implement a utility function for conversion.
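The conversion utility in the last task could be sketched like this (to_grpc_request is a hypothetical name; real gRPC messages would need field-by-field mapping, not just **kwargs):

```python
def to_grpc_request(obj, grpc_cls):
    """Pass gRPC objects through unchanged; otherwise build one from the
    pydantic model's dict. Nested fields would need deeper conversion."""
    if isinstance(obj, grpc_cls):
        return obj
    return grpc_cls(**obj.dict())
```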

Python Client search docs

There are no (obvious) docs for the search-by-ID function in the Python client; it would be worth including in the full example. Happy to add docs if they genuinely don't exist, if someone can point me to the original code; also happy to implement it if the code doesn't exist, if someone can point me in the right direction.

Mandatory parameter "port" for client initialization

Overview

I was trying to set up the client to connect to a host without specifying a port, but the URI is parsed incorrectly.

Description

The URI I was trying to enter looks like this:
https://vector_store.com

I tried to initialize the client using None as the parameter for port:

client = QdrantClient(host="vector_store.com", https=True, port=None)

But it didn't work because the REST URI is parsed incorrectly:

client.rest_uri == "https://vector_store.com:None"

Proposed solution

Modify the URI generation logic to handle an empty port.

# Original
self.rest_uri = f"http{'s' if self._https else ''}://{host}:{port}{self._prefix if self._prefix is not None else ''}"

# Proposed modification
self.rest_uri = f"http{'s' if self._https else ''}://{host}{f':{port}' if port is not None else ''}{self._prefix if self._prefix is not None else ''}"
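A standalone sketch of the port-optional URI construction (build_rest_uri is a hypothetical helper, not the client's code):

```python
def build_rest_uri(host, https=False, port=None, prefix=None):
    """Build a REST base URI, omitting the port segment when port is None."""
    scheme = "https" if https else "http"
    port_part = f":{port}" if port is not None else ""
    return f"{scheme}://{host}{port_part}{prefix or ''}"
```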

Bulk Similarity Search

I'd like a way to do bulk similarity search so I can more efficiently query to see if a set of embeddings either already exists or already has a similar match greater than some threshold. I think if the search API supported querying for multiple embeddings it would allow me to build any other needed logic on top easily. It could just return multiple results back in the same order as the input query embeddings. Is something like this already doable? Thanks.
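The dedup logic on top of such a batch search could look like this sketch (pure Python; it assumes each query's results arrive as (id, score) pairs, in the same order as the input queries):

```python
def unmatched_queries(batch_results, threshold):
    """Return indices of queries whose best hit scores below the threshold,
    i.e. queries with no sufficiently similar existing embedding."""
    unmatched = []
    for i, hits in enumerate(batch_results):
        best = max((score for _id, score in hits), default=float("-inf"))
        if best < threshold:
            unmatched.append(i)
    return unmatched
```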

Safer collection creation

I have a use case where I dynamically need to create collections rather often, and then insert into existing collections.

Perhaps I missed it, but it seems the client lacks a convenience method to create a collection only if it does not exist. If I recreate, I lose previously uploaded vectors. My options are:

  1. call client.get_collections and iterate over to check if the name is there
  2. call client.get_collection(name) and check if that fails

Both seem a bit risky; e.g., they may fail for reasons that are independent of whether the collection already exists. So could there be a safer way to create a collection only if it does not already exist?
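As a stopgap, option 1 can be wrapped into a helper; a sketch (ensure_collection is hypothetical, and the check-then-create is still racy under concurrent creators):

```python
def ensure_collection(client, name, vectors_config):
    """Create the collection only if it is not already listed.
    Unlike recreate_collection, this never drops existing data."""
    existing = {c.name for c in client.get_collections().collections}
    if name not in existing:
        client.create_collection(collection_name=name,
                                 vectors_config=vectors_config)
```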

Json deserialize error: when used client.search()

def get_filter2(img_filerns):
    filter_rn = lambda filern: FieldCondition(
        key='file_rn',  # Condition based on values of the `file_rn` field.
        match=models.MatchValue(value=filern)
    )
    type_filter = Filter(should=[filter_rn(filern) for filern in img_filerns])
    return type_filter

results = client.search(
    collection_name="search_test",
    query_vector=vectors_q.flatten(),
    query_filter=filters,
    limit=2,
)
print(results)

Traceback (most recent call last):
  File ".\qdrant_search.py", line 74, in <module>
    search_vectors()
  File ".\qdrant_search.py", line 69, in search_vectors
    limit=2
  File "C:\Users\qdrant_client\qdrant_client.py", line 251, in search
    score_threshold=score_threshold
  File "C:\Users\qdrant_client\http\api\points_api.py", line 654, in search_points
    search_request=search_request,
  File "C:\Users\qdrant_client\http\api\points_api.py", line 344, in _build_for_search_points
    json=body,
  File "C:\Users\qdrant_client\http\api_client.py", line 62, in request
    return self.send(request, type_)
  File "C:\Users\qdrant_client\http\api_client.py", line 85, in send
    raise UnexpectedResponse.for_response(response)
qdrant_client.http.exceptions.UnexpectedResponse: Unexpected Response: 422 (Unprocessable Entity)
Raw response content:
b'{"result":null,"status":{"error":"Json deserialize error: missing field `top` at line 1 column 10879"},"time":0.0}'

qdrant-client==0.8.4
Search used to work with qdrant-client==0.8.0 using the older filter condition given below:

def get_filter(img_filerns):
    filter_rn = lambda filern: {
        "key": "file_rn",
        "match": {  # This condition checks if the payload field has the requested value
            "keyword": filern
        }
    }

    # Define a filter for cities
    type_filter = Filter(**{
        "should": [filter_rn(filern) for filern in img_filerns]
    })
    return type_filter

Please let me know if any more details are required.

Thanks

ResponseHandlingException : timed out while inserting data into collection

The script that I have written:

import os

import tensorflow_hub as hub

model = hub.load(os.getenv("USE_MODEL_PATH"))

class NeuralSearchEngine:

    def __init__(self, collection_name):
        self.collection_name = collection_name

    def load_training_data(self):
        q1_data = QuestionsEvaluationCriteria.getBycaseIdAndQuestionNumber(caseId="desert_city", question_number=1)
        return q1_data[0]['variationsList']

    def create_vectors(self, payload):
        vectors = []
        for key in payload:
            variations = payload[key]
            for variation in variations:
                embedding = model([variation])[0]
                vectors.append(embedding.numpy())
        return vectors

    def create_collection(self):
        payload = self.load_training_data()
        vectors = self.create_vectors(payload)
        print(len(vectors))  # vectors is a list, so len() rather than .shape
        qdrant_client.recreate_collection(collection_name=self.collection_name, distance="Cosine", vector_size=512)
        my_collection_info = qdrant_client.http.collections_api.get_collection("my_collection")
        print(my_collection_info.dict())
        qdrant_client.upload_collection(
            collection_name=self.collection_name,
            vectors=vectors,
            payload=payload,
            ids=None,
            batch_size=8,
        )

nse = NeuralSearchEngine(collection_name="desertcity")

nse.create_collection()

Below is the Error with traceback :

ResponseHandlingException: timed out

ReadTimeout Traceback (most recent call last)
~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpx/_exceptions.py in map_exceptions(mapping, **kwargs)
325 try:
--> 326 yield
327 except Exception as exc:

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpx/_client.py in _send_single_request(self, request, timeout)
865 stream=request.stream, # type: ignore
--> 866 ext={"timeout": timeout.as_dict()},
867 )

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpcore/_sync/connection_pool.py in request(self, method, url, headers, stream, ext)
218 response = connection.request(
--> 219 method, url, headers=headers, stream=stream, ext=ext
220 )

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpcore/_sync/connection.py in request(self, method, url, headers, stream, ext)
105 )
--> 106 return self.connection.request(method, url, headers, stream, ext)
107

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpcore/_sync/http11.py in request(self, method, url, headers, stream, ext)
71 headers,
---> 72 ) = self._receive_response(timeout)
73 response_stream = IteratorByteStream(

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpcore/_sync/http11.py in _receive_response(self, timeout)
132 while True:
--> 133 event = self._receive_event(timeout)
134 if isinstance(event, h11.Response):

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpcore/_sync/http11.py in _receive_event(self, timeout)
171 if event is h11.NEED_DATA:
--> 172 data = self.socket.read(self.READ_NUM_BYTES, timeout)
173 self.h11_state.receive_data(data)

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpcore/_backends/sync.py in read(self, n, timeout)
61 self.sock.settimeout(read_timeout)
---> 62 return self.sock.recv(n)
63

~/anaconda3/envs/casesolving/lib/python3.7/contextlib.py in __exit__(self, type, value, traceback)
129 try:
--> 130 self.gen.throw(type, value, traceback)
131 except StopIteration as exc:

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpcore/_exceptions.py in map_exceptions(map)
11 if isinstance(exc, from_exc):
---> 12 raise to_exc(exc) from None
13 raise

ReadTimeout: timed out

The above exception was the direct cause of the following exception:

ReadTimeout Traceback (most recent call last)
~/anaconda3/envs/casesolving/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in send_inner(self, request)
85 try:
---> 86 response = self._client.send(request)
87 except Exception as e:

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpx/_client.py in send(self, request, stream, auth, allow_redirects, timeout)
771 allow_redirects=allow_redirects,
--> 772 history=[],
773 )

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpx/_client.py in _send_handling_auth(self, request, auth, timeout, allow_redirects, history)
808 allow_redirects=allow_redirects,
--> 809 history=history,
810 )

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpx/_client.py in _send_handling_redirects(self, request, timeout, allow_redirects, history)
836
--> 837 response = self._send_single_request(request, timeout)
838 response.history = list(history)

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpx/_client.py in _send_single_request(self, request, timeout)
865 stream=request.stream, # type: ignore
--> 866 ext={"timeout": timeout.as_dict()},
867 )

~/anaconda3/envs/casesolving/lib/python3.7/contextlib.py in __exit__(self, type, value, traceback)
129 try:
--> 130 self.gen.throw(type, value, traceback)
131 except StopIteration as exc:

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/httpx/_exceptions.py in map_exceptions(mapping, **kwargs)
342 message = str(exc)
--> 343 raise mapped_exc(message, **kwargs) from exc # type: ignore
344

ReadTimeout: timed out

During handling of the above exception, another exception occurred:

ResponseHandlingException Traceback (most recent call last)
in
----> 1 nse.create_collection()

in create_collection(self)
24 payload = self.load_training_data()
25 vectors = self.create_vectors(payload)
---> 26 qdrant_client.recreate_collection(collection_name=self.collection_name ,distance = "Cosine", vector_size=512)
27 qdrant_client.upload_collection(
28 collection_name=self.collection_name,

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/qdrant_client/qdrant_client.py in recreate_collection(self, collection_name, vector_size, distance, hnsw_config, optimizers_config, wal_config)
252 hnsw_config=hnsw_config,
253 optimizers_config=optimizers_config,
--> 254 wal_config=wal_config
255 )
256 )

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/qdrant_openapi_client/api/collections_api.py in update_collections(self, storage_operations)
228 ) -> m.InlineResponse2001:
229 return self._build_for_update_collections(
--> 230 storage_operations=storage_operations,
231 )

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/qdrant_openapi_client/api/collections_api.py in _build_for_update_collections(self, storage_operations)
183 body = jsonable_encoder(storage_operations)
184
--> 185 return self.api_client.request(type_=m.InlineResponse2001, method="POST", url="/collections", json=body)
186
187

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in request(self, type_, method, url, path_params, **kwargs)
57 url = (self.host or "") + url.format(**path_params)
58 request = Request(method, url, **kwargs)
---> 59 return self.send(request, type_)
60
61 @overload

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in send(self, request, type_)
74
75 def send(self, request: Request, type_: Type[T]) -> T:
---> 76 response = self.middleware(request, self.send_inner)
77 if response.status_code in [200, 201]:
78 try:

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in __call__(self, request, call_next)
177 class BaseMiddleware:
178 def __call__(self, request: Request, call_next: Send) -> Response:
--> 179 return call_next(request)
180
181

~/anaconda3/envs/casesolving/lib/python3.7/site-packages/qdrant_openapi_client/api_client.py in send_inner(self, request)
86 response = self._client.send(request)
87 except Exception as e:
---> 88 raise ResponseHandlingException(e)
89 return response
90

ResponseHandlingException: timed out

Readme is outdated

hits = client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    query_filter=None,  # Don't use any filters for now, search across all indexed points
    append_payload=True,  # Also return a stored payload for found points
    top=5  # Return 5 closest points
)

top is not valid anymore; it's limit and offset now, I think. Shall I go ahead and raise a PR for this?

Issue with payload type when creating a new point

Hello everyone! I have a problem creating a new point. When I set the payload for a new vector, the values all turn into strings. Because of this, I cannot filter vectors with Range() afterwards.

client.http.points_api.update_points(
    name=COLLECTION_NAME,
    wait=True,
    collection_update_operations=PointOperationsAnyOf(
        upsert_points=PointInsertOperationsAnyOf1(
            points=[
                PointStruct(
                    id=123,
                    payload={'value': random.random()},
                    vector=np.random.rand(DIM).tolist()
                )
            ]
        )
    )
)

Output:
(ScoredPoint(id=123, payload=None, score=24.59514, version=0), {'value': ['0.0510552054339094']})

Note that 'value' comes back as ['0.0510552054339094'], a list containing a string. I tried updating the payload afterwards, but it still ended up as a string.
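Range() filtering needs the payload value to stay numeric, so a quick way to see where the type degrades is a JSON round-trip: a float survives it as a float, while a value stringified client-side comes back as a string (an illustrative check, not the client's actual serialization path):

```python
import json

payload = {"value": 0.0510552054339094}

# A numeric payload value keeps its type through JSON...
assert isinstance(json.loads(json.dumps(payload))["value"], float)

# ...while a stringified value does not, which then breaks Range() filters.
stringified = {"value": str(payload["value"])}
assert isinstance(json.loads(json.dumps(stringified))["value"], str)
```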

Providing a hostname that contains `https://` causes connection error

Overview

I was using the qdrant client to connect to a qdrant instance hosted on qdrant cloud. When I provided QdrantClient with my hostname that contained https:// I noticed that it refused to connect as it couldn't find the host. Removing https:// solves the problem.

Steps to reproduce:

  1. Launch qdrant instance on qdrant cloud:
  2. Set env variables:
export QDRANT_HOST="https://my-hostname.qdrant.io"
export QDRANT_PORT=6333
export QDRANT_API_KEY=my-api-key
  3. Attempt connection to Qdrant and pull information:
import os
from qdrant_client import QdrantClient

qdrant = QdrantClient(
  host=os.environ.get("QDRANT_HOST"),
  port=os.environ.get("QDRANT_PORT"),
  api_key=os.environ.get("QDRANT_API_KEY")
)
collection_info = qdrant.get_collection("my-collection")

Observed Behavior:

Error received:

ResponseHandlingException
[Errno 8] nodename nor servname provided, or not known

Expected Behavior:

I would expect QdrantClient to successfully connect to a host regardless if it was prefixed with https:// or not.

Notes

I was able to figure out that removing https:// resolves the issue by inspecting the code. I see it builds up the scheme based on parameters. It would be nice if it sanitized the URI beforehand, or at least warned the user that they have a scheme in their host.
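The sanitization could be as small as stripping a scheme off the host before building the URI; a sketch (sanitize_host is a hypothetical helper, not the client's code):

```python
from urllib.parse import urlparse

def sanitize_host(host):
    """Strip any scheme from a host string and report whether it was https."""
    if "://" in host:
        parsed = urlparse(host)
        return parsed.hostname, parsed.scheme == "https"
    return host, False
```

The returned flag could then drive the client's https parameter instead of silently failing DNS resolution on a host like "https://my-hostname.qdrant.io".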

Unexpected Response: 403 (Forbidden) when connecting to an existing instance

Overview

I keep getting an Unexpected Response: 403 (Forbidden) error when trying to connect to an existing Qdrant cloud instance. I haven't made any changes to the code, and generating new keys doesn't fix the issue.

Steps to reproduce

  1. Create environment variables with host and API key
  2. Run the following code:
import os

from qdrant_client import QdrantClient
from qdrant_client.http import models

host_name = os.getenv("QDRANT_HOST")
api_key = os.getenv("QDRANT_API_KEY")

client = QdrantClient(
    host=host_name, api_key=api_key
)

client.recreate_collection(
    collection_name="test-collection",
    vectors_config=models.VectorParams(size=100, distance=models.Distance.COSINE),
)

Observed Behavior

When I try to create a collection or do any operation with the cluster I get the following error:

Raw response content:
b'{"error":"key doesn\'t exist"}'

Expected Behavior

Connection to the instance is successful.
