
Comments (3)

generall commented on September 26, 2024

Latency of the endpoint increases as concurrent users increase.

This is kinda expected. You can't scale indefinitely by just adding parallel calls.

But also considering that

memory and utilization of pods is <1 vCPU during testing.

there might be two possibilities, depending on the configuration of your collection: either the bottleneck is in the disk, or the bottleneck is on the client side.

Could you please try to do the same with gRPC?
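For reference, switching to gRPC from the Python qdrant-client could look roughly like the sketch below; the host, ports, collection name and vector size are placeholders, not values from this issue.

```python
# Hypothetical sketch: issue the same search over gRPC instead of REST.
# Host, ports, collection name and vector size are placeholders.
from qdrant_client import QdrantClient

client = QdrantClient(
    host="qdrant.example.internal",  # placeholder host
    port=6333,         # REST port
    grpc_port=6334,    # gRPC port exposed by Qdrant
    prefer_grpc=True,  # route requests over gRPC where supported
)

hits = client.search(
    collection_name="my_collection",  # placeholder collection
    query_vector=[0.0] * 768,         # placeholder query vector
    limit=10,
)
print(hits)
```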


amey-wynk commented on September 26, 2024

Thanks for the response @generall

I know it's not possible to scale indefinitely using parallel calls, as you will hit a bottleneck at some point. However, that ceiling should be much higher than what my tests reach, and latency should not grow linearly for such a small number of requests (e.g. the /sleep endpoint hits 110ms instead of 100ms). Also, the request I'm sending Qdrant is the most basic kind of request (a batch recommend request would be heavier on the CPU side). Even scaling that basic request to 2/3/5 parallel calls, the latency increases almost linearly.
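As a rough illustration of the kind of test being described, a concurrency benchmark with the Python client might look like the sketch below; the host, collection, vector size and request counts are illustrative, not the actual benchmark code used here.

```python
# Illustrative sketch of a "N parallel callers" latency test (not the real benchmark).
import time
from concurrent.futures import ThreadPoolExecutor

from qdrant_client import QdrantClient

client = QdrantClient(host="qdrant.example.internal", port=6333)  # placeholder host


def one_request(_):
    """Send one basic search and return its latency in milliseconds."""
    t0 = time.perf_counter()
    client.search(
        collection_name="my_collection",  # placeholder collection
        query_vector=[0.0] * 768,         # placeholder query vector
        limit=10,
    )
    return (time.perf_counter() - t0) * 1000


for workers in (1, 2, 3, 5):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(one_request, range(workers * 50)))
    print(f"{workers} parallel callers: avg {sum(latencies) / len(latencies):.1f} ms")
```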

Regarding the CPU consumption, I've tried this with multiple collection configs and the latency results are always the same; only the CPU and memory usage on the pods changes. Also, as mentioned, the DB is entirely in memory.

I've experimented with different shard_number, replication_factor and segment count values, but the latency is always the same. I think I've assigned more than sufficient hardware resources to the DB, and I've gone through the optimization section of the docs as well.
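For context, these are collection-level settings passed at creation time. A hedged example of setting them with the Python client (all values are illustrative, not the configuration used in this issue):

```python
# Illustrative only: the collection parameters mentioned above, set at creation time.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, OptimizersConfigDiff, VectorParams

client = QdrantClient(host="qdrant.example.internal", port=6333)  # placeholder host

client.create_collection(
    collection_name="my_collection",  # placeholder collection
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
    shard_number=2,        # number of shards the collection is split into
    replication_factor=2,  # copies of each shard across nodes
    optimizers_config=OptimizersConfigDiff(default_segment_number=4),  # target segment count
)
```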

Is there some specific configuration of the collection I should try? (assuming the db is the bottleneck)


generall commented on September 26, 2024

(assuming the db is the bottleneck)

If storage is all in memory and CPU usage on the DB side is small, I don't think the DB is the bottleneck, actually.

I would try to check the client's CPU usage. You can also check GET /telemetry to verify that latency, as observed from the DB side, correlates with the client measurements.
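A minimal sketch of pulling that telemetry snapshot over HTTP; the host is a placeholder and the exact fields returned depend on the Qdrant version, so inspect the payload rather than relying on specific keys.

```python
# Minimal sketch: fetch Qdrant's telemetry snapshot to compare server-side
# request timings with client-side measurements. Host is a placeholder.
import json

import requests

resp = requests.get("http://qdrant.example.internal:6333/telemetry", timeout=10)
resp.raise_for_status()
telemetry = resp.json()["result"]

# The "requests" section holds per-endpoint counters and timings; field names
# vary between Qdrant versions, so dump it and inspect.
print(json.dumps(telemetry.get("requests", {}), indent=2))
```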


