Comments (3)
Latency of the endpoint increases as concurrent users increase.
This is kinda expected. You can't scale indefinitely by just adding parallel calls.
But also considering that
memory and utilization of pods is <1 vCPU during testing.
there are two possibilities, depending on the configuration of the collection: either the bottleneck is on disk, or the bottleneck is on the client side.
Could you please try to do the same with gRPC?
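A minimal sketch of switching the Python client to gRPC, assuming Qdrant's default ports (6333 for REST, 6334 for gRPC) and a placeholder host name. `client_kwargs` and `make_client` are illustrative helpers, not part of qdrant-client; only the `QdrantClient` constructor arguments come from the library.

```python
def client_kwargs(host: str, use_grpc: bool) -> dict:
    """Constructor arguments for QdrantClient; only prefer_grpc differs."""
    return {
        "host": host,
        "port": 6333,       # REST port (Qdrant default, assumed here)
        "grpc_port": 6334,  # gRPC port (Qdrant default, assumed here)
        "prefer_grpc": use_grpc,
    }


def make_client(host: str = "localhost", use_grpc: bool = True):
    """Build a client that talks to Qdrant over gRPC or REST."""
    # Imported lazily so this file still loads where qdrant-client is missing.
    from qdrant_client import QdrantClient

    return QdrantClient(**client_kwargs(host, use_grpc))
```

Running the same load test once with `make_client(use_grpc=False)` and once with `make_client(use_grpc=True)` should indicate whether the REST path on the client contributes to the latency growth.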
from qdrant-client.
Thanks for the response @generall
I know it's not possible to scale indefinitely using parallel calls, as you will hit a bottleneck at some point. However, that limit should be very high compared to the tests I have run, and latency should not increase linearly for such a small number of requests (for example, the /sleep endpoint hits 110ms instead of 100ms). Also, the request I'm sending to Qdrant is the most basic kind of request (a batch recommend request would be heavier on the CPU side). Even when scaling that basic request to 2/3/5 parallel calls, the latency increases almost linearly.
Regarding the CPU consumption, I've tried this with multiple collection configs and the latency results are always the same. Only the CPU and memory usage on the pods changes. Also, as mentioned, the db is entirely in memory.
I've experimented with different shard_numbers, replication_factor and segment_count, but the latency is always the same. I think I've assigned more than sufficient hardware resources to the DB, and I've gone through the optimization section of the docs.
Is there some specific configuration of the collection I should try? (assuming the db is the bottleneck)
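One way to check how much of the increase comes from the client itself is to run the same concurrency sweep against a call that does no server-side work, in the spirit of the /sleep comparison above. The sketch below is only an illustration: `fake_request` is a hypothetical stand-in (a fixed 20 ms wait), and the real async Qdrant call would be passed in its place.

```python
import asyncio
import statistics
import time
from typing import Awaitable, Callable, Dict, Sequence


async def timed(call: Callable[[], Awaitable[object]]) -> float:
    """Run one request and return its latency in milliseconds."""
    start = time.perf_counter()
    await call()
    return (time.perf_counter() - start) * 1000.0


async def latency_at(call, concurrency: int, rounds: int = 5) -> float:
    """Median per-request latency with `concurrency` requests in flight."""
    samples = []
    for _ in range(rounds):
        batch = await asyncio.gather(*(timed(call) for _ in range(concurrency)))
        samples.extend(batch)
    return statistics.median(samples)


async def fake_request() -> None:
    # Hypothetical stand-in for a real request: waits 20 ms, uses no CPU.
    await asyncio.sleep(0.02)


def sweep(call, levels: Sequence[int] = (1, 2, 5, 10)) -> Dict[int, float]:
    """Map each concurrency level to its median latency in milliseconds."""
    async def run():
        return {n: await latency_at(call, n) for n in levels}
    return asyncio.run(run())
```

If the median stays roughly flat for `fake_request` but grows with the real request under the same client code, the growth is less likely to come from the event loop or the benchmark harness itself.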
(assuming the db is the bottleneck)
If storage is all in memory and CPU usage on the DB side is small, I don't think the DB is the bottleneck, actually.
I would check the client's CPU usage. You can also check GET /telemetry to verify that the latency, as observed from the DB side, correlates with the client measurements.
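A rough sketch of pulling the server-side numbers for that comparison, assuming the REST API is reachable at a placeholder base URL without an API key. The layout of the telemetry response is deliberately not assumed here: the hypothetical helper `find_durations` simply walks the JSON and reports every numeric field whose name contains "duration".

```python
import json
import urllib.request


def fetch_telemetry(base_url: str = "http://localhost:6333") -> dict:
    """GET /telemetry from a Qdrant node and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/telemetry", timeout=10) as resp:
        return json.load(resp)


def find_durations(node, path=()):
    """Yield (path, value) for each numeric field named like a duration."""
    if isinstance(node, dict):
        for key, value in node.items():
            if "duration" in key and isinstance(value, (int, float)):
                yield path + (key,), value
            else:
                yield from find_durations(value, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_durations(item, path + (str(index),))


# Usage against a running node (not executed here):
#   for path, value in find_durations(fetch_telemetry()):
#       print(" / ".join(path), value)
```

Comparing these server-reported durations with the latencies measured on the client gives a first indication of where the extra time is spent.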