A product suite for putting arbitrary models into production for semantic search and retrieval-augmented LLM-chat experiences on your company's data
Documentation • Competitive Debate Search Demo • Competitive Debate Chat Demo • Discord
The objective of this notebook was to assess the feasibility of replacing our system's SVD, Qdrant, with pgvector or lanterndb (roughly PostgreSQL + usearch). An OLTP solution like these PostgreSQL-based ones provides a transactional database with schema and transaction support for both objects and vectors, which eliminates the need for joins across external databases during search operations.
Star us on GitHub at github.com/arguflow/arguflow!
Both pgvector and lanterndb are nearly as fast as Qdrant and can be equally accurate after tuning. This means that you should first place your vectors in both Qdrant and pgvector or lanterndb, then tweak your HNSW index params, `m` and `ef_construction`, until the postgres solution is just as accurate as Qdrant. Following that, move forward with postgres alone.
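The tuning loop described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes Qdrant's results are treated as the reference, that accuracy is measured as top-k overlap (recall@k), and that the index uses pgvector's HNSW syntax with cosine distance. The helper names and the `cards` / `embedding` table and column names are hypothetical, and lanterndb uses its own index syntax that is not shown here.

```python
from typing import Sequence


def hnsw_index_sql(table: str, column: str, m: int, ef_construction: int) -> str:
    """Build a pgvector HNSW index statement for the given build parameters.

    Assumption: cosine distance (vector_cosine_ops); swap the operator class
    if your embeddings are compared with L2 or inner product.
    """
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


def recall_at_k(reference_ids: Sequence[int], candidate_ids: Sequence[int], k: int) -> float:
    """Fraction of the reference top-k that also appears in the candidate top-k."""
    reference = set(reference_ids[:k])
    if not reference:
        return 0.0
    return len(reference & set(candidate_ids[:k])) / len(reference)


def mean_recall(
    reference_runs: Sequence[Sequence[int]],
    candidate_runs: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Average recall@k over a batch of queries (one ID list per query)."""
    scores = [recall_at_k(r, c, k) for r, c in zip(reference_runs, candidate_runs)]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical result IDs for two queries: Qdrant (reference) vs. postgres.
    qdrant_results = [[1, 2, 3, 4], [5, 6, 7, 8]]
    postgres_results = [[2, 1, 9, 4], [5, 6, 7, 8]]

    print(hnsw_index_sql("cards", "embedding", m=16, ef_construction=64))
    print(mean_recall(qdrant_results, postgres_results, k=4))
```

In practice you would rebuild the index with larger `m` and `ef_construction` values and rerun the same query set until the mean recall against Qdrant stops improving, trading index build time and size for accuracy.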
If you are not already using postgres and do not need an ACID-compliant solution, then we would still recommend Qdrant. It has a lot of convenience features, supports quantization, and does not require tuning to be accurate.
- Download the dataset via this link. This is roughly the DebateSum dataset, but with some improved parsing logic and dedup detection, as noted in our docs.
- Place the dataset into the same directory as this notebook
docker compose up -d
cat .env.dist > .env
- Run all cells to reproduce our findings