Comments (3)
Hi @Haskely,
Thanks for your interest in lantern and for commenting on this!
You are right that the benchmarks in the Readme are out of date.
Tembo's findings seem reasonable.
We are doing a major upgrade of lantern with an improved storage layer (see PR1, PR2).
I think pgvector will soon release parallel index builds, which we can compare to our external parallel index builds. pgvector has also merged improvements on this front since we last benchmarked it, and we plan to rerun all those benchmarks after their next release.
So, we are working on improvements, but I won't promise anything here and will let the benchmarks speak for themselves once they are out (which should be within 10 days).
In particular, we are working on supporting the new hardware optimizations from usearch, enabling cpu-specific build flags, adding vector quantization techniques.
In the meantime, here are some of the reasons you might still want to consider lantern:
- Seamless embedding generation and maintenance: you just insert your text or image data into postgres, and we create and maintain the corresponding embeddings using one of the supported open-source or proprietary embedding models
- External index generation, which builds the index in parallel outside of the DB machine and does not hog its resources (can be done with a couple of clicks in our cloud!)
- HNSW Index tuning experiments
from lantern.
Right.
For details on what it involves, you can look at unum-cloud/usearch#335, which adds a storage interface, and #262, which builds on that interface.
@Ngalstyan4 thanks for the update:
> In particular, we are working on supporting the new hardware optimizations from usearch, enabling cpu-specific build flags, adding vector quantization techniques.
I guess it is not as simple as updating the third_party folder to point to a recent USearch?