tinkerbird's Introduction

TinkerBird

TinkerBird is a browser-native vector database designed for efficient storage and retrieval of high-dimensional vectors (embeddings). Its query engine, written in TypeScript, leverages HNSW (Hierarchical Navigable Small World) indexes for fast vector retrieval. The storage layer utilizes IndexedDB, which could be extended with an lru-cache.

By co-locating data and embeddings, TinkerBird eliminates the network round trip and reduces reliance on server-side interactions for vector search workloads. With TinkerBird, sensitive data remains local, so applications benefit from vector search without the associated cost, compliance, and security risks.

TinkerBird uses IndexedDB as its storage layer, which in turn builds upon Blobs and LevelDB storage systems. By using IndexedDB, it benefits from IndexedDB's adoption, stability, and familiarity as a native choice for offline-first workflows.
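As a rough illustration of how an LRU cache could sit in front of an asynchronous store such as IndexedDB, here is a minimal sketch. The `AsyncStore` interface, `MemoryStore`, and `LruCachedStore` are hypothetical names invented for this example and are not part of TinkerBird's API; `MemoryStore` stands in for IndexedDB so the code also runs outside a browser.

```typescript
// Hypothetical async key-value interface (an assumption for illustration).
// In a browser, an IndexedDB-backed implementation would sit behind it.
interface AsyncStore<V> {
  get(key: string): Promise<V | undefined>;
  set(key: string, value: V): Promise<void>;
}

// In-memory stand-in for the backing store.
class MemoryStore<V> implements AsyncStore<V> {
  private data = new Map<string, V>();
  reads = 0; // counts how often the backing store is actually read

  async get(key: string): Promise<V | undefined> {
    this.reads++;
    return this.data.get(key);
  }

  async set(key: string, value: V): Promise<void> {
    this.data.set(key, value);
  }
}

// Read-through, write-through LRU cache. A Map keeps insertion order,
// so its first key is always the least recently used entry.
class LruCachedStore<V> implements AsyncStore<V> {
  private cache = new Map<string, V>();
  private backing: AsyncStore<V>;
  private capacity: number;

  constructor(backing: AsyncStore<V>, capacity: number) {
    this.backing = backing;
    this.capacity = capacity;
  }

  // Move the key to the "most recent" end and evict the oldest if over capacity.
  private touch(key: string, value: V): void {
    this.cache.delete(key);
    this.cache.set(key, value);
    if (this.cache.size > this.capacity) {
      const oldest = this.cache.keys().next().value as string;
      this.cache.delete(oldest);
    }
  }

  async get(key: string): Promise<V | undefined> {
    if (this.cache.has(key)) {
      const hit = this.cache.get(key) as V;
      this.touch(key, hit);
      return hit;
    }
    const value = await this.backing.get(key);
    if (value !== undefined) this.touch(key, value);
    return value;
  }

  async set(key: string, value: V): Promise<void> {
    await this.backing.set(key, value);
    this.touch(key, value);
  }
}
```

Writing through to the backing store on every `set` keeps the persisted data authoritative, so the cache can be dropped at any time without losing vectors.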

Examples

Here's a sample app built using TinkerBird. Check out Tinkerboard and Source.

Contributing

Feel free to contribute to TinkerBird by sending us your suggestions, bug reports, or cat videos. Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

License

Distributed under the MIT License. See LICENSE for more information. TinkerBird is provided "as is" and comes with absolutely no guarantees. We take no responsibility for irrelevant searches, confused users, or existential crises induced by unpredictable results. If it breaks, well, that's your problem now! jk.

tinkerbird's People

Contributors

wizenheimer

tinkerbird's Issues

HNSW index fails integrity checks after loading index from IndexedDB

Issue Summary:

When querying the HNSW index after loading it, the query results differ from those obtained when building the index from scratch. The discrepancy is observed despite using the same query and similarity functions. Debugging reveals that the entry point ID and visited nodes are consistent, but the query results are not matching expectations.

Steps to Reproduce:

  1. Build the HNSW index from scratch.
  2. Perform a query on the index and note the results.
  3. Save the index.
  4. Load the index from storage.
  5. Perform the same query on the loaded index.
  6. Observe that the query results differ from the initial results.
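The steps above can be turned into an automated round-trip check. The following is only a sketch under stated assumptions: `SearchIndex`, `BruteForceIndex`, `serialize`, `deserialize`, and `sameHits` are hypothetical names, not TinkerBird's API, and a brute-force cosine index stands in for HNSW so the example stays self-contained. The same comparison could be pointed at the real index's build, save, and load methods.

```typescript
type Hit = { id: number; score: number };

// Hypothetical minimal index interface (an assumption for illustration).
interface SearchIndex {
  query(vector: number[], k: number): Hit[];
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Exact (brute-force) index used here in place of HNSW.
class BruteForceIndex implements SearchIndex {
  private vectors: Map<number, number[]>;

  constructor(vectors: Map<number, number[]> = new Map<number, number[]>()) {
    this.vectors = vectors;
  }

  add(id: number, vector: number[]): void {
    this.vectors.set(id, vector);
  }

  query(vector: number[], k: number): Hit[] {
    const hits: Hit[] = [];
    this.vectors.forEach((v, id) => hits.push({ id, score: cosine(vector, v) }));
    // Highest score first; ties broken by id so ordering is deterministic.
    return hits.sort((x, y) => y.score - x.score || x.id - y.id).slice(0, k);
  }

  serialize(): string {
    return JSON.stringify(Array.from(this.vectors.entries()));
  }

  static deserialize(text: string): BruteForceIndex {
    return new BruteForceIndex(new Map<number, number[]>(JSON.parse(text)));
  }
}

// Two result lists match if ids agree in order and scores agree within eps.
function sameHits(a: Hit[], b: Hit[], eps: number = 1e-9): boolean {
  return (
    a.length === b.length &&
    a.every((h, i) => h.id === b[i].id && Math.abs(h.score - b[i].score) <= eps)
  );
}

// Steps 1-2: build and query. Steps 3-4: save and load. Steps 5-6: query and compare.
const built = new BruteForceIndex();
built.add(1, [1, 0]);
built.add(2, [0, 1]);
built.add(3, [1, 1]);

const probe = [1, 0.2];
const before = built.query(probe, 2);
const loaded = BruteForceIndex.deserialize(built.serialize());
const after = loaded.query(probe, 2);

console.log(sameHits(before, after) ? "round trip OK" : "round trip MISMATCH");
```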

Additional Information:

  • The index is saved and loaded using methods that store both node information and metadata (dimension, collection name, neighbors, entry point ID, etc.).
  • The issue persists even after waiting for the loadMeta method to complete before returning from the loadIndex method.
  • Debug logs have been added to the query method, showing that the entry point ID and visited nodes are consistent, but the query results are not.
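Since the entry point and visited nodes already match, one thing worth ruling out is whether node data keeps its shape and precision across the save and load path. This is a guess at where to look, not a confirmed root cause. For example, if any part of the path goes through `JSON.stringify`, a `Float32Array` turns into an index-keyed plain object and a `Map` turns into `{}`. The `NodeRecord` shape, `encode`, and `decode` below are hypothetical and only illustrate the check; TinkerBird's real node layout may differ.

```typescript
// Hypothetical node shape for illustration only.
type NodeRecord = {
  id: number;
  vector: Float32Array;
  neighbors: Map<number, number[]>; // layer -> neighbor ids
};

const sample: NodeRecord = {
  id: 7,
  vector: new Float32Array([0.1, 0.2]),
  neighbors: new Map<number, number[]>([[0, [3, 4]]]),
};

// Naive JSON round trip: the typed array is no longer an array,
// and the Map's entries are silently dropped.
const naive = JSON.parse(JSON.stringify(sample));
console.log(Array.isArray(naive.vector), Object.keys(naive.neighbors).length);

// Explicit conversion keeps both structures recoverable.
function encode(n: NodeRecord): string {
  return JSON.stringify({
    id: n.id,
    vector: Array.from(n.vector),
    neighbors: Array.from(n.neighbors.entries()),
  });
}

function decode(text: string): NodeRecord {
  const raw = JSON.parse(text);
  return {
    id: raw.id,
    vector: new Float32Array(raw.vector),
    neighbors: new Map<number, number[]>(raw.neighbors),
  };
}

const restored = decode(encode(sample));
console.log(restored.vector[0] === sample.vector[0], restored.neighbors.get(0));
```

A related precision effect: storing `0.1` in a `Float32Array` rounds it to the nearest 32-bit float, so an index built from `number[]` vectors and one reloaded as `Float32Array` (or the reverse) can produce slightly different similarity scores for the same query.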

Expected Behavior:

Query results after loading the index should match the results obtained when building the index from scratch, given the same query and similarity functions.

Request for Assistance:

Seeking guidance on identifying the root cause of the inconsistency in query results after loading the index and suggestions for potential debugging or corrective measures.

README refers to a missing `LICENSE` file

First of all, thank you for creating and sharing this project with us!

The README says this project is under the MIT license and refers to a LICENSE file:

tinkerbird/README.md

Lines 231 to 233 in 4c987b6

## License
Distributed under the MIT License. See [LICENSE](LICENSE) for more information.

However, there's no LICENSE file in this repo.

Since the README says the project is MIT-licensed, please add a LICENSE file. You can use the MIT license template at https://choosealicense.com/licenses/mit/ (a site maintained by GitHub) or the text at https://opensource.org/license/mit. I suggest the former, because it provides an already properly formatted and word-wrapped text file that is easier to read.
