Comments (6)
Please look at the benchmarks we provide on various kinds of data:
https://github.com/facebookresearch/faiss/wiki/Indexing-1G-vectors
which show that Faiss does indeed offer some improvement over previous solutions on typical data. Faiss is a library, so please read the papers cited in the wiki, which are the ones introducing the techniques themselves, such as Polysemous codes.
More generally, in my opinion the question is vague, because accelerating exact search (as we propose in Faiss, both for exact distance computation and for the approximate one with PQ) is actually the best way to fight the curse of dimensionality when the conditions are the worst, i.e., for random data with very high D.
Regards
from faiss.
Thanks for the prompt response.
What would you consider a high D? I am talking about 2048 dimensions in my case.
from faiss.
The question is not really the (extrinsic) observable dimensionality, but the true internal complexity of the data. 2048 dimensions generated with rand or randn would be truly high-dimensional, and you would need brute-force search for that (such as IndexFlat, its GPU counterpart, or IndexPQ/LSH).
2048 dimensions coming from the real world (e.g., neural net features) typically give vectors that are easier to index. Only empirical tests will tell you whether you can use a non-exhaustive index, which also depends on the desired accuracy/speed-up trade-off.
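To make the extrinsic/intrinsic distinction concrete, here is a minimal NumPy sketch (not part of Faiss) that counts how many principal components are needed to keep 95% of the variance. The helper name `effective_dim`, the 95% threshold, and the synthetic "structured" data (16 latent factors mixed into 256 dimensions) are assumptions chosen purely for illustration; real features should be measured directly.

```python
import numpy as np

def effective_dim(x, energy=0.95):
    """Number of principal components needed to retain `energy` of the variance."""
    xc = x - x.mean(axis=0)
    # Singular values of the centered data give the PCA spectrum.
    s = np.linalg.svd(xc, compute_uv=False)
    var = s ** 2
    cum = np.cumsum(var) / var.sum()
    # Index of the first component where the cumulative share reaches `energy`.
    return int(np.searchsorted(cum, energy) + 1)

rng = np.random.default_rng(0)
n, d, latent = 2000, 256, 16

# Case 1: i.i.d. Gaussian data, comparable to vectors generated with randn.
x_random = rng.standard_normal((n, d))

# Case 2: hypothetical "real-world-like" data, 16 latent factors plus small noise.
z = rng.standard_normal((n, latent))
mix = rng.standard_normal((latent, d))
x_struct = z @ mix + 0.01 * rng.standard_normal((n, d))

dim_random = effective_dim(x_random)
dim_struct = effective_dim(x_struct)
print("random data    :", dim_random, "of", d, "components")
print("structured data:", dim_struct, "of", d, "components")
```

Both datasets have the same observable dimensionality, but the structured one needs only a small fraction of the components, which is the situation in which non-exhaustive indexes have a chance of working well.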
from faiss.
The dimensions are from neural net features; I will try to observe the performance with different configurations.
from faiss.
If they are neural net features, then they should be highly redundant and I would suggest using the PCA pre-processing (available in Faiss):
https://github.com/facebookresearch/faiss/wiki/Pre--and-post-processing#pre-transforming-the-data
from faiss.
Excellent! Thanks
from faiss.