
faiss's Introduction

Faiss

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed primarily at Meta's Fundamental AI Research group.

News

See CHANGELOG.md for detailed information about the latest features.

Introduction

Faiss contains several methods for similarity search. It assumes that the instances are represented as vectors and are identified by an integer, and that the vectors can be compared with L2 (Euclidean) distances or dot products. Vectors that are similar to a query vector are those that have the lowest L2 distance or the highest dot product with the query vector. It also supports cosine similarity, since this is a dot product on normalized vectors.
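
Since cosine similarity is just the inner product of normalized vectors, a cosine search can be set up as in the minimal sketch below (random data; a sketch only, using the standard Python wrapper functions faiss.normalize_L2 and IndexFlatIP):

import faiss
import numpy as np

d = 64
xb = np.random.random((1000, d)).astype('float32')   # database vectors
xq = np.random.random((5, d)).astype('float32')      # query vectors

faiss.normalize_L2(xb)            # normalize in place, so inner product == cosine
faiss.normalize_L2(xq)

index = faiss.IndexFlatIP(d)      # exact inner-product (maximum dot product) index
index.add(xb)
D, I = index.search(xq, 4)        # similarities and ids of the 4 best matches per query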

Some of the methods, like those based on binary vectors and compact quantization codes, use only a compressed representation of the vectors and do not require keeping the original vectors. This generally comes at the cost of a less precise search, but these methods can scale to billions of vectors in main memory on a single server. Other methods, like HNSW and NSG, add an indexing structure on top of the raw vectors to make searching more efficient.

The GPU implementation can accept input from either CPU or GPU memory. On a server with GPUs, the GPU indexes can be used as drop-in replacements for the CPU indexes (e.g., replace IndexFlatL2 with GpuIndexFlatL2), and copies to/from GPU memory are handled automatically. Results will be faster, however, if both input and output remain resident on the GPU. Both single- and multi-GPU usage are supported.
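
A minimal sketch of the drop-in pattern, assuming a faiss-gpu build and one visible GPU:

import faiss
import numpy as np

d = 64
xb = np.random.random((100000, d)).astype('float32')
xq = np.random.random((100, d)).astype('float32')

res = faiss.StandardGpuResources()                      # GPU scratch memory and streams
cpu_index = faiss.IndexFlatL2(d)                        # exact L2 index on the CPU
gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)   # same index, now on GPU 0

gpu_index.add(xb)                 # input from host memory is copied over automatically
D, I = gpu_index.search(xq, 10)   # results come back as numpy arrays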

Installing

Faiss comes with precompiled libraries for Anaconda in Python; see faiss-cpu and faiss-gpu. The library is mostly implemented in C++, and the only hard dependency is a BLAS implementation. Optional GPU support is provided via CUDA, and the Python interface is also optional. It compiles with CMake. See INSTALL.md for details.

How Faiss works

Faiss is built around an index type that stores a set of vectors, and provides a function to search in them with L2 and/or dot product vector comparison. Some index types are simple baselines, such as exact search. Most of the available indexing structures correspond to various trade-offs with respect to

  • search time
  • search quality
  • memory used per index vector
  • training time
  • adding time
  • need for external data for unsupervised training

The optional GPU implementation provides what is likely (as of March 2017) the fastest exact and approximate (compressed-domain) nearest neighbor search implementation for high-dimensional vectors, fastest Lloyd's k-means, and fastest small k-selection algorithm known. The implementation is detailed here.
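
The trade-offs listed above are typically selected through an index_factory string; the sketch below builds a compressed IVF+PQ index this way. The factory string and parameter values are only an example, not a recommendation:

import faiss
import numpy as np

d = 128
xt = np.random.random((50000, d)).astype('float32')    # training sample
xb = np.random.random((100000, d)).astype('float32')   # database vectors

# "IVF1024,PQ16": 1024 inverted lists plus 16-byte PQ codes -- less memory and
# faster search than the exact IndexFlatL2 baseline, at some cost in accuracy.
index = faiss.index_factory(d, "IVF1024,PQ16")
index.train(xt)                   # unsupervised training on a representative sample
index.add(xb)
index.nprobe = 16                 # search-time knob: more lists probed = better recall
D, I = index.search(xb[:5], 10)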

Full documentation of Faiss

The following are entry points for documentation:

Authors

The main authors of Faiss are:

  • Hervé Jégou initiated the Faiss project and wrote its first implementation
  • Matthijs Douze implemented most of the CPU Faiss
  • Jeff Johnson implemented all of the GPU Faiss
  • Lucas Hosseini implemented the binary indexes and the build system
  • Chengqi Deng implemented NSG, NNdescent and much of the additive quantization code.
  • Alexandr Guzhva contributed many optimizations: SIMD, memory allocation and layout, fast decoding kernels for vector codecs, etc.
  • Gergely Szilvasy implemented the build system and the benchmarking framework.

Reference

References to cite when you use Faiss in a research paper:

@article{douze2024faiss,
      title={The Faiss library},
      author={Matthijs Douze and Alexandr Guzhva and Chengqi Deng and Jeff Johnson and Gergely Szilvasy and Pierre-Emmanuel Mazaré and Maria Lomeli and Lucas Hosseini and Hervé Jégou},
      year={2024},
      eprint={2401.08281},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

For the GPU version of Faiss, please cite:

@article{johnson2019billion,
  title={Billion-scale similarity search with {GPUs}},
  author={Johnson, Jeff and Douze, Matthijs and J{\'e}gou, Herv{\'e}},
  journal={IEEE Transactions on Big Data},
  volume={7},
  number={3},
  pages={535--547},
  year={2019},
  publisher={IEEE}
}

Join the Faiss community

For public discussion of Faiss or for questions, there is a Facebook group at https://www.facebook.com/groups/faissusers/

We monitor the issues page of the repository. You can report bugs, ask questions, etc.

Legal

Faiss is MIT-licensed; refer to the LICENSE file in the top-level directory.

Copyright © Meta Platforms, Inc. See the Terms of Use and Privacy Policy for this project.

faiss's People

Contributors

abdelrahmanelmeniawy, ailzhang, alexanderguzhva, algoriddle, ava57r, beauby, bladepan, borismansencal, chasingegg, cjnolet, denisyaroshevskiy, enet4, glutamatt, h-vetinari, hhy3, jinhai-cn, junjieqi, kaelen-moda, kinglittleq, kuarora, kyamagu, mdouze, mlomeli1, pletessier, r-barnes, ramilbakhshyiev, vorj, wickedfoo, wx257osn2, wxingda


faiss's Issues

Use setuptools to install Python package

I think the lib should have a setup.py install script, so one can easily install it with python setup.py install. What do you think?

I created a first version in this branch: https://github.com/brunodoamaral/faiss/tree/setup-py

The only problem is that I build the .so file using make, instead of creating an Extension in setup.py (I tried that before, but couldn't get the makefile.inc parameters to work inside setuptools).

faiss::Clustering::train failed

Full log:

Clustering 138982590 points in 64D to 4096 clusters, redo 2 times, 40 iterations
Preprocessing in 17.1445 s
Outer iteration 0 / 2
Iteration 39 (4086.49 s, search 3227.36 s): objective=1.88871e+07 imbalance=1.711 nsplit=0
Keep new clusters
Outer iteration 1 / 2
Faiss assertion index.ntotal == 0 failed in virtual void faiss::Clustering::train(faiss::Clustering::idx_t, const float*, faiss::Index&) at Clustering.cpp:152Aborted (core dumped)

Used code from https://github.com/facebookresearch/faiss/wiki/GPU-k-means-example
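
One possible workaround (an untested sketch, not an official fix, reusing index, d and x from the GPU k-means example; nclusters stands for the number of centroids): Clustering::train asserts that the index it clusters into is empty, so reset the quantizer index before each training run and avoid the multi-redo path.

index.reset()                     # drop centroids left over from a previous run
clus = faiss.Clustering(d, nclusters)
clus.niter = 40
clus.nredo = 1                    # sidestep the redo loop that triggered the assertion
clus.train(x, index)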

Cmake build scripts?

While the current Makefile solution is very nice to tinker with and easy to debug, it might be beneficial to have a parallel CMake solution - that could potentially benefit a bunch of software packages that use CMake and want to depend on faiss.

(chatted with @mdouze offline and maybe this is something that the community can collectively maintain? :) )

How can I use faiss for faces clustering?

Hi, I tested Faiss to search for faces in a big face set and got very good results.
A face feature is a 128-dimensional vector, and the distance used is the L2 norm.
But I don't know how to cluster the faces into groups of similar faces when I don't know the cluster count.
Moreover, the face set may grow over time.
I can set a similarity threshold.
Please suggest a good clustering method or algorithm.
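
A rough sketch of one possible approach (not from the Faiss docs): index the face features, use range_search with your similarity threshold to find all pairs within a squared-L2 radius, and merge them with a tiny union-find. New faces can be added and the grouping recomputed the same way.

import faiss
import numpy as np

def group_faces(feats, threshold):
    """feats: (n, 128) float32 descriptors; threshold: squared-L2 radius to tune."""
    index = faiss.IndexFlatL2(feats.shape[1])
    index.add(feats)
    lims, D, I = index.range_search(feats, threshold)   # all neighbors within the radius

    parent = list(range(len(feats)))                    # union-find over neighbor pairs
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for i in range(len(feats)):
        for j in I[lims[i]:lims[i + 1]]:
            parent[find(int(j))] = find(i)
    return [find(i) for i in range(len(feats))]         # one group label per face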

Using as a library

How could I use this as a library for my simple program, where I am searching for similar strings from a set of a billion strings for a query string?

Is there a persistence function [python]?

Hi everyone

Is there a way to save the cluster/index to a local file?

For example:

ncentroids = 1024
niter = 20
verbose = True
d = x.shape[1]
kmeans = faiss.Kmeans(d, ncentroids, niter=niter, verbose=verbose)
kmeans.train(x)

After kmeans.train(x), I want to save the result to a local file, so that next time I only need to load the file and search the clusters as follows (instead of re-running the clustering on the original data):

D, I = kmeans.index.search(x, 1)
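
A minimal sketch of one way to do this with the standard I/O functions (reusing the kmeans and x variables from above):

import numpy as np

faiss.write_index(kmeans.index, 'kmeans.index')   # flat index holding the trained centroids
np.save('centroids.npy', kmeans.centroids)        # raw centroid matrix, if you need it

# later, in another process:
index = faiss.read_index('kmeans.index')
D, I = index.search(x, 1)                         # assign points to the saved centroids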

curse of dimensionality and faiss

Hello

kNN by its nature suffers from the curse of dimensionality, and my question is whether or not Faiss has done anything to 'improve' these kinds of searches.

Regards

Got *Intel MKL FATAL ERROR* when running Python example

I'm trying faiss on Ubuntu.

Both make and make py ran successfully.
The demo tests/demo_sift1M also ran correctly, so I assumed the C++ core was installed successfully.
Then I went on to check the Python wrapper, with the one-liner documented here:

root@428311598e54:/tmp/faiss-master# LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2016.3.210/linux/mkl/lib/intel64 python -c "import faiss, numpy
faiss.Kmeans(10, 20).train(numpy.random.rand(1000, 10).astype('float32'))"
Failed to load GPU Faiss: No module named swigfaiss_gpu
Faiss falling back to CPU-only.
Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.

LD_LIBRARY_PATH is correctly set, and both modules are available in that path, but it failed with an Intel MKL FATAL ERROR.

root@428311598e54:/tmp/faiss-master# ls /opt/intel/compilers_and_libraries_2016.3.210/linux/mkl/lib/intel64/{libmkl_core.so,libmkl_avx2.so}
/opt/intel/compilers_and_libraries_2016.3.210/linux/mkl/lib/intel64/libmkl_avx2.so
/opt/intel/compilers_and_libraries_2016.3.210/linux/mkl/lib/intel64/libmkl_core.so

Vector normalization

Hello,

Are the input vectors expected to be of approximately the same norm? I didn't see such a requirement in the README, but in practice I have had trouble when the norms of the vectors vary too much.

Thanks in advance,

ImportError: No module named faiss

I have a problem when following the install steps of Faiss, at step 2.

Real-life test
--------------

The following script extends the demo_sift1M test to several types of
indexes:

  python python/demo_auto_tune.py

It will cycle through a few types of indexes and find optimal
operating points. You can play around with the types of indexes.

I get the following:

python python/demo_auto_tune.py
Traceback (most recent call last):
  File "python/demo_auto_tune.py", line 19, in <module>
    import faiss
ImportError: No module named faiss

thank you for your help!

Segfault if quantizer falls out of scope for IndexIVFFlat

I can successfully train the index that is returned by make_index_1 but training the index from make_index_2 causes a segmentation fault. The only difference is that the first function returns the quantizer.

#!/bin/python3

import faiss
import numpy as np

def samples(d, count):
    samples = np.random.random((count, d)).astype('float32')
    samples[:, 0] += np.arange(count) / 1000.
    return samples

def make_index_1(d):
    nlist = 100
    quantizer = faiss.IndexFlatL2(d)
    index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_L2)

    return index, quantizer

def make_index_2(d):
    nlist = 100
    quantizer = faiss.IndexFlatL2(d)
    index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_L2)

    return index

d = 32
examples = samples(d, 10000)
index1, q = make_index_1(d)
index1.train(examples)
print('OK')

index2 = make_index_2(d)
index2.train(examples) # segfaults here
print('OK')
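
A hedged sketch of the usual workaround (continuing the script above): the segfault happens because Python garbage-collects the quantizer while the C++ index still points at it, so hand ownership over to the index before the Python reference goes out of scope.

def make_index_3(d):
    nlist = 100
    quantizer = faiss.IndexFlatL2(d)
    index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_L2)
    index.own_fields = True        # the C++ index now owns (and will free) the quantizer
    quantizer.this.disown()        # stop the SWIG proxy from deleting it as well
    return index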

IOError: [Errno 2] No such file or directory: 'tmp/demo_auto_tune.png'

Dear contributors, I encountered another problem in step 2.
After using the command export PYTHONPATH=.
the command python python/demo_auto_tune.py can be executed;
however, at the end of execution it reports an error like the one below:

cno=12 key=nprobe=4096 perf=0.9870 t=11.214 
Traceback (most recent call last):
  File "python/demo_auto_tune.py", line 168, in <module>
    fig.savefig('tmp/demo_auto_tune.png')
  File "/usr/lib/pymodules/python2.7/matplotlib/figure.py", line 1421, in savefig
    self.canvas.print_figure(*args, **kwargs)
  File "/usr/lib/pymodules/python2.7/matplotlib/backend_bases.py", line 2220, in print_figure
    **kwargs)
  File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_agg.py", line 510, in print_png
    filename_or_obj = open(filename_or_obj, 'wb')
IOError: [Errno 2] No such file or directory: 'tmp/demo_auto_tune.png'

I don't know whether this problem is due to the matplotlib module of my Python installation.
Thank you for your patient reply!

Segmentation fault when running demo_ivfpq_indexing_gpu

After compiling both the CPU and the GPU version, the CPU test completed successfully, but the GPU test failed with a segmentation fault when "Adding the vectors to the index". Unfortunately, the error message is not very verbose:

[0.562 s] Generating 100000 vectors in 128D for training
[0.707 s] Training the index
Training IVF quantizer on 100000 vectors in 128D
Clustering 100000 points in 128D to 1788 clusters, redo 1 times, 10 iterations
Preprocessing in 0.00984204 s
Iteration 9 (0.57 s, search 0.38 s): objective=930934 imbalance=1.255 nsplit=0
computing residuals
training 4 x 256 product quantizer on 16384 vectors in 128D
Training PQ slice 0/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
Preprocessing in 0.000524902 s
Iteration 24 (2.06 s, search 1.68 s): objective=27271.5 imbalance=1.018 nsplit=0
Training PQ slice 1/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
Preprocessing in 0.000452148 s
Iteration 24 (1.76 s, search 1.41 s): objective=27193.4 imbalance=1.016 nsplit=0
Training PQ slice 2/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
Preprocessing in 0.000414062 s
Iteration 24 (1.54 s, search 1.23 s): objective=27230.8 imbalance=1.021 nsplit=0
Training PQ slice 3/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
Preprocessing in 0.000456055 s
Iteration 24 (1.73 s, search 1.39 s): objective=27174 imbalance=1.023 nsplit=0
[8.535 s] storing the pre-trained index to /tmp/index_trained.faissindex
[8.569 s] Building a dataset of 200000 vectors to index
[8.813 s] Adding the vectors to the index
Segmentation fault (core dumped)

I've got an Ubuntu 16.04 system with two GeForce GTX 970s, each with 4 GB of memory. Any idea what I am doing wrong?

OS X make: "Undefined symbols for architecture x86_64"

I cannot compile for OS X (cc @blandinw). It seems related to the standard library not being found correctly, but I'm not familiar with how to correctly configure compilation when using clang installed by homebrew. I also tried installing g++-6 with homebrew and using the other Mac makefile (but switching the binary path to the homebrew binary path). Any advice would be appreciated. Here is my process, on OS X 10.11.6:

$ git clone https://github.com/facebookresearch/faiss.git
$ cd faiss
$ cp example_makefiles/makefile.inc.Mac.brew makefile.inc
$ brew install llvm
$ brew install swig
$ make
/usr/local/opt/llvm/bin/clang++ -o tests/demo_ivfpq_indexing -fPIC -m64 -Wall -g -O3 -msse4 -mpopcnt -fopenmp -Wno-sign-compare -Dnullptr=NULL -I/usr/local/opt/llvm/include -Doverride= tests/demo_ivfpq_indexing.cpp libfaiss.a -g -fPIC -fopenmp -L/usr/local/opt/llvm/lib -framework Accelerate
ld: warning: could not create compact unwind for __ZN5faiss9matrix_qrEiiPf: dwarf uses DW_CFA_GNU_args_size
ld: warning: could not create compact unwind for __ZN5faiss9PCAMatrix10prepare_AbEv: dwarf uses DW_CFA_GNU_args_size
ld: warning: could not create compact unwind for __ZNK5faiss11IndexShards6searchElPKflPfPl: dwarf uses DW_CFA_GNU_args_size
Undefined symbols for architecture x86_64:
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::size() const", referenced from:
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(VectorTransform.o)
      faiss::PolysemousTraining::optimize_reproduce_distances(faiss::ProductQuantizer&) const in libfaiss.a(PolysemousTraining.o)
      faiss::PolysemousTraining::optimize_ranking(faiss::ProductQuantizer&, unsigned long, float const*) const in libfaiss.a(PolysemousTraining.o)
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(MetaIndexes.o)
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::c_str() const", referenced from:
      faiss::PolysemousTraining::optimize_reproduce_distances(faiss::ProductQuantizer&) const in libfaiss.a(PolysemousTraining.o)
      faiss::PolysemousTraining::optimize_ranking(faiss::ProductQuantizer&, unsigned long, float const*) const in libfaiss.a(PolysemousTraining.o)
  "std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::str() const", referenced from:
      faiss::IndexIVFPQ::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFPQR::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFFlat::set_typename() in libfaiss.a(IndexIVF.o)
      faiss::IndexFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexRefineFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexPQ::set_typename() in libfaiss.a(IndexPQ.o)
      faiss::MultiIndexQuantizer::set_typename() in libfaiss.a(IndexPQ.o)
      ...
  "std::allocator<char>::allocator()", referenced from:
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(index_io.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexIVF.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexFlat.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(VectorTransform.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexPQ.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexLSH.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(MetaIndexes.o)
      ...
  "std::allocator<char>::~allocator()", referenced from:
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(index_io.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexIVF.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexFlat.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(VectorTransform.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexPQ.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexLSH.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(MetaIndexes.o)
      ...
  "std::ostream::operator<<(int)", referenced from:
      faiss::IndexLSH::set_typename() in libfaiss.a(IndexLSH.o)
  "std::ostream::operator<<(unsigned long)", referenced from:
      faiss::IndexIVFPQ::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFPQR::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFFlat::set_typename() in libfaiss.a(IndexIVF.o)
      faiss::IndexPQ::set_typename() in libfaiss.a(IndexPQ.o)
      faiss::MultiIndexQuantizer::set_typename() in libfaiss.a(IndexPQ.o)
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*)", referenced from:
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) in libfaiss.a(VectorTransform.o)
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) in libfaiss.a(MetaIndexes.o)
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*, unsigned long)", referenced from:
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(VectorTransform.o)
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(MetaIndexes.o)
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)", referenced from:
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(VectorTransform.o)
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(MetaIndexes.o)
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long)", referenced from:
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(VectorTransform.o)
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(MetaIndexes.o)
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&)", referenced from:
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) in libfaiss.a(VectorTransform.o)
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) in libfaiss.a(MetaIndexes.o)
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)", referenced from:
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(index_io.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexIVF.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexFlat.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(VectorTransform.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexPQ.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(IndexLSH.o)
      faiss::Index::Index(long, faiss::MetricType) in libfaiss.a(MetaIndexes.o)
      ...
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)", referenced from:
      __ZNK5faiss5Index12get_typenameB5cxx11Ev in libfaiss.a(IndexIVFPQ.o)
      __ZNK5faiss5Index12get_typenameB5cxx11Ev in libfaiss.a(index_io.o)
      faiss::Index::Index(faiss::Index const&) in libfaiss.a(index_io.o)
      faiss::PolysemousTraining::PolysemousTraining(faiss::PolysemousTraining const&) in libfaiss.a(index_io.o)
      __ZNK5faiss5Index12get_typenameB5cxx11Ev in libfaiss.a(Index.o)
      __ZNK5faiss5Index12get_typenameB5cxx11Ev in libfaiss.a(IndexIVF.o)
      __ZNK5faiss5Index12get_typenameB5cxx11Ev in libfaiss.a(IndexFlat.o)
      ...
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string()", referenced from:
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(VectorTransform.o)
      faiss::PolysemousTraining::PolysemousTraining() in libfaiss.a(PolysemousTraining.o)
      faiss::PolysemousTraining::PolysemousTraining() in libfaiss.a(PolysemousTraining.o)
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in libfaiss.a(MetaIndexes.o)
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()", referenced from:
      faiss::IndexIVFPQ::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFPQR::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::PolysemousTraining::~PolysemousTraining() in libfaiss.a(IndexIVFPQ.o)
      faiss::Index::~Index() in libfaiss.a(index_io.o)
      faiss::IndexIVFFlat::set_typename() in libfaiss.a(IndexIVF.o)
      faiss::Index::~Index() in libfaiss.a(IndexIVF.o)
      faiss::IndexFlat::set_typename() in libfaiss.a(IndexFlat.o)
      ...
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&)", referenced from:
      faiss::IndexIVFPQ::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFPQR::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFFlat::set_typename() in libfaiss.a(IndexIVF.o)
      faiss::IndexFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexRefineFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexPreTransform::set_typename() in libfaiss.a(VectorTransform.o)
      faiss::IndexPQ::set_typename() in libfaiss.a(IndexPQ.o)
      ...
  "std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)", referenced from:
      faiss::Index::operator=(faiss::Index const&) in libfaiss.a(index_io.o)
  "std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream(std::_Ios_Openmode)", referenced from:
      faiss::IndexIVFPQ::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFPQR::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFFlat::set_typename() in libfaiss.a(IndexIVF.o)
      faiss::IndexFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexRefineFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexPQ::set_typename() in libfaiss.a(IndexPQ.o)
      faiss::MultiIndexQuantizer::set_typename() in libfaiss.a(IndexPQ.o)
      ...
  "std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::~basic_stringstream()", referenced from:
      faiss::IndexIVFPQ::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFPQR::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFFlat::set_typename() in libfaiss.a(IndexIVF.o)
      faiss::IndexFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexRefineFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexPQ::set_typename() in libfaiss.a(IndexPQ.o)
      faiss::MultiIndexQuantizer::set_typename() in libfaiss.a(IndexPQ.o)
      ...
  "std::__throw_length_error(char const*)", referenced from:
      std::vector<std::vector<unsigned char, std::allocator<unsigned char> >, std::allocator<std::vector<unsigned char, std::allocator<unsigned char> > > >::_M_check_len(unsigned long, char const*) const in libfaiss.a(IndexIVFPQ.o)
      std::vector<long, std::allocator<long> >::_M_check_len(unsigned long, char const*) const in libfaiss.a(IndexIVFPQ.o)
      std::vector<unsigned char, std::allocator<unsigned char> >::_M_check_len(unsigned long, char const*) const in libfaiss.a(IndexIVFPQ.o)
      std::vector<float, std::allocator<float> >::_M_check_len(unsigned long, char const*) const in libfaiss.a(IndexIVFPQ.o)
      std::vector<float const*, std::allocator<float const*> >::_M_check_len(unsigned long, char const*) const in libfaiss.a(IndexIVFPQ.o)
      std::vector<float, std::allocator<float> >::_M_check_len(unsigned long, char const*) const in libfaiss.a(index_io.o)
      std::vector<int, std::allocator<int> >::_M_check_len(unsigned long, char const*) const in libfaiss.a(index_io.o)
      ...
  "std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)", referenced from:
      faiss::IndexIVFPQ::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFPQR::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFFlat::set_typename() in libfaiss.a(IndexIVF.o)
      faiss::IndexFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexRefineFlat::set_typename() in libfaiss.a(IndexFlat.o)
      faiss::IndexPQ::set_typename() in libfaiss.a(IndexPQ.o)
      faiss::MultiIndexQuantizer::set_typename() in libfaiss.a(IndexPQ.o)
      ...
  "std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char)", referenced from:
      faiss::IndexRefineFlat::set_typename() in libfaiss.a(IndexFlat.o)
  "std::basic_ostream<char, std::char_traits<char> >& std::operator<<<char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)", referenced from:
      faiss::IndexIVFPQ::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFPQR::set_typename() in libfaiss.a(IndexIVFPQ.o)
      faiss::IndexIVFFlat::set_typename() in libfaiss.a(IndexIVF.o)
      faiss::IndexRefineFlat::set_typename() in libfaiss.a(IndexFlat.o)
ld: symbol(s) not found for architecture x86_64
clang-4.0: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [tests/demo_ivfpq_indexing] Error 1

Assertion always fails with GpuIndexIVFPQ

Here are the details of the error:
Faiss assertion usePrecomputed_ || IVFPQ::isSupportedNoPrecomputedSubDimSize( this->d / subQuantizers_) failed in void faiss::gpu::GpuIndexIVFPQ::assertSettings_() const at GpuIndexIVFPQ.cu:469Aborted (core dumped)

And here is part of my code:

res = faiss.StandardGpuResources()

index = faiss.index_factory(d, "OPQ8,IVF4096,PQ8")

co = faiss.GpuClonerOptions()
co.useFloat16 = False
co.usePrecomputed = True
co.indicesOptions = faiss.INDICES_CPU

index = faiss.index_cpu_to_gpu(res, 0, index, co)  # error happens here

d is 1000.

faiss::gpu::runMatrixMult failure

The full log:
Faiss assertion err == CUBLAS_STATUS_SUCCESS failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with T = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at utils/MatrixMult.cu:141Aborted (core dumped)

I have successfully run demo_ivfpq_indexing_gpu, so I think Faiss was installed successfully.

Pick features from index [python]

Hi everyone, is there a way to pick features out of an index file?

Assuming that I have a large dataset xb, and I have saved the index to a file PATH.
It seems that Faiss has stored xb in the index?

index = faiss.IndexFlatL2(d)
index.add(xb)
faiss.write_index(index, PATH)

Then I load the index, and want to obtain xb[0] from the index.

index = faiss.read_index(PATH)
xb[0] = ?
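
For a flat index the stored vectors can be read back directly; a short sketch using the variables above (reconstruct and reconstruct_n are part of the Index API):

index = faiss.read_index(PATH)
x0 = index.reconstruct(0)          # the stored vector with id 0, as a float32 numpy array
xs = index.reconstruct_n(0, 10)    # vectors 0..9 in one call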

Index intersection

I have a few thousand sets of vectors for which I need to answer ClosestPairOfPoints(A, B) queries, that is, find the elements a from A and b from B that minimize d(a, b). Is there a way I could support such queries with Faiss?

Thanks!
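
There is no built-in closest-pair query, but one can be composed from a 1-nearest-neighbor search; a small sketch assuming A and B are float32 arrays of the same dimensionality:

import faiss
import numpy as np

def closest_pair(A, B):
    """Return (index in A, index in B, squared L2 distance) of the closest pair."""
    index = faiss.IndexFlatL2(B.shape[1])
    index.add(B)                       # index one set (ideally the larger one)...
    D, I = index.search(A, 1)          # ...and query it with the other
    a = int(np.argmin(D[:, 0]))
    return a, int(I[a, 0]), float(D[a, 0])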

Make fails with errors on ubuntu 14.04 with OpenBlas

After installing OpenBLAS with
apt-get update && apt-get install -y libopenblas-dev
and changing makefile.inc (here is my makefile.inc: https://github.com/AKSHAYUBHAT/DeepVideoAnalytics/blob/master/faiss/makefile.inc),

make fails with the following error:

g++ -fPIC -m64 -Wall -g -O3  -msse4 -mpopcnt -fopenmp -Wno-sign-compare -Dnullptr=NULL -Doverride= -fopenmp -c AutoTune.cpp -o AutoTune.o
g++ -fPIC -m64 -Wall -g -O3  -msse4 -mpopcnt -fopenmp -Wno-sign-compare -Dnullptr=NULL -Doverride= -fopenmp -c AuxIndexStructures.cpp -o AuxIndexStructures.o  
ar r libfaiss.a hamming.o utils.o IndexFlat.o IndexIVF.o IndexLSH.o IndexPQ.o IndexIVFPQ.o Clustering.o Heap.o VectorTransform.o index_io.o PolysemousTraining.o MetaIndexes.o Index.o ProductQuantizer.o AutoTune.o AuxIndexStructures.o
ar: creating libfaiss.a
g++ -o tests/demo_ivfpq_indexing -fPIC -m64 -Wall -g -O3  -msse4 -mpopcnt -fopenmp -Wno-sign-compare -Dnullptr=NULL -Doverride= -fopenmp tests/demo_ivfpq_indexing.cpp libfaiss.a -g -fPIC  -fopenmp /usr/lib/libopenblas.so.0
libfaiss.a(utils.o): In function `faiss::matrix_qr(int, int, float*)':
/root/DVA/faiss/utils.cpp:1215: undefined reference to `sgeqrf_'
/root/DVA/faiss/utils.cpp:1220: undefined reference to `sgeqrf_'
/root/DVA/faiss/utils.cpp:1223: undefined reference to `sorgqr_'
libfaiss.a(VectorTransform.o): In function `faiss::OPQMatrix::train(long, float const*)':
/root/DVA/faiss/VectorTransform.cpp:603: undefined reference to `sgesvd_'
/root/DVA/faiss/VectorTransform.cpp:594: undefined reference to `sgesvd_'
libfaiss.a(VectorTransform.o): In function `faiss::PCAMatrix::train(long, float const*)':
/root/DVA/faiss/VectorTransform.cpp:284: undefined reference to `ssyev_'
/root/DVA/faiss/VectorTransform.cpp:289: undefined reference to `ssyev_'
collect2: error: ld returned 1 exit status
make: *** [tests/demo_ivfpq_indexing] Error 1


Feature: Excluding vectors from search

Imagine a set of 1000 vectors, where each vector is linked to an entity (images in my case), and let's say I perform a kNN lookup with k = 5. Now I would like to examine the scenario of dynamically excluding vectors from the search based on some criteria.

Is this even possible?
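
There is no filter argument on search in this version, but two simple patterns come close (a hedged sketch, not an official feature): permanently drop vectors with remove_ids, or over-fetch and filter the excluded ids on the Python side, as below.

import numpy as np

def search_excluding(index, xq, k, excluded_ids):
    """Over-fetch, then drop excluded ids -- works with any index type."""
    excluded = set(int(i) for i in excluded_ids)
    D, I = index.search(xq, k + len(excluded))        # fetch enough spare candidates
    out_d, out_i = [], []
    for drow, irow in zip(D, I):
        keep = [(d, i) for d, i in zip(drow, irow) if int(i) not in excluded][:k]
        out_d.append([d for d, _ in keep])
        out_i.append([i for _, i in keep])
    return np.array(out_d), np.array(out_i)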

faiss::gpu::ToGpuClonerMultiple failed

Full log:

[email protected]:~/projects/faiss$ python benchs/bench_gpu_1bn.py Deep1B OPQ20_80,IVF262144,PQ20 -nnn 10 -R 2 -ngpu 1 -altadd -noptables -tempmem $[1024 * 1024 * 1024]
Preparing dataset Deep1B
sizes: B (1000000000, 96) Q (10000, 96) T (10000000, 96) gt (10000, 1)
cachefiles:
/data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
/data/bench_gpu_1bn/cent_Deep1B_OPQ20_80,IVF262144.npy
/data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
preparing resources for 1 GPUs
load /data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
load /data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
CPU index contains 1000000000 vectors, move to GPU
Copy CPU index to 2 sharded GPU indexes
dispatch to GPUs 0:0
python: GpuAutoTune.cpp:223: virtual faiss::Index* faiss::gpu::ToGpuClonerMultiple::clone_Index(const faiss::Index*): Assertion `index->ntotal == res->ntotal' failed.
Aborted (core dumped)

Python example api error

I encountered some problems while running the Python examples

CaydynMacbookPro:faiss caydyn$ python python/demo_auto_tune.py
load data
load GT
prepare criterion
Traceback (most recent call last):
  File "python/demo_auto_tune.py", line 73, in <module>
    crit.set_groundtruth(None, gt)
  File "/Users/caydyn/Development/faiss/python/swigfaiss.py", line 2053, in set_groundtruth
    def set_groundtruth(self, *args): return _swigfaiss.AutoTuneCriterion_set_groundtruth(self, *args)
TypeError: AutoTuneCriterion_set_groundtruth() takes exactly 4 arguments (3 given)

CaydynMacbookPro:faiss caydyn$ python ./python/demo_ivfpq.py
loading database
Traceback (most recent call last):
  File "./python/demo_ivfpq.py", line 43, in <module>
    gt_index.add(xb)
  File "/Users/caydyn/Development/faiss/python/swigfaiss.py", line 1143, in add
    def add(self, *args): return _swigfaiss.IndexFlat_add(self, *args)
TypeError: IndexFlat_add() takes exactly 3 arguments (2 given)

How should I deal with these errors?

Tensorflow inception3 integration

I have a set of 10k images and have extracted their features using the Inception v3 model with TensorFlow.
I use spatial.distance.cdist() from scipy to perform the kNN, and I am wondering if Faiss can be integrated with, or even replace, my current approach.

Regards
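
A minimal sketch of how the cdist-based kNN could be swapped for an exact Faiss search (assuming float32 feature matrices; the dimensionality is whatever your Inception v3 feature layer produces):

import faiss
import numpy as np

def knn_faiss(feats, queries, k=5):
    """feats: (n, d) float32 gallery features; queries: (nq, d) float32 query features."""
    index = faiss.IndexFlatL2(feats.shape[1])   # exact search, same neighbors as cdist
    index.add(feats)
    D, I = index.search(queries, k)             # squared-L2 distances and indices
    return D, I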

faiss ids

Faiss assertion !"add_with_ids not implemented for this type of index" failed in virtual void faiss::Index::add_with_ids(faiss::Index::idx_t, const float*, const long int*) at Index.cpp:34

I have read it but do not understand it; do you have an example of this?

my test code:

import numpy as np
d = 64 # dimension
nb = 100000 # database size
nq = 10000 # nb of queries
np.random.seed(1234) # make reproducible
xb = np.random.random((nb, d)).astype('float32')
xb[:, 0] += np.arange(nb) / 1000.
xq = np.random.random((nq, d)).astype('float32')
xq[:, 0] += np.arange(nq) / 1000.

ids=(np.random.random(nb)).astype(int)
for i in range(0,100000):
    ids[i]=long(i)

import faiss # make faiss available
index = faiss.IndexIVF(d) # build the index
print index.is_trained
index.add_with_ids(xb,ids)

print index.ntotal

k = 2 # we want to see 4 nearest neighbors
print xb[:1]
D, I = index.search(xb[:5], k) # sanity check
print I
print D

print "========================================="

print xq
D, I = index.search(xq, k) # actual search
print I[:5] # neighbors of the 5 first queries
print I[-5:] # neighbors of the 5 last queriess
print I[:1][0][0]
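
A hedged sketch of the usual pattern, reusing d, nb, xb, xq and k from the snippet above: flat indexes do not implement add_with_ids, but wrapping them in an IndexIDMap adds that capability (IVF-type indexes support it natively once trained); the ids must be 64-bit integers.

import faiss
import numpy as np

ids = np.arange(nb).astype('int64')            # custom 64-bit ids

index = faiss.IndexIDMap(faiss.IndexFlatL2(d))
index.add_with_ids(xb, ids)
D, I = index.search(xq, k)                     # I now contains the custom ids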

No Derivative license is incompatible with even academic use cases

Hi, I was interested in using Faiss, but the no-derivatives clause in the license makes it impossible to use. For example, if I edit makefile.inc, am I creating a "derivative"?

The Creative Commons license allows redistribution, but the no-derivatives clause prohibits it if the work has been modified. Does that mean I cannot edit the config and then compile and package it in a Docker container?

Also Creative Commons itself discourages its use with software. https://creativecommons.org/faq/#can-i-apply-a-creative-commons-license-to-software

How to save the result of training?

Hi,
index.train(nb, xb); index.add(nb, xb);

If I want to reuse the trained model, do I have to train it again? I found nothing like a saveModel or loadModel method in the code.

Windows

Hi,
Is there a Windows port for this?
Or is there any external effort that provides instructions for how to build this on Windows (using Visual Studio)?

Thanks

write_index error "don't know how to serialize this type of index"

Here is my code:

index = faiss.index_factory(d, "OPQ16_512,IVF1024,PQ16")
co = faiss.GpuClonerOptions()
co.useFloat16 = False
co.usePrecomputed = False
co.indicesOptions = faiss.INDICES_CPU
index = faiss.index_cpu_to_gpu(res, 0, index, co)

index.train(xt)
del xt

index.add(xb)

faiss.write_index(index, 'faiss.index')  # error happens here
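
A possible workaround (hedged sketch, continuing the code above): GPU indexes cannot be serialized directly, so copy the index back to the CPU before writing it.

cpu_index = faiss.index_gpu_to_cpu(index)      # clone the GPU index back to host memory
faiss.write_index(cpu_index, 'faiss.index')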

index.add_with_ids add ids error

Faiss assertion !"add_with_ids not implemented for this type of index" failed in virtual void faiss::Index::add_with_ids(faiss::Index::idx_t, const float*, const long int*) at Index.cpp:34
ids is a long array; why the error?

Serialization support for IndexIDMap

When trying to serialize a faiss.IndexIDMap instance, I get the following exception:

Faiss assertion !"don't know how to serialize this type of index"

Reading through your source, it looks like there simply isn't a conditional block that describes how to serialize an IndexIDMap instance to disk. How would you recommend persisting an IndexIDMap? Is there a timeline for supporting passing IndexIDMap instances to faiss.write_index?

Please add step-by-step instructions for how to compile the Python wrapper

It would be very helpful if you could add the actual commands that are necessary to compile the Python wrappers to the readme.

Right now the instructions assume that I know what I am doing :)

Step 2: Compiling the Python interface

The Python interface is provided via SWIG (Simple Wrapper and
Interface Generator) and an additional level of manual wrappers (in faiss.py).

SWIG generates two wrapper files: a Python file (swigfaiss.py) and a
C++ file that must be compiled to a dynamic library (_swigfaiss.so).

Compiling the C++ code into the dynamic library requires setting:

  • SHAREDFLAGS: system-specific flags to generate a dynamic library

  • PYTHONCFLAGS: include flags for Python

See the example makefile.inc files for how to set the flags.

I think I have figured out step 1 by myself:
swig -c++ -python swigfaiss.swig

which generates several swigfaiss_* .py and .c files. So I assume the final step that I am missing is to make the dynamic library. It would be very helpful if you could list the actual line that needs to be run.

Error when compiling the Python interface with GPU support

Hi,

I am facing a problem when I compile the Python interface with GPU support (steps 1 and 2 are OK). The error is reported as follows:

cd ../python; swig -Doverride= -python -c++ -DGPU_WRAPPER -o ../python/swigfaiss_gpu_wrap.cxx ../swigfaiss.swig
../gpu/StandardGpuResources.h:67: Error: Syntax error in input(3).
make: *** [../python/swigfaiss_gpu_wrap.cxx] error 1

Can anyone help me?

Support for incremental training

Which of the index types support incremental addition of training data (i.e., in batches of 1k vectors)? The distinction between train and add is an additional source of confusion for me, especially because the tutorial makes it seem like the training operation should come first for index types that support it. I'm used to an incremental data-addition flow, followed by a single training operation.
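
A small sketch of the intended flow (assuming the standard Python API): indexes that need training (IVF, PQ, ...) are trained once on a representative sample, and vectors can then be added incrementally in any batch size; IndexFlat* indexes need no training at all.

import faiss
import numpy as np

d = 64
index = faiss.index_factory(d, "IVF256,Flat")

xt = np.random.random((50000, d)).astype('float32')
index.train(xt)                              # one-off training on a sample

for _ in range(100):                         # afterwards, add in whatever batches you like
    batch = np.random.random((1000, d)).astype('float32')
    index.add(batch)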

nvcc fatal : Don't know what to do with '/usr/lib/lapack/liblapack.so.3.0' make: *** [../python/_swigfaiss_gpu.so]

I encountered the following error when running make for ../python/_swigfaiss_gpu.so.

/usr/local/cuda-7.5//bin/nvcc -I.. -I /usr/local/cuda-7.5//targets/x86_64-linux/include/ -Xcompiler -fPIC -Xcudafe --diag_suppress=unrecognized_attribute -gencode arch=compute_35,code="compute_35" -gencode arch=compute_52,code="compute_52" -gencode arch=compute_52,code="compute_52" --std c++11 -lineinfo -ccbin g++ -DFAISS_USE_FLOAT16 -I/usr/include/python2.7/ -I/usr/lib64/python2.7/site-packages/numpy/core/include/ -shared -o ../python/_swigfaiss_gpu.so ../python/swigfaiss_gpu_wrap.cxx libgpufaiss.a ../libfaiss.a
-Xcompiler -fopenmp -lcublas
-Xlinker /usr/lib/libopenblas.so.0 /usr/lib/lapack/liblapack.so.3.0
nvcc fatal : Don't know what to do with '/usr/lib/lapack/liblapack.so.3.0'
make: *** [../python/_swigfaiss_gpu.so] Error 1

I use CUDA 7.5. Is CUDA 8.0 necessary for compiling the GPU version?
The system is Ubuntu 14.04.1.

Error when using IVFPQ

Faiss assertion err == cudaSuccess failed in char* faiss::gpu::StackDeviceMemory::Stack::getAlloc

I met a problem when I compiled tests/test_blas.cpp on ubuntu

I am a student, and recently I have become interested in image retrieval. When I downloaded the code and started to compile it, I ran into some trouble. I have already installed CLAPACK, but there are still some errors.
If I use "make tests/test_blas", I get the message below.
[screenshot]
If I use "g++ tests/test_blas.cpp", I get another message below.
[screenshot]
It really confuses me. I wonder if you can help me solve this problem.

Segmentation fault on TITAN X (Pascal)

python benchs/bench_gpu_sift1m.py

load data
load GT
============ Exact search
add vectors to index
warmup
benchmark
k=1 0.804 s, R@1 0.9914
k=2 0.786 s, R@1 0.9935
k=4 0.907 s, R@1 0.9935
k=8 0.676 s, R@1 0.9935
k=16 0.912 s, R@1 0.9935
k=32 0.731 s, R@1 0.9935
k=64 0.900 s, R@1 0.9935
k=128 0.752 s, R@1 0.9935
k=256 0.986 s, R@1 0.9935
k=512 2.125 s, R@1 0.9935
k=1024 2.169 s, R@1 0.9935
============ Approximate search
train
WARNING clustering 100000 points to 4096 centroids: please provide at least 159744 training points
add vectors to index
Segmentation fault (core dumped)

k-means memory requirement?

If I have an m x n matrix and want to perform k-means clustering (k clusters), what will the memory requirement be?
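
A rough back-of-the-envelope estimate (hedged; it ignores the overhead of the training loop and assumes float32 data with the default flat k-means):

# data matrix:   m * n * 4 bytes  (must fit in RAM, or be subsampled for training)
# centroids:     k * n * 4 bytes
# assignments:   about m * 12 bytes for per-point ids and distances
m, n, k = 1_000_000, 128, 4096
total_bytes = 4 * m * n + 4 * k * n + 12 * m
print(total_bytes / 2**30, "GiB")            # roughly 0.5 GiB for this example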

Research on L1 distance metric

Hi!

This project outperforms our https://github.com/src-d/kmcuda to a major extent. I think the best way for us to move on is to integrate the parts that are missing (for source{d}) into Faiss. Those are:

  1. Proper K-means centroid initialization instead of random sampling. This includes some high level API improvement.
  2. arccos over the scalar product, a.k.a. angular / proper "cosine" distance. On CUDA this has a negligible performance cost.
  3. Python 3 support (well, that should be easy with Swig).

If you agree with these, I will start making PRs. If you don't, I will have to incorporate Faiss inside kmcuda as the second non-free backend. Of course, I would prefer 1.

<command-line>:0:10: error: expected identifier before numeric constant utils.cpp:47:53: note: in expansion of macro ‘FINTEGER’

Hi!

when I compile the project, an error occurs as follows:
g++ -fPIC -m64 -Wall -g -O3 -msse4 -mpopcnt -fopenmp -Wno-sign-compare -Dnullptr=NULL -Doverride= -fopenmp -c hamming.cpp -o hamming.o
g++ -fPIC -m64 -Wall -g -O3 -msse4 -mpopcnt -fopenmp -Wno-sign-compare -Dnullptr=NULL -Doverride= -fopenmp -c utils.cpp -o utils.o -DFINTEGER=64
g++ -fPIC -m64 -Wall -g -O3 -msse4 -mpopcnt -fopenmp -Wno-sign-compare -Dnullptr=NULL -Doverride= -fopenmp -c IndexFlat.cpp -o IndexFlat.o
g++ -fPIC -m64 -Wall -g -O3 -msse4 -mpopcnt -fopenmp -Wno-sign-compare -Dnullptr=NULL -Doverride= -fopenmp -c IndexIVF.cpp -o IndexIVF.o
g++ -fPIC -m64 -Wall -g -O3 -msse4 -mpopcnt -fopenmp -Wno-sign-compare -Dnullptr=NULL -Doverride= -fopenmp -c IndexLSH.cpp -o IndexLSH.o
<command-line>:0:10: error: expected identifier before numeric constant
utils.cpp:47:53: note: in expansion of macro ‘FINTEGER’
int sgemm_ (const char *transa, const char *transb, FINTEGER *m, FINTEGER *
^
<command-line>:0:10: error: expected ‘,’ or ‘...’ before numeric constant
utils.cpp:47:53: note: in expansion of macro ‘FINTEGER’
int sgemm_ (const char *transa, const char *transb, FINTEGER *m, FINTEGER *
^
utils.cpp:54:24: error: ‘m’ was not declared in this scope
int sgeqrf_ (FINTEGER *m, FINTEGER *n, float *a, FINTEGER *lda,
^
utils.cpp:54:37: error: ‘n’ was not declared in this scope
int sgeqrf_ (FINTEGER *m, FINTEGER *n, float *a, FINTEGER *lda,
^
utils.cpp:54:40: error: expected primary-expression before ‘float’
int sgeqrf_ (FINTEGER *m, FINTEGER *n, float *a, FINTEGER *lda,
^
utils.cpp:54:60: error: ‘lda’ was not declared in this scope
int sgeqrf_ (FINTEGER *m, FINTEGER *n, float *a, FINTEGER *lda,
^
utils.cpp:55:18: error: expected primary-expression before ‘float’
float *tau, float *work, FINTEGER *lwork, FINTEGER *info);
^
utils.cpp:55:30: error: expected primary-expression before ‘float’
float *tau, float *work, FINTEGER *lwork, FINTEGER *info);
^
utils.cpp:55:53: error: ‘lwork’ was not declared in this scope
float *tau, float *work, FINTEGER *lwork, FINTEGER *info);
^
utils.cpp:55:70: error: ‘info’ was not declared in this scope
float *tau, float *work, FINTEGER *lwork, FINTEGER *info);
^
utils.cpp:55:74: error: expression list treated as compound expression in initializer [-fpermissive]
float *tau, float *work, FINTEGER *lwork, FINTEGER *info);
^
utils.cpp:57:23: error: ‘m’ was not declared in this scope
int sorgqr_(FINTEGER *m, FINTEGER *n, FINTEGER *k, float *a,
^
utils.cpp:57:36: error: ‘n’ was not declared in this scope
int sorgqr_(FINTEGER *m, FINTEGER *n, FINTEGER *k, float *a,
^
utils.cpp:57:49: error: ‘k’ was not declared in this scope

I've got an Ubuntu 14.04 system. Any idea what I am doing wrong?
