Comments (9)
Why -ngpu 1?
from faiss.
Because I want to run faiss on the first GPU.
Why -R 2 then?
Oops, my bad. Removed that flag and got this:
[email protected]:~/projects/faiss$ python benchs/bench_gpu_1bn.py Deep1B OPQ20_80,IVF262144,PQ20 -nnn 10 -ngpu 4 -altadd -noptables
Preparing dataset Deep1B
sizes: B (1000000000, 96) Q (10000, 96) T (10000000, 96) gt (10000, 1)
cachefiles:
/data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
/data/bench_gpu_1bn/cent_Deep1B_OPQ20_80,IVF262144.npy
/data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
preparing resources for 4 GPUs
load /data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
load /data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
CPU index contains 1000000000 vectors, move to GPU
copying loaded index to GPUs
IndexShards shard 0 indices 0:250000000
IndexIVFPQ size 250000000 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 1 indices 250000000:500000000
IndexIVFPQ size 250000000 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 2 indices 500000000:750000000
IndexIVFPQ size 250000000 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 3 indices 750000000:1000000000
IndexIVFPQ size 250000000 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
move to GPU done in 203.708 s
search...
0/10000 (0.466 s) Faiss assertion listOffset < listIndices.size() failed in void faiss::gpu::ivfOffsetToUserIndex(long int*, int, int, int, const std::vector<std::vector >&) at impl/RemapIndices.cpp:40
Aborted (core dumped)
With -ngpu 3 the script works normally:
[email protected]:~/projects/faiss$ python benchs/bench_gpu_1bn.py Deep1B OPQ20_80,IVF262144,PQ20 -nnn 10 -ngpu 3 -altadd -noptables
Preparing dataset Deep1B
sizes: B (1000000000, 96) Q (10000, 96) T (10000000, 96) gt (10000, 1)
cachefiles:
/data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
/data/bench_gpu_1bn/cent_Deep1B_OPQ20_80,IVF262144.npy
/data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
preparing resources for 3 GPUs
load /data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
load /data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
CPU index contains 1000000000 vectors, move to GPU
copying loaded index to GPUs
IndexShards shard 0 indices 0:333333333
IndexIVFPQ size 333333333 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 1 indices 333333333:666666666
IndexIVFPQ size 333333333 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 2 indices 666666666:1000000000
IndexIVFPQ size 333333334 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
move to GPU done in 130.713 s
search...
0/10000 (0.561 s) probe=1 : 1.075 s 1-R@1: 0.2334 1-R@10: 0.3384
0/10000 (0.008 s) probe=2 : 0.513 s 1-R@1: 0.3054 1-R@10: 0.4666
0/10000 (0.015 s) probe=4 : 0.523 s 1-R@1: 0.3704 1-R@10: 0.5907
0/10000 (0.015 s) probe=8 : 0.558 s 1-R@1: 0.4193 1-R@10: 0.6998
0/10000 (0.015 s) probe=16 : 0.639 s 1-R@1: 0.4506 1-R@10: 0.7785
0/10000 (0.012 s) probe=32 : 0.780 s 1-R@1: 0.4708 1-R@10: 0.8337
0/10000 (0.018 s) probe=64 : 1.076 s 1-R@1: 0.4810 1-R@10: 0.8693
0/10000 (0.016 s) probe=128: 1.608 s 1-R@1: 0.4858 1-R@10: 0.8863
0/10000 (0.020 s) probe=256: 2.718 s 1-R@1: 0.4895 1-R@10: 0.8962
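The shard boundaries printed in the logs above (four shards of 250000000 vectors, or three shards of 333333333/333333333/333333334) are consistent with an even split where the last shard absorbs the remainder. A minimal sketch of that split (an inference from the log output, not the actual faiss sharding code):

```python
# Reproduce the "IndexShards shard i indices a:b" boundaries from the
# bench_gpu_1bn.py logs, assuming an even floor-division split with the
# remainder going to the last shard (inferred from the 333333334 above).
def shard_ranges(n, ngpu):
    base = n // ngpu
    ranges = []
    for i in range(ngpu):
        lo = i * base
        hi = (i + 1) * base if i < ngpu - 1 else n
        ranges.append((lo, hi))
    return ranges

print(shard_ranges(10**9, 4))  # four shards of 250000000 vectors each
print(shard_ranges(10**9, 3))  # 333333333, 333333333, 333333334
```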
And one more question: do I understand correctly that the tempmem flag limits the maximum possible memory use on a single GPU?
tempmem controls the temporary scratch space in use on the GPU. It should ideally be at least 1 GB at all times.
https://github.com/facebookresearch/faiss/wiki/Faiss-on-the-GPU
The database in question here uses 20 bytes * 1 billion vectors = 20 GB of memory, and thus cannot fit on a single GPU, and likely not on 2 GPUs either.
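The 20 GB figure follows from the PQ20 part of the index key: each database vector is stored as a 20-byte PQ code. A back-of-the-envelope budget for the codes alone (a sketch; the real per-GPU footprint also includes per-vector IDs, IVF list overhead, and the tempmem scratch space):

```python
# Rough memory budget for the PQ codes of Deep1B under PQ20.
nvec = 1_000_000_000   # Deep1B database size
code_bytes = 20        # PQ20 -> 20 bytes per encoded vector

total_gb = nvec * code_bytes / 1e9
print(f"codes total: {total_gb:.0f} GB")

# Sharding across n GPUs divides the code storage roughly evenly.
for ngpu in (1, 2, 3, 4):
    print(f"{ngpu} GPU(s): ~{total_gb / ngpu:.1f} GB of codes per shard")
```

On cards with roughly 12 GB of memory, this is why 1 GPU (20 GB of codes) or 2 GPUs (10 GB each, before IDs and scratch space) are too tight, while 3 or 4 shards fit comfortably.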
As for why it works on 3 GPUs but not 4, this could be a problem with the sharding copy? @mdouze
I can't reproduce the issue; tested on 4x K40 and 4x TitanX. Please provide more context (ldd output, nvidia-smi output, gdb stack trace).
I will try to reproduce the issue this weekend with 4x K80.
Could you try with the current version? It has better low-mem GPU support.
Can't reproduce the bug; thank you for the update.