Comments (9)

mdouze commented on June 7, 2024

Why -ngpu 1?

from faiss.

sharthZ23 commented on June 7, 2024

Because I want to run faiss on the first GPU.

mdouze commented on June 7, 2024

Why -R 2 then?

sharthZ23 commented on June 7, 2024

Oops, my bad. Removed that flag and got this:

[email protected]:~/projects/faiss$ python benchs/bench_gpu_1bn.py Deep1B OPQ20_80,IVF262144,PQ20 -nnn 10 -ngpu 4 -altadd -noptables
Preparing dataset Deep1B
sizes: B (1000000000, 96) Q (10000, 96) T (10000000, 96) gt (10000, 1)
cachefiles:
/data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
/data/bench_gpu_1bn/cent_Deep1B_OPQ20_80,IVF262144.npy
/data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
preparing resources for 4 GPUs
load /data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
load /data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
CPU index contains 1000000000 vectors, move to GPU
copying loaded index to GPUs
IndexShards shard 0 indices 0:250000000
IndexIVFPQ size 250000000 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 1 indices 250000000:500000000
IndexIVFPQ size 250000000 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 2 indices 500000000:750000000
IndexIVFPQ size 250000000 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 3 indices 750000000:1000000000
IndexIVFPQ size 250000000 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
move to GPU done in 203.708 s
search...
0/10000 (0.466 s) Faiss assertion listOffset < listIndices.size() failed in void faiss::gpu::ivfOffsetToUserIndex(long int*, int, int, int, const std::vector<std::vector >&) at impl/RemapIndices.cpp:40Aborted (core dumped)

With -ngpu 3 the script works normally:

[email protected]:~/projects/faiss$ python benchs/bench_gpu_1bn.py Deep1B OPQ20_80,IVF262144,PQ20 -nnn 10 -ngpu 3 -altadd -noptables
Preparing dataset Deep1B
sizes: B (1000000000, 96) Q (10000, 96) T (10000000, 96) gt (10000, 1)
cachefiles:
/data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
/data/bench_gpu_1bn/cent_Deep1B_OPQ20_80,IVF262144.npy
/data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
preparing resources for 3 GPUs
load /data/bench_gpu_1bn/preproc_Deep1B_OPQ20_80.vectrans
load /data/bench_gpu_1bn/Deep1B_OPQ20_80,IVF262144,PQ20.index
CPU index contains 1000000000 vectors, move to GPU
copying loaded index to GPUs
IndexShards shard 0 indices 0:333333333
IndexIVFPQ size 333333333 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 1 indices 333333333:666666666
IndexIVFPQ size 333333333 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
IndexShards shard 2 indices 666666666:1000000000
IndexIVFPQ size 333333334 -> GpuIndexIVFPQ indicesOptions=0 usePrecomputed=0 useFloat16=0 reserveVecs=0
move to GPU done in 130.713 s
search...
0/10000 (0.561 s) probe=1 : 1.075 s 1-R@1: 0.2334 1-R@10: 0.3384
0/10000 (0.008 s) probe=2 : 0.513 s 1-R@1: 0.3054 1-R@10: 0.4666
0/10000 (0.015 s) probe=4 : 0.523 s 1-R@1: 0.3704 1-R@10: 0.5907
0/10000 (0.015 s) probe=8 : 0.558 s 1-R@1: 0.4193 1-R@10: 0.6998
0/10000 (0.015 s) probe=16 : 0.639 s 1-R@1: 0.4506 1-R@10: 0.7785
0/10000 (0.012 s) probe=32 : 0.780 s 1-R@1: 0.4708 1-R@10: 0.8337
0/10000 (0.018 s) probe=64 : 1.076 s 1-R@1: 0.4810 1-R@10: 0.8693
0/10000 (0.016 s) probe=128: 1.608 s 1-R@1: 0.4858 1-R@10: 0.8863
0/10000 (0.020 s) probe=256: 2.718 s 1-R@1: 0.4895 1-R@10: 0.8962

One more question: do I understand correctly that the tempmem flag limits the maximum memory that can be used on a single GPU?

wickedfoo commented on June 7, 2024

tempmem controls the size of the temporary scratch memory used on the GPU. Ideally it should be at least 1 GB at all times.

https://github.com/facebookresearch/faiss/wiki/Faiss-on-the-GPU

The database in question here uses 20 bytes * 1 billion = 20 GB of memory, so it cannot fit on a single GPU, and likely not on 2 GPUs either.
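
The estimate above can be sketched as a small calculation. This is only a rough illustration: the function name `codes_gb_per_gpu` is hypothetical, it assumes the codes are split evenly across GPUs, and it counts the PQ codes only (20 bytes per vector for PQ20), ignoring vector ids, index structures, and temporary memory.

```python
def codes_gb_per_gpu(n_vectors, code_bytes, ngpu):
    """Approximate size of the PQ codes held by each GPU, in GB (10**9 bytes).

    Assumption: codes are spread evenly over the GPUs and nothing else
    (ids, inverted-list overhead, scratch space) is counted.
    """
    return n_vectors * code_bytes / ngpu / 1e9


if __name__ == "__main__":
    n_vectors = 1_000_000_000  # size of the Deep1B database
    code_bytes = 20            # PQ20 stores 20 bytes per vector
    for ngpu in (1, 2, 3, 4):
        gb = codes_gb_per_gpu(n_vectors, code_bytes, ngpu)
        print(f"{ngpu} GPU(s): about {gb:.2f} GB of codes per GPU")
```

Whether a given share fits depends on the memory of the specific cards and on the overhead left out of this estimate.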

As to why it works with 3 GPUs but not with 4, could this be a problem with the sharding copy? @mdouze
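
For reference, the shard boundaries printed in the two logs are consistent with a plain integer-division split of the database. The sketch below reproduces those ranges; the helper `shard_ranges` is hypothetical and is an assumption about how the ranges are derived, not a copy of the benchmark script's code.

```python
def shard_ranges(nb, ngpu):
    """Split nb vectors into ngpu contiguous [start, end) ranges.

    Assumption: boundaries are computed with integer division, which
    matches the 'IndexShards shard i indices a:b' lines in the logs.
    """
    return [(i * nb // ngpu, (i + 1) * nb // ngpu) for i in range(ngpu)]


if __name__ == "__main__":
    nb = 1_000_000_000
    for ngpu in (3, 4):
        print(f"{ngpu} GPUs:")
        for shard, (start, end) in enumerate(shard_ranges(nb, ngpu)):
            print(f"  shard {shard} indices {start}:{end} (size {end - start})")
```

With this split the ranges are contiguous and cover the whole database for both 3 and 4 GPUs, so the split arithmetic alone does not explain the failed assertion.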

mdouze commented on June 7, 2024

I can't reproduce the issue; I tested on 4 K40 and 4 TitanX GPUs. Please provide more context (ldd output, nvidia-smi output, gdb stack trace).

sharthZ23 commented on June 7, 2024

I will try to reproduce the issue this weekend with 4 K80 GPUs.

mdouze commented on June 7, 2024

Could you try with the current version? It has better low-mem GPU support.

sharthZ23 commented on June 7, 2024

I can't reproduce the bug anymore. Thank you for the update.
