
Comments (12)

s-jse avatar s-jse commented on June 12, 2024 1

One thing you can try is to index a smaller subset of the data first. If there are any issues with the installation, for example, they will show up there.
For this, copy the first 10k lines of the collection_all.tsv file, e.g. head -n 10000 collection_all.tsv > collection_10k.tsv, then run make index-wiki split=10k.
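Spelled out as a runnable sketch (with a generated stand-in file in place of WikiChat's real collection_all.tsv, so the snippet is self-contained):

```shell
# Stand-in for collection_all.tsv, just so this sketch runs end to end;
# the real file is the full "id<TAB>passage" collection produced by WikiChat.
seq 1 20000 | awk '{printf "%d\tpassage %d\n", $1, $1}' > collection_all.tsv

# Take the first 10k lines and verify the count before kicking off indexing.
head -n 10000 collection_all.tsv > collection_10k.tsv
wc -l < collection_10k.tsv    # 10000
```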

from wikichat.

deter3 avatar deter3 commented on June 12, 2024 1

"Thanks for sharing. Did you build the index with the CPU? I thought building with the GPU only uses VRAM?"

I did not build the wiki index myself; I just used their prebuilt index directly. 128 GB of RAM is not enough to get the WikiChat app up and running.

I did build an index that is 220 GB in size at 4 bits, with 250 GB of RAM. The source text file is about 18 GB in pickle format. Saving the index chunk by chunk took a much longer time.
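For a rough sense of scale, a ColBERT index stores a compressed residual for every token embedding. A back-of-the-envelope estimate (assumptions: dim=128 as in the config dump below, a 4-byte centroid id per embedding, and roughly 3 billion token embeddings for full Wikipedia; centroids and metadata add more on top):

```python
def approx_index_bytes(num_embeddings, dim=128, nbits=4, id_bytes=4):
    """Rough lower bound: residual codes (dim * nbits bits) plus a
    centroid id per token embedding; ignores centroids and metadata."""
    return num_embeddings * (dim * nbits // 8 + id_bytes)

# ~3 billion token embeddings at 4 bits lands in the same ballpark
# as the 220 GB observed above:
print(approx_index_bytes(3_000_000_000) / 2**30)  # ≈ 190 GiB
```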


s-jse avatar s-jse commented on June 12, 2024

Hi,

This step should be much faster; I have not seen it take this long.
Can you please provide more details about the size of the corpus you are indexing, and the specifications of the machine you are using (operating system, GPU, RAM, disk etc.)?


StevenXHK avatar StevenXHK commented on June 12, 2024

I'm working with i7-12700F, 128GB RAM, Nvidia A2000 12GB.

I attempted to build the index with the latest Wikipedia dump (as of Jan 15, 2024).
It ran for ~50 hrs to get to "Sorting Index", but then it got stuck.

The files with ids 0~1342 are still in \experiments\wikipedia_all\indexes\wikipedia.all.1bits


s-jse avatar s-jse commented on June 12, 2024

How much disk space do you have left before indexing? The size of the index itself is quite large.


StevenXHK avatar StevenXHK commented on June 12, 2024

The drive had 250+ GB left when sorting the index.
I'll make another attempt with 2 x A2000 over the weekend and see if this can be resolved.


StevenXHK avatar StevenXHK commented on June 12, 2024

Got this error with make index-wiki split=10k

#> Starting...
nranks = 2       num_gpus = 2    device=0
nranks = 2       num_gpus = 2    device=1
{
    "query_token_id": "[unused0]",
    "doc_token_id": "[unused1]",
    "query_token": "[Q]",
    "doc_token": "[D]",
    "ncells": null,
    "centroid_score_threshold": null,
    "ndocs": null,
    "load_index_with_mmap": false,
    "index_path": null,
    "nbits": 1,
    "kmeans_niters": 4,
    "resume": false,
    "similarity": "cosine",
    "bsize": 256,
    "accumsteps": 1,
    "lr": 3e-6,
    "maxsteps": 500000,
    "save_every": null,
    "warmup": null,
    "warmup_bert": null,
    "relu": false,
    "nway": 2,
    "use_ib_negatives": false,
    "reranker": false,
    "distillation_alpha": 1.0,
    "ignore_scores": false,
    "model_name": null,
    "query_maxlen": 32,
    "attend_to_mask_tokens": false,
    "interaction": "colbert",
    "dim": 128,
    "doc_maxlen": 140,
    "mask_punctuation": true,
    "checkpoint": "colbert-ir/colbertv2.0",
    "triples": null,
    "collection": "./workdir/en/wikipedia_16_01_2024/collection_10k.tsv",
    "queries": null,
    "index_name": "wikipedia.10k.1bits",
    "overwrite": false,
    "root": "/mnt/d/linux/Colbert/experiments",
    "experiment": "wikipedia_10k",
    "index_root": null,
    "name": "2024-01/24/12.31.01",
    "rank": 0,
    "nranks": 2,
    "amp": true,
    "gpus": 2
}
...
  File "/home/user/anaconda3/envs/wikichat/lib/python3.8/site-packages/faiss/swigfaiss_avx2.py", line 12145, in index_cpu_to_gpu_multiple
    return _swigfaiss_avx2.index_cpu_to_gpu_multiple(provider, devices, index, options)
RuntimeError: Error in virtual void faiss::gpu::StandardGpuResourcesImpl::initializeForDevice(int) at /home/circleci/miniconda/conda-bld/faiss-pkg_1681998300314/work/faiss/gpu/StandardGpuResources.cpp:280: Error: 'err == cudaSuccess' failed: failed to cudaHostAlloc 268435456 bytes for CPU <-> GPU async copy buffer (error 2 out of memory)

So is this a hardware limit then?
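Worth noting: cudaHostAlloc allocates page-locked (pinned) host memory for faiss's CPU <-> GPU copy buffer, so error 2 here is a host-side pinned-memory allocation failure rather than VRAM exhaustion. For scale, the requested size in the traceback converts to:

```python
# Convert the failed cudaHostAlloc request from the traceback into MiB.
nbytes = 268_435_456          # bytes requested for the CPU <-> GPU copy buffer
print(nbytes / 2**20, "MiB")  # 256.0 MiB
```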


s-jse avatar s-jse commented on June 12, 2024

Probably. One more thing to try is to reduce the batch size here: https://github.com/stanford-oval/WikiChat/blob/main/ColBERT/colbert/indexer.py#L62.
The original ColBERT code sets this to 64.
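As a sketch of what that change might look like (the parameter name index_bsize comes from the upstream ColBERT codebase and should be verified against the linked file; this is not the actual contents of indexer.py):

```python
# Hypothetical fragment, not the exact indexer.py source:
config = ColBERTConfig(
    index_bsize=64,   # upstream ColBERT default; lower it further if OOM persists
    nbits=1,
)
```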


StevenXHK avatar StevenXHK commented on June 12, 2024

Hmm, same result..
[screenshot: the same out-of-memory error]

This is weird, though; it's not asking for that much VRAM, and I can see there are still ~15 GB free.
The moment before the out-of-memory error:
[screenshot: GPU memory usage]


StevenXHK avatar StevenXHK commented on June 12, 2024

Probably. One more thing to try is to reduce the batch size here: https://github.com/stanford-oval/WikiChat/blob/main/ColBERT/colbert/indexer.py#L62. The original ColBERT code sets this to 64.

Hi, I'd like to know your system specifications for running index-wiki.
Apart from running an A100 GPU for 20 hrs, what other specs can you share that are required / confirmed to work?


deter3 avatar deter3 commented on June 12, 2024

I'm working with i7-12700F, 128GB RAM, Nvidia A2000 12GB.

I attempted to build index with the latest Wikipedia (as of Jan 15 2024) It ran for ~50hrs to get to "Sorting Index" but then it got stuck.

The files with ids 0~1342 are still in \experiments\wikipedia_all\indexes\wikipedia.all.1bits

128 GB RAM is not enough. I thought it would be enough, but it turned out I had to use 256 GB RAM on EC2.


StevenXHK avatar StevenXHK commented on June 12, 2024

I'm working with i7-12700F, 128GB RAM, Nvidia A2000 12GB.
I attempted to build index with the latest Wikipedia (as of Jan 15 2024). It ran for ~50 hrs to get to "Sorting Index" but then it got stuck.
The files with ids 0~1342 are still in \experiments\wikipedia_all\indexes\wikipedia.all.1bits

128 GB RAM is not enough. I thought it would be enough, but it turned out I had to use 256 GB RAM on EC2.

Thanks for sharing. Did you build the index with the CPU? I thought building with the GPU only uses VRAM?


