Comments (6)
I've just managed to reproduce the prediction step.
I had to move every tensor to the GPU, because they were defaulting to the CPU. I don't know how it worked in the original research code...
However, the memory-usage problem comes from the fact that when the predict method is called, every predicted tensor is kept in memory, and each of them is very heavy.
To solve this, I modified the inference procedure to loop over small batches of data and decode each batch immediately (the decoded version is much smaller).
The problem is made worse by the fact that all of the data (including the train set, 3x the size of the eval set) is loaded, even when you only need the eval set.
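The batching fix described above can be sketched roughly like this. This is a minimal sketch, not the repo's actual code; `model`, `tokenizer`, and `eval_loader` are placeholder names:

```python
import torch

def predict_in_batches(model, tokenizer, eval_loader, device="cuda"):
    # Move the model to the GPU and switch to inference mode.
    model.to(device).eval()
    decoded = []
    with torch.no_grad():
        for batch in eval_loader:
            # Move every input tensor to the device (they default to CPU).
            batch = {k: v.to(device) for k, v in batch.items()
                     if torch.is_tensor(v)}
            out = model.generate(**batch)
            # Decode immediately and keep only the (small) strings,
            # instead of accumulating the heavy prediction tensors.
            decoded.extend(
                tokenizer.batch_decode(out, skip_special_tokens=True)
            )
            del out  # release the batch's tensors before the next step
    return decoded
```

Decoding inside the loop is the key point: only the strings are retained across iterations, so peak memory is bounded by a single batch.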
from mm-cot.
I have encountered the same problem; even 125.50 GB of RAM is not enough. I would like to know which data you are storing on the GPU. Could you please provide more detailed modifications? Thank you very much.
You can find them here and in the rest of the repo: https://github.com/gianfrancodemarco/mm-cot/blob/main/src/data/scienceQA/dataset_std.py
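On the split-loading point from the first comment, a minimal sketch of loading only the split you need, instead of materializing train + val + test together. The file names (`problems.json`, `pid_splits.json`) and key layout are assumptions based on the usual ScienceQA data format, not necessarily what this repo does:

```python
import json

def load_split(problems_path, pid_splits_path, split="test"):
    """Load only the problems belonging to one split."""
    with open(problems_path) as f:
        problems = json.load(f)
    with open(pid_splits_path) as f:
        pid_splits = json.load(f)
    # Keep only the question ids that belong to the requested split,
    # so the (much larger) train set never stays in memory.
    ids = pid_splits[split]
    return {qid: problems[qid] for qid in ids}
```

Since the train set is roughly 3x the size of the eval set, filtering before building the dataset object cuts the resident data to about a quarter.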
I don't know why it doesn't work for me. I replaced the ScienceQADatasetStd and ScienceQADatasetImg classes entirely with the ones you provided, but the same problem occurred.
I am studying the fork you provided. Could you provide the run configuration for the ScienceQA dataset used by https://github.com/gianfrancodemarco/mm-cot/blob/main/experiments/run_experiments.py?
Looking forward to your reply. Thank you very much.
@zhongfansun I don't think you need to use run_experiments.py. You'll find the relevant configurations here: https://github.com/gianfrancodemarco/mm-cot/blob/main/.vscode/launch.json
Related Issues (20)
- How are the vision features generated here? How to view detr.npy and clip.npy images HOT 1
- typo in utils.prompt line 104 and 106 HOT 1
- Implementation Mm-cot HOT 1
- Question: PC requirements
- How to train
- Question about two stages training? HOT 1
- I can't find main_central.py. HOT 1
- ImportError: cannot import name 'Conv2dSame' from 'timm.models.layers' (unknown location) HOT 5
- [17:28:39] [Model]: Loading declare-lab/flan-alpaca-large... HOT 3
- Where is Gold Rationale from? HOT 1
- "blip2_vicuna_instruct" can't find lead to nonetype HOT 1
- Request for Release of Multimodal-CoT Large 738M Model HOT 3
- While running `extract_caption.py`, it raises a lot of garbled text. So will you put the models in `https://huggingface.co/Salesforce/instructblip-vicuna-7b/tree/main` the `llm` folder? HOT 1
- ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`image_ids` in this case) have excessive nesting (inputs type `list` where type `int` is expected). HOT 1
- OverflowError: out of range integral type conversion attempted HOT 3
- Where is the main_central.py
- Can not train on GPU.
- Question on fine-tuning time HOT 1
- How to use the mm-cot frame as a utility library through local LLM? HOT 1
- OverflowError: can't convert negative int to unsigned HOT 1