Comments (10)
I ended up implementing generation manually myself. Here is a version with 30 iterations:
```python
import torch

# Assumes `tokenizer`, `model`, and `test_set_unpacked` are already loaded.
cue = 'Question: Which figure of speech is used in this text? Sing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans. —Homer, The Iliad Context: N/A Options: (A) chiasmus (B) apostrophe Solution:'
input_dict = tokenizer.encode_plus(cue, padding='max_length', return_attention_mask=True, return_tensors='pt')
input_ids = input_dict['input_ids']
attention_mask = input_dict['attention_mask']
image_ids = test_set_unpacked[0]['image_ids'].to(torch.float32).unsqueeze(0)
# Start from all-zero labels and fill them in one token per step.
# (Passing input_ids as labels would overwrite the prompt in place.)
labels = torch.zeros_like(input_ids)
predicted = []
probe = {'input_ids': input_ids, 'attention_mask': attention_mask, 'image_ids': image_ids, 'labels': labels}
for i in range(30):
    res = model(**probe)
    # res[1] holds the logits; pick the most likely token at the current position.
    report = int(torch.argmax(res[1][0, len(predicted)]))
    probe['labels'][0, len(predicted)] = report
    predicted.append(report)
    print(tokenizer.decode(report))
```
Note that this is deterministic (it always takes the argmax), but you can easily change that, for example by sampling from the logits instead.
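To make the decoding non-deterministic, one option is to sample the next token from a temperature-scaled softmax instead of taking the argmax. The following is only a minimal sketch in plain Python; `sample_token`, `greedy_token`, and the `temperature` parameter are illustrative names and not part of the mm-cot code.

```python
import math
import random


def greedy_token(logits):
    """Return the index of the largest logit (the deterministic choice)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_token(logits, temperature=1.0, rng=random):
    """Sample an index from unnormalized logits via temperature-scaled softmax.

    Lower temperatures approach greedy decoding; higher ones add more variety.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - top) for x in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

In the loop above, the equivalent change would be to replace the `torch.argmax` call with a draw from the softmax of the same logits row, for instance using `torch.multinomial`.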
from mm-cot.
@gianfrancodemarco On custom data? As far as I'm aware, the inference scripts only support inference on ground truth data (=evaluation).
For "real" inference, T5.generate() is needed, which currently only supports text inputs. But as I said, I might be wrong.
from mm-cot.
@xaiguy My previous comment wasn't about inference on custom data; it was about reproducing their experiment.
However, we managed to make it also support inference on custom data. This was needed firstly because I don't think using Seq2SeqTrainer to generate text is good practice, and secondly because it would need labels (which I don't have in my case).
I ended up overriding the _prepare_encoder_decoder_kwargs_for_generation method so that it ignores the image_ids input, and prepare_inputs_for_generation so that it also returns the image ids.
I don't know if this is the best way to do it, but it now also works with .generate().
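The shape of that override can be illustrated on a toy model. This is only a sketch of the pattern: `BaseGenerator` and `MultimodalGenerator` are hypothetical stand-ins, not the transformers classes, and the real method signatures differ between library versions. Only the two hook names are taken from the comment above.

```python
class BaseGenerator:
    """Hypothetical stand-in for a text-only seq2seq model with generation hooks."""

    def encode(self, input_ids):
        # A text-only encoder: it accepts input_ids and nothing else.
        return [2 * t for t in input_ids]

    def _prepare_encoder_decoder_kwargs_for_generation(self, input_ids, model_kwargs):
        # Forward every non-decoder kwarg to the encoder and cache its output.
        encoder_kwargs = {
            k: v for k, v in model_kwargs.items() if not k.startswith("decoder_")
        }
        model_kwargs["encoder_outputs"] = self.encode(input_ids, **encoder_kwargs)
        return model_kwargs

    def prepare_inputs_for_generation(self, decoder_input_ids, **model_kwargs):
        # Build the inputs for one decoding step.
        return {
            "decoder_input_ids": decoder_input_ids,
            "encoder_outputs": model_kwargs["encoder_outputs"],
        }


class MultimodalGenerator(BaseGenerator):
    """Same hooks, but image_ids is kept away from the text-only encoder."""

    def _prepare_encoder_decoder_kwargs_for_generation(self, input_ids, model_kwargs):
        # Hide image_ids while the encoder runs, then restore it afterwards.
        image_ids = model_kwargs.pop("image_ids", None)
        model_kwargs = super()._prepare_encoder_decoder_kwargs_for_generation(
            input_ids, model_kwargs
        )
        model_kwargs["image_ids"] = image_ids
        return model_kwargs

    def prepare_inputs_for_generation(self, decoder_input_ids, **model_kwargs):
        # Return the usual step inputs plus the image ids.
        inputs = super().prepare_inputs_for_generation(decoder_input_ids, **model_kwargs)
        inputs["image_ids"] = model_kwargs.get("image_ids")
        return inputs
```

In this toy, the base class raises a TypeError as soon as `image_ids` reaches the encoder, which is why the first hook has to filter it out before delegating.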
from mm-cot.
@sasaadi You can find it here https://github.com/gianfrancodemarco/mm-cot/blob/9e84c2ed2ef6921a56f28911a938b78453496655/src/models/t5_multimodal_generation/model.py#L202
from mm-cot.
bump
from mm-cot.
What do you mean "it's currently not supported yet"? I've managed to use it for inference (but only for the rationale, at the moment)
from mm-cot.
@gianfrancodemarco Thanks, that sounds a lot simpler than what I was trying to do! Were you able to confirm that it's working as intended? For example by comparing results with and without image data.
from mm-cot.
I ran into the same issue. Is there a practical solution to this problem? @xaiguy
from mm-cot.
@xaiguy I don't think I've run that exact experiment, but you can test it if you want!
from mm-cot.
> @xaiguy My previous comment wasn't about inference on custom data; it was about reproducing their experiment. However, we managed to make it also support inference on custom data. This was needed firstly because I don't think using Seq2SeqTrainer to generate text is good practice, and secondly because it would need labels (which I don't have in my case).
> I ended up overriding the _prepare_encoder_decoder_kwargs_for_generation method so that it ignores the image_ids input, and prepare_inputs_for_generation so that it also returns the image ids.
> I don't know if this is the best way to do it, but it now also works with .generate().

Can you please share your code for this?
Thank you.
from mm-cot.