I was trying to use the model for inference, but it's currently not supported yet, rig

T5ForMultimodalGeneration Inference about mm-cot HOT 10 OPEN

amazon-science commented on June 21, 2024

T5ForMultimodalGeneration Inference

from mm-cot.

Comments (10)

kshabahang commented on June 21, 2024

I ended up writing the generation loop manually. Here is a version that runs 30 decoding steps:

```python
import torch

# Assumes `model`, `tokenizer`, and `test_set_unpacked` are already loaded.
cue = 'Question: Which figure of speech is used in this text? Sing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans. —Homer, The Iliad Context: N/A Options: (A) chiasmus (B) apostrophe Solution:'

input_dict = tokenizer.encode_plus(cue, padding='max_length', return_attention_mask=True, return_tensors='pt')
input_ids = input_dict['input_ids']
attention_mask = input_dict['attention_mask']
image_ids = test_set_unpacked[0]['image_ids'].to(torch.float32).unsqueeze(0)
# Separate buffer for the decoded tokens, so writing into it does not overwrite input_ids.
labels = torch.zeros_like(input_ids)

predicted = []
probe = {'input_ids': input_ids, 'attention_mask': attention_mask, 'image_ids': image_ids, 'labels': labels}

model.eval()
with torch.no_grad():
    for i in range(30):
        res = model(**probe)
        # res[1] holds the logits; pick the most likely token at the next position.
        report = int(torch.argmax(res[1][0, len(predicted)]))
        # Feed the chosen token back in through the labels for the next step.
        probe['labels'][0, len(predicted)] = report
        predicted.append(report)
        print(tokenizer.decode(report))
```

Note that this is deterministic (it always takes the argmax), but you can easily change that, for example by sampling from the logits instead.
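One possible way to make the loop non-deterministic is to sample the next token from a temperature-scaled softmax over the logits rather than taking the argmax. The following is only a minimal sketch in plain Python; `sample_token`, `temperature`, and `rng` are hypothetical names introduced here for illustration and do not come from the mm-cot code.

```python
import math
import random


def sample_token(logits, temperature=1.0, rng=random):
    """Sample one index from raw logits using a temperature-scaled softmax.

    Hypothetical helper: lower temperatures behave more like argmax,
    higher temperatures give more varied output.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

In the loop above, this would presumably replace the `torch.argmax` line with something like `report = sample_token(res[1][0, len(predicted)].tolist(), temperature=0.8)`. Staying in PyTorch, `torch.multinomial` applied to `torch.softmax(logits / temperature, dim=-1)` should do the same job.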

xaiguy commented on June 21, 2024

@gianfrancodemarco On custom data? As far as I'm aware, the inference scripts only support inference on ground-truth data (i.e. evaluation).

For "real" inference, T5's generate() is needed, which currently only supports text inputs. But I might be wrong.

gianfrancodemarco commented on June 21, 2024

@xaiguy My previous comment wasn't about inference on custom data; it was about reproducing their experiment.
However, we also managed to make it support inference on custom data. This was needed, first, because I don't think using Seq2SeqTrainer to generate text is good practice and, second, because it would need labels (which I don't have in my case).

I ended up overriding the `_prepare_encoder_decoder_kwargs_for_generation` method so that it ignores the `image_ids` input, and `prepare_inputs_for_generation` so that it also returns the image ids.

I don't know if this is the best way to do it, but it now also works with .generate().
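To illustrate the override pattern described above, here is a toy sketch in plain Python. It is not the linked implementation and not the transformers API: `TextOnlyBase` and `MultimodalModel` are hypothetical stand-ins, and the real method signatures in transformers differ between versions. The only assumption taken from the comment is the idea itself: keep `image_ids` away from the text-only encoder step, then hand it back at every decoding step.

```python
class TextOnlyBase:
    """Hypothetical stand-in for the generation machinery (not the transformers API)."""

    def encode(self, input_ids):
        # Text-only encoder: it knows nothing about images.
        return [float(t) for t in input_ids]

    def _prepare_encoder_decoder_kwargs_for_generation(self, input_ids, model_kwargs):
        # Forwards every extra kwarg to the encoder, so an unknown one raises TypeError.
        model_kwargs = dict(model_kwargs)
        model_kwargs["encoder_outputs"] = self.encode(input_ids, **model_kwargs)
        return model_kwargs

    def prepare_inputs_for_generation(self, decoder_input_ids, **model_kwargs):
        return {
            "decoder_input_ids": decoder_input_ids,
            "encoder_outputs": model_kwargs["encoder_outputs"],
        }


class MultimodalModel(TextOnlyBase):
    def _prepare_encoder_decoder_kwargs_for_generation(self, input_ids, model_kwargs):
        # Hide image_ids from the text-only encoder, then put it back for later steps.
        model_kwargs = dict(model_kwargs)
        image_ids = model_kwargs.pop("image_ids", None)
        model_kwargs = super()._prepare_encoder_decoder_kwargs_for_generation(
            input_ids, model_kwargs
        )
        model_kwargs["image_ids"] = image_ids
        return model_kwargs

    def prepare_inputs_for_generation(self, decoder_input_ids, **model_kwargs):
        # Re-attach image_ids so the forward call at each decoding step receives it.
        inputs = super().prepare_inputs_for_generation(decoder_input_ids, **model_kwargs)
        inputs["image_ids"] = model_kwargs.get("image_ids")
        return inputs
```

With this toy setup, passing `image_ids` through the base class fails at the encoder step, while the subclass carries it through to the per-step inputs. Whether the real model needs exactly this split depends on the transformers version in use.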

gianfrancodemarco commented on June 21, 2024

@sasaadi You can find it here https://github.com/gianfrancodemarco/mm-cot/blob/9e84c2ed2ef6921a56f28911a938b78453496655/src/models/t5_multimodal_generation/model.py#L202

kshabahang commented on June 21, 2024

bump

gianfrancodemarco commented on June 21, 2024

What do you mean by "it's currently not supported yet"? I've managed to use it for inference (but only for the rationale, at the moment).

xaiguy commented on June 21, 2024

@gianfrancodemarco Thanks, that sounds a lot simpler than what I was trying to do! Were you able to confirm that it's working as intended, for example by comparing results with and without image data?

WeixuanXiong commented on June 21, 2024

I ran into the same issue. Is there a practical solution to this problem? @xaiguy

gianfrancodemarco commented on June 21, 2024

@xaiguy I don't think I've run that exact experiment, but you can test it if you want!

sasaadi commented on June 21, 2024

> @xaiguy My previous comment wasn't about inference on custom data; it was about reproducing their experiment. However, we also managed to make it support inference on custom data. This was needed, first, because I don't think using Seq2SeqTrainer to generate text is good practice and, second, because it would need labels (which I don't have in my case).
>
> I ended up overriding the `_prepare_encoder_decoder_kwargs_for_generation` method so that it ignores the `image_ids` input, and `prepare_inputs_for_generation` so that it also returns the image ids.
>
> I don't know if this is the best way to do it, but it now also works with .generate().

Could you please share your code for this?
Thank you.
