yushi-hu / tifa Goto Github PK

View Code? Open in Web Editor NEW

133.0 3.0 8.0 6.23 MB

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

Home Page: https://tifa-benchmark.github.io/

License: Apache License 2.0

Python 88.77% Jupyter Notebook 11.23%

image-to-text large-language-models text-to-image visual-question-answering

tifa's Issues

Fine-tuned Flan-T5 release

Hi,

thanks for the amazing work. Could you maybe provide an estimated time for the fine-tuned Flan-T5 release?

Thanks a lot and looking forward to try it out!

Potentially Missing Questions

In the question_gen.py script, it looks like there are 12 in-context examples. In the paper, it says there are 15 examples. Any chance there are 2 missing?

what is the "ast_indexer" and how to change the path of it?

Thanks for your great work!
when I runing the code it will tell me
modelscope - INFO - Loading ast index from /home/nudt/.cache/modelscope/ast_indexer

however, this path is not convenient for me, how can I make it load from the project directory?

Errors in dictionary handling

In the function tifa_score_single, there are these lines:

        if question_answer_pair['question'] not in question_logs:
            question_logs[question_answer_pair['question']] = question_answer_pair
        choices=question_answer_pair['choices']

Consider changing this to:

        if question_answer_pair['question'] not in question_logs:
            question_logs[question_answer_pair['question']] = copy.deepcopy(question_answer_pair)
        choices=question_answer_pair['choices']

Otherwise, whenever you run
result = tifa_score_single(vqa_model, filtered_questions, img_path)
you are changing the original filtered_questions, and result contain a reference to filtered_questions. Wierd things would happen. For example, if you make a new call with the same filtered_questions, the result from the previous call would be changed.

what's mplug-large

Hello, when I used the script you provided to test, "loading mplug-large" was displayed, but it failed. What is mplug-large? Is there another way I can solve this problem?

what's the minimum gpu to run 1024*1024 size image?

I tried running tifa_test.py on single image with your drawbench_8.jpg (with llama2) and it worked. When i tried running it with my own image(size of 1150*750) and it returned OOM issue. I'm on L4 gpu(24gb vram).
To bypass the issue, I emptied the gpu cache after after llama tasks and ran tifa_score_single.

what your suggestion for minimal vram?

vqa_models misjudge the spatial relationship in most of the cases

Hello, thank you for your repo.

When I test some cases with tifa, I find that the model totally confused right and left, below is an example:

what I generate is as follows:

The result by mPLUG is {'id': 'paintskill_29', 'caption': 'a photo of bird and boat; boat is right to bird', 'question': 'is the boat right to or left to the bird?', 'choices': ['right to', 'left to', 'in front of', 'behind'], 'answer': 'right to', 'element_type': 'spatial', 'element': 'right to', 'free_form_vqa': 'left', 'multiple_choice_vqa': 'left to', 'scores': 0, 'pred_image_path': '/share/project/yhy/project/frag/image_editing_pipeline/baseline/LayoutLLM_T2I_main/auto_RAIG_output/tifa/1111.png'}

Similarly, for the image:

The result is
{'id': 'paintskill_14', 'caption': 'a photo of bike and chair; chair is below bike', 'question': 'is the chair below or above the bike?', 'choices': ['below', 'above', 'next to', 'behind'], 'answer': 'below', 'element_type': 'spatial', 'element': 'below', 'free_form_vqa': 'above', 'multiple_choice_vqa': 'above', 'scores': 0, 'pred_image_path': '/share/project/yhy/project/frag/image_editing_pipeline/baseline/LayoutLLM_T2I_main/auto_RAIG_output/tifa/1072.png'}

Is there something wrong?
Can you help me verify this result to test whether it is the bug of my code? (Actually I barely changed the repository code)

Thank you in advance.

MPLUG model not available

Hi,
It looks like the MPLUG and the ofa-large models are no longer available on HuggingFace.
Can you re-upload them or publish the checkpoints somewhere else?
As MPLUG performed best in your experiments, it would be great to use that model!

Thanks is advance :)

OpenAI API update

Hi
Thank you for your great work. I try to use your repo but so far run into problems when trying to reach openai servers.

Traceback (most recent call last):
  File "/home/anasrezklinux/anas_april/visual_story.py", line 1133, in <module>
    custom_diffusion_inference([character_1, character_2], step, lr)
  File "/home/anasrezklinux/anas_april/visual_story.py", line 512, in custom_diffusion_inference
    TIFA_metric_score, DALL_eval_score, ViTS_16_DINO_embeddings = score_images(image_path, real_photo_path_list, prompt)
  File "/home/anasrezklinux/anas_april/visual_story.py", line 53, in score_images
    return TIFA_metric_score(prompt, image_path),DALL_eval_score(prompt, image_path),[ViTS_16_DINO_embeddings(image_path, real_image_path) for real_image_path in real_image_paths]
  File "/home/anasrezklinux/anas_april/compile_story.py", line 86, in TIFA_metric_score
    gpt3_questions = get_question_and_answers(prompt)
  File "/home/anasrezklinux/anas_april/tifa/tifascore/question_gen.py", line 547, in get_question_and_answers
    resp = openai_completion(this_prompt)
  File "/home/anasrezklinux/anas_april/tifa/tifascore/openai_api.py", line 6, in openai_completion
    resp =  openai.ChatCompletion.create(
  File "/home/anasrezklinux/anas_april/venv/lib/python3.10/site-packages/openai/lib/_old_api.py", line 39, in __call__
    raise APIRemovedInV1(symbol=self._symbol)
openai.lib._old_api.APIRemovedInV1: 

You tried to access openai.ChatCompletion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.

You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface. 

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742

for now I will revert to openai==0.28 , yet, it would be great if you could update this repo :)

yushi-hu / tifa Goto Github PK

tifa's Issues

Fine-tuned Flan-T5 release

Potentially Missing Questions

what is the "ast_indexer" and how to change the path of it?

Errors in dictionary handling

what's mplug-large

what's the minimum gpu to run 1024*1024 size image?

vqa_models misjudge the spatial relationship in most of the cases

MPLUG model not available

OpenAI API update

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent