yushi-hu / tifa Goto Github PK
View Code? Open in Web Editor NEWTIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
Home Page: https://tifa-benchmark.github.io/
License: Apache License 2.0
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
Home Page: https://tifa-benchmark.github.io/
License: Apache License 2.0
Hi,
thanks for the amazing work. Could you maybe provide an estimated time for the fine-tuned Flan-T5 release?
Thanks a lot and looking forward to try it out!
In the question_gen.py
script, it looks like there are 12 in-context examples. In the paper, it says there are 15 examples. Any chance there are 2 missing?
Thanks for your great work!
when I runing the code it will tell me
modelscope - INFO - Loading ast index from /home/nudt/.cache/modelscope/ast_indexer
however, this path is not convenient for me, how can I make it load from the project directory?
In the function tifa_score_single
, there are these lines:
if question_answer_pair['question'] not in question_logs:
question_logs[question_answer_pair['question']] = question_answer_pair
choices=question_answer_pair['choices']
Consider changing this to:
if question_answer_pair['question'] not in question_logs:
question_logs[question_answer_pair['question']] = copy.deepcopy(question_answer_pair)
choices=question_answer_pair['choices']
Otherwise, whenever you run
result = tifa_score_single(vqa_model, filtered_questions, img_path)
you are changing the original filtered_questions
, and result contain a reference to filtered_questions
. Wierd things would happen. For example, if you make a new call with the same filtered_questions
, the result from the previous call would be changed.
I tried running tifa_test.py on single image with your drawbench_8.jpg (with llama2) and it worked. When i tried running it with my own image(size of 1150*750) and it returned OOM issue. I'm on L4 gpu(24gb vram).
To bypass the issue, I emptied the gpu cache after after llama tasks and ran tifa_score_single.
what your suggestion for minimal vram?
Hello, thank you for your repo.
When I test some cases with tifa, I find that the model totally confused right and left, below is an example:
what I generate is as follows:
The result by mPLUG is {'id': 'paintskill_29', 'caption': 'a photo of bird and boat; boat is right to bird', 'question': 'is the boat right to or left to the bird?', 'choices': ['right to', 'left to', 'in front of', 'behind'], 'answer': 'right to', 'element_type': 'spatial', 'element': 'right to', 'free_form_vqa': 'left', 'multiple_choice_vqa': 'left to', 'scores': 0, 'pred_image_path': '/share/project/yhy/project/frag/image_editing_pipeline/baseline/LayoutLLM_T2I_main/auto_RAIG_output/tifa/1111.png'}
Similarly, for the image:
The result is
{'id': 'paintskill_14', 'caption': 'a photo of bike and chair; chair is below bike', 'question': 'is the chair below or above the bike?', 'choices': ['below', 'above', 'next to', 'behind'], 'answer': 'below', 'element_type': 'spatial', 'element': 'below', 'free_form_vqa': 'above', 'multiple_choice_vqa': 'above', 'scores': 0, 'pred_image_path': '/share/project/yhy/project/frag/image_editing_pipeline/baseline/LayoutLLM_T2I_main/auto_RAIG_output/tifa/1072.png'}
Is there something wrong?
Can you help me verify this result to test whether it is the bug of my code? (Actually I barely changed the repository code)
Thank you in advance.
Hi
Thank you for your great work. I try to use your repo but so far run into problems when trying to reach openai servers.
Traceback (most recent call last):
File "/home/anasrezklinux/anas_april/visual_story.py", line 1133, in <module>
custom_diffusion_inference([character_1, character_2], step, lr)
File "/home/anasrezklinux/anas_april/visual_story.py", line 512, in custom_diffusion_inference
TIFA_metric_score, DALL_eval_score, ViTS_16_DINO_embeddings = score_images(image_path, real_photo_path_list, prompt)
File "/home/anasrezklinux/anas_april/visual_story.py", line 53, in score_images
return TIFA_metric_score(prompt, image_path),DALL_eval_score(prompt, image_path),[ViTS_16_DINO_embeddings(image_path, real_image_path) for real_image_path in real_image_paths]
File "/home/anasrezklinux/anas_april/compile_story.py", line 86, in TIFA_metric_score
gpt3_questions = get_question_and_answers(prompt)
File "/home/anasrezklinux/anas_april/tifa/tifascore/question_gen.py", line 547, in get_question_and_answers
resp = openai_completion(this_prompt)
File "/home/anasrezklinux/anas_april/tifa/tifascore/openai_api.py", line 6, in openai_completion
resp = openai.ChatCompletion.create(
File "/home/anasrezklinux/anas_april/venv/lib/python3.10/site-packages/openai/lib/_old_api.py", line 39, in __call__
raise APIRemovedInV1(symbol=self._symbol)
openai.lib._old_api.APIRemovedInV1:
You tried to access openai.ChatCompletion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.
You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface.
Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`
A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742
for now I will revert to openai==0.28 , yet, it would be great if you could update this repo :)
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.