This project is for my thesis. The architecture combines the mPLUG and SimVLM models, with two additional modifications: Text-Guided Attention and Image-Guided Attention.
luong1409 / vqa_thesis