
gemel's Introduction

GEMEL: Generative Multimodal Entity Linking

✨ Overview

This repository contains the official implementation of our LREC-COLING 2024 paper, Generative Multimodal Entity Linking.

GEMEL is a simple yet effective Generative Multimodal Entity Linking framework based on Large Language Models (LLMs), which directly generates target entity names. We keep the vision and language models frozen and train only a feature mapper to enable cross-modality interactions. Extensive experiments show that, with only ~0.3% of the model parameters fine-tuned, GEMEL achieves state-of-the-art results on two well-established MEL datasets, WikiDiverse and WikiMEL. The performance gain stems from mitigating the popularity bias of LLM predictions and disambiguating less common entities effectively. Our framework is compatible with any off-the-shelf language model, paving the way towards an efficient and general solution for utilizing LLMs in the MEL task.
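To give a feel for why training only a feature mapper touches so few parameters, here is a rough back-of-the-envelope sketch. All dimensions are assumptions made for illustration and are not taken from the repository code: a single linear layer (suggested by `linear_4token` in the checkpoint names) that maps a 1024-dimensional visual feature (suggested by the `clip_vit_large_patch14_1024.hdf5` filename) to 4 prefix tokens in a 4096-dimensional embedding space (the hidden size of opt-6.7b), against roughly 6.7 billion frozen language model parameters.

```python
# Back-of-the-envelope parameter count for a linear feature mapper.
# Every number below is an illustrative assumption, not a value read
# from the GEMEL code.

VISUAL_DIM = 1024        # assumed size of one image feature vector
NUM_PREFIX_TOKENS = 4    # assumed number of visual prefix tokens
LLM_HIDDEN_DIM = 4096    # hidden size of opt-6.7b
LLM_PARAMS = 6.7e9       # approximate parameter count of opt-6.7b


def linear_mapper_params(in_dim: int, num_tokens: int, hidden_dim: int) -> int:
    """Weights plus biases of one linear layer: in_dim -> num_tokens * hidden_dim."""
    out_dim = num_tokens * hidden_dim
    return in_dim * out_dim + out_dim


mapper_params = linear_mapper_params(VISUAL_DIM, NUM_PREFIX_TOKENS, LLM_HIDDEN_DIM)
trainable_fraction = mapper_params / LLM_PARAMS

print(f"mapper parameters: {mapper_params:,}")
print(f"trainable fraction: {trainable_fraction:.2%}")
```

Under these assumptions the trainable share comes out at a fraction of one percent, the same order of magnitude as the ~0.3% quoted above. The exact figure depends on the actual mapper design and the backbone used.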

Checkpoints and preprocessed data can be accessed here.

If you have any questions, please contact me by email at [email protected] or open an issue in the repository.

🔥 News

[23.07.14] We released the code and checkpoints of GEMEL.

[24.03.19] We have updated our paper.

🚀 Architecture

Here, you can see the detailed architecture and some experimental analyses of GEMEL.

[Figure: GEMEL architecture and experimental analyses]

🚨 Usage

Environment

conda create -n GEMEL python=3.7
conda activate GEMEL
pip install -r requirements.txt

Install the PyTorch build that matches your CUDA version; the PyTorch website lists the available packages. For CUDA 11.6, we use the following command:

pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116

Data

We have preprocessed the text, image, and knowledge base data. Download the data from here and move it to the ./data folder. Here we offer guidelines on how to build and use a prefix tree for constrained decoding.

train.json, dev.json, test.json         ->      textual data files
clip_vit_large_patch14_1024.hdf5        ->      visual data file
prefix_tree_opt.pkl                     ->      prefix tree of entity names
SimCSE_train_mention_embeddings.pkl     ->      training set mention embeddings
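As a rough illustration of how a prefix tree can constrain decoding to valid entity names, the sketch below builds a trie over token-id sequences and looks up which tokens may follow a given prefix. It is a minimal, self-contained example and not the format stored in `prefix_tree_opt.pkl`: the `PrefixTree` class, the toy `VOCAB`, and the whitespace `encode` helper are all hypothetical stand-ins for the real tokenizer and data structure.

```python
from typing import Dict, Iterable, List, Sequence

# Hypothetical toy vocabulary; a real setup would use the LLM tokenizer.
VOCAB = {"</s>": 0, "Michael": 1, "Jordan": 2, "Jackson": 3, "B.": 4}
EOS_ID = VOCAB["</s>"]


def encode(name: str) -> List[int]:
    """Toy whitespace tokenizer that appends an end-of-sequence id."""
    return [VOCAB[word] for word in name.split()] + [EOS_ID]


class PrefixTree:
    """Trie over token-id sequences, stored as nested dictionaries."""

    def __init__(self, sequences: Iterable[Sequence[int]] = ()) -> None:
        self.root: Dict[int, dict] = {}
        for sequence in sequences:
            self.add(sequence)

    def add(self, sequence: Sequence[int]) -> None:
        node = self.root
        for token_id in sequence:
            node = node.setdefault(token_id, {})

    def allowed_next(self, prefix: Sequence[int]) -> List[int]:
        """Return the token ids that can legally follow `prefix`."""
        node = self.root
        for token_id in prefix:
            node = node.get(token_id)
            if node is None:  # prefix is not part of any entity name
                return []
        return sorted(node)


entity_names = ["Michael Jordan", "Michael Jackson", "Michael B. Jordan"]
tree = PrefixTree(encode(name) for name in entity_names)

print(tree.allowed_next([]))                                   # first token options
print(tree.allowed_next([VOCAB["Michael"]]))                   # continuations of "Michael"
print(tree.allowed_next([VOCAB["Michael"], VOCAB["Jordan"]]))  # only EOS remains
```

At each generation step, the decoder would be restricted to the ids returned by `allowed_next` for the tokens generated so far, so every finished output is an entity name from the knowledge base. With Hugging Face Transformers, one way to apply such a lookup is the `prefix_allowed_tokens_fn` argument of `generate`; whether this repository wires it up that way is an assumption.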

Train

Running main.py without arguments trains on the WikiDiverse dataset with the opt-6.7b model:

python main.py

The model structure is in model.py, the default parameters are in params.py, and most of the data processing is in utils.py.

You can customize the parameter settings; see params.py for details. Here are some examples of how to train GEMEL:

For training with the WikiDiverse dataset:

python main.py --dataset wikidiverse --model_name opt-6.7b --ICL_examples_num 16

For training with the WikiMEL dataset:

python main.py --dataset wikimel --model_name opt-6.7b --ICL_examples_num 16
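The `--ICL_examples_num` flag sets how many in-context examples accompany each input. One plausible way to choose them, sketched below, is to rank training mentions by cosine similarity between precomputed mention embeddings (such as those in `SimCSE_train_mention_embeddings.pkl`) and the embedding of the query mention, then keep the top k. This is an assumption about the selection strategy and not the repository's code; the function names and the toy 2-dimensional vectors are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_examples(
    query: Sequence[float],
    train_embeddings: Sequence[Sequence[float]],
    k: int,
) -> List[int]:
    """Indices of the k training mentions most similar to the query."""
    scored = [
        (cosine_similarity(query, embedding), index)
        for index, embedding in enumerate(train_embeddings)
    ]
    # Highest similarity first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:k]]


# Toy 2-dimensional embeddings standing in for real sentence embeddings.
train_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query_embedding = [1.0, 0.05]

selected = top_k_examples(query_embedding, train_embeddings, k=2)
print(selected)
```

The selected indices would then point at the training examples to place in the prompt ahead of the query mention.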

Test

Download the checkpoint from here and move it to the ./checkpoint folder.

For testing on the WikiDiverse test set:

python infe.py --dataset wikidiverse --model_name opt-6.7b --best_ckpt opt-6.7b_wikidiverse_linear_4token_16examples_82_77.pkl

For testing on the WikiMEL test set:

python infe.py --dataset wikimel --model_name opt-6.7b --best_ckpt opt-6.7b_wikimel_linear_4token_16examples_75_53.pkl

Citation

@article{shi2023generative,
  title={Generative Multimodal Entity Linking},
  author={Shi, Senbao and Xu, Zhenran and Hu, Baotian and Zhang, Min},
  journal={arXiv preprint arXiv:2306.12725},
  year={2023}
}


gemel's Issues

Comparison experiment code

Would it be possible to provide the code for any of the comparison experiments (GENRE and GHMFC)?
