gain's Introduction

USTC-NELSLIP-SemEval2022Task11-GAIN

Winning system (USTC-NELSLIP) of the SemEval-2022 MultiCoNER shared task on 3 of the 13 tracks (Chinese, Bangla, Code-Mixed). Rankings: https://multiconer.github.io/results.

This repository contains the training and prediction code of the system developed by the USTC-NELSLIP team for SemEval-2022 Task 11 MultiCoNER.

We provide code for the two gazetteer-based methods used in our final system: GAIN and weighted summation integration with a gazetteer.

GAIN: Gazetteer-Adapted Integration Network with a CRF classifier, described in Section 3.3 of the paper.

weighted_fusion_crf: Weighted summation integration with a gazetteer, using a CRF classifier, described in Section 3.2 of the paper.
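The two ideas named above can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: the function names, the toy score vectors, the direction of the KL divergence, and the convex-combination form of the weighted sum are all assumptions made for illustration. The actual models operate on Transformer and gazetteer network representations, as described in the paper.

```python
import math


def softmax(scores):
    # Turn raw scores into a probability distribution (numerically stable).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    # Assumption: this direction is chosen only for illustration.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def weighted_sum(lm_vec, gaz_vec, weight):
    # Assumed fusion form: weight * lm + (1 - weight) * gazetteer, element-wise.
    return [weight * a + (1.0 - weight) * b for a, b in zip(lm_vec, gaz_vec)]


# Toy per-token scores standing in for language-model and gazetteer outputs.
lm_scores = [2.0, 0.5, -1.0]
gaz_scores = [1.5, 0.7, -0.5]

# Adaptation-style objective: how far the gazetteer distribution is from the LM's.
adaptation_loss = kl_divergence(softmax(lm_scores), softmax(gaz_scores))

# Fusion-style integration: blend the two representations with a fixed weight.
fused = weighted_sum(lm_scores, gaz_scores, weight=0.7)

print(adaptation_loss, fused)
```

In this sketch the divergence is zero only when the two distributions coincide, which is the intuition behind adapting one network's representations to the other's before integrating them.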

Overall Structure

[Figure: overall structure of the system]

Citation

If you use this code, please cite the paper below:

USTC-NELSLIP at SemEval-2022 Task 11: Gazetteer-Adapted Integration Network for Multilingual Complex Named Entity Recognition

@inproceedings{chen-etal-2022-ustc,
    title = "{USTC}-{NELSLIP} at {S}em{E}val-2022 Task 11: Gazetteer-Adapted Integration Network for Multilingual Complex Named Entity Recognition",
    author = "Chen, Beiduo and Ma, Jun-Yu and Qi, Jiajun and Guo, Wu and Ling, Zhen-Hua and Liu, Quan",
    booktitle = "Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.semeval-1.223",
    pages = "1613--1622",
    abstract = "This paper describes the system developed by the USTC-NELSLIP team for SemEval-2022 Task 11 Multilingual Complex Named Entity Recognition (MultiCoNER). We propose a gazetteer-adapted integration network (GAIN) to improve the performance of language models for recognizing complex named entities. The method first adapts the representations of gazetteer networks to those of language models by minimizing the KL divergence between them. After adaptation, these two networks are then integrated for backend supervised named entity recognition (NER) training. The proposed method is applied to several state-of-the-art Transformer-based NER models with a gazetteer built from Wikidata, and shows great generalization ability across them. The final predictions are derived from an ensemble of these trained models. Experimental results and detailed analysis verify the effectiveness of the proposed method. The official results show that our system ranked 1st on three tracks (Chinese, Code-mixed and Bangla) and 2nd on the other ten tracks in this task.",
}

Getting Started

Setting up the code environment

$ pip install -r requirements.txt

Arguments

Most of our arguments are the same as those in the MULTI-CONER NER Baseline System.

Note that we add a --gazetteer argument, which specifies the path to the gazetteer data.

    p.add_argument('--train', type=str, help='Path to the train data.', default=None)
    p.add_argument('--test', type=str, help='Path to the test data.', default=None)
    p.add_argument('--dev', type=str, help='Path to the dev data.', default=None)
    p.add_argument('--gazetteer', type=str, help='Path to the gazetteer data.', default=None)

    p.add_argument('--out_dir', type=str, help='Output directory.', default='.')
    p.add_argument('--iob_tagging', type=str, help='IOB tagging scheme', default='wnut')

    p.add_argument('--max_instances', type=int, help='Maximum number of instances', default=-1)
    p.add_argument('--max_length', type=int, help='Maximum number of tokens per instance.', default=128)

    p.add_argument('--encoder_model', type=str, help='Pretrained encoder model to use', default='xlm-roberta-large')
    p.add_argument('--keep_training_model', type=str, help='keep Pretrained encoder model to use', default='')
    p.add_argument('--model', type=str, help='Model path.', default=None)
    p.add_argument('--model_name', type=str, help='Model name.', default=None)
    p.add_argument('--stage', type=str, help='Training stage', default='fit')
    p.add_argument('--prefix', type=str, help='Prefix for storing evaluation files.', default='test')

    p.add_argument('--batch_size', type=int, help='Batch size.', default=128)
    p.add_argument('--gpus', type=int, help='Number of GPUs.', default=1)
    p.add_argument('--epochs', type=int, help='Number of epochs for training.', default=5)
    p.add_argument('--lr', type=float, help='Learning rate', default=1e-5)
    p.add_argument('--dropout', type=float, help='Dropout rate', default=0.1)
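As a rough illustration of how these options behave, the sketch below rebuilds a subset of the parser and parses a hypothetical command line. The helper name build_parser and all file paths are placeholders introduced here, not names taken from the repository.

```python
import argparse


def build_parser():
    # Hypothetical helper: reproduces only a subset of the arguments listed above.
    p = argparse.ArgumentParser(description='Sketch of the argument parser.')
    p.add_argument('--train', type=str, help='Path to the train data.', default=None)
    p.add_argument('--dev', type=str, help='Path to the dev data.', default=None)
    p.add_argument('--gazetteer', type=str, help='Path to the gazetteer data.', default=None)
    p.add_argument('--encoder_model', type=str, help='Pretrained encoder model to use',
                   default='xlm-roberta-large')
    p.add_argument('--batch_size', type=int, help='Batch size.', default=128)
    p.add_argument('--epochs', type=int, help='Number of epochs for training.', default=5)
    p.add_argument('--lr', type=float, help='Learning rate', default=1e-5)
    return p


# Placeholder paths; substitute your own data and gazetteer files.
args = build_parser().parse_args([
    '--train', 'data/train.conll',
    '--dev', 'data/dev.conll',
    '--gazetteer', 'data/gazetteer.txt',
    '--epochs', '10',
])

# Options that are not passed on the command line keep their defaults.
print(args.gazetteer, args.encoder_model, args.epochs, args.lr)
```

Any option omitted from the command line, such as --encoder_model here, falls back to the default shown in the argument list.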

Running

1. Move into the folder of the method you chose

cd AGAN or cd weighted_fusion_crf

Before running any shell script, first edit its arguments to use your own paths and hyper-parameters.

2. Training

Train an xlm-roberta-large-based model. The pretrained XLM-R model is from HuggingFace.

bash run_train.sh

3. Fine-Tuning

Fine-tuning from a pretrained NER model.

bash run_finetune.sh

4. Predicting

Predicting the tags from a pretrained model.

bash run_predict.sh

Reference

MULTI-CONER NER Baseline System

License

The code under this repository is licensed under the Apache 2.0 License.

gain's People

Contributors

mckysse, qijiajun, mjy1111

gain's Issues

Request for Codebase and Further Information on Gazetteer Construction Process

Currently, I am engaged in a similar project focusing on NER. Your work has provided me with a unique perspective and has been instrumental in shaping my understanding of the task at hand. I thoroughly enjoyed reading your paper and found it to be highly informative.

However, I am particularly interested in gaining a deeper understanding of the Gazetteer construction process that you have employed in your project. If possible, could you kindly share the codebase or any similar repositories that you may have used during your research? I am keen to understand how the mapping was executed programmatically.

Additionally, I would greatly appreciate it if you could shed some light on the types of manual adjustments of mapping relationships that you had to perform during the process. Any insights or experiences you could share would be invaluable to my ongoing work.

Request for Gazetteer Construction Source Code for NER Project

Hello, I wanted to reach out after reading your interesting work. It's been really helpful for my own project, especially since I'm working on NER for Vietnamese.

Your paper was a great read! I was wondering if you could kindly share more about how you built the Gazetteer. Do you have any code or resources you used? I'm particularly interested in how you did the mapping part.

Thanks a lot for considering my request. Looking forward to hearing from you!

Gazetteer Construction Details

Hello! First of all, I really appreciate your interesting work and project on the NER task. I am also working on NER for the Bengali language, and your work provides really interesting insight. The paper was a very good read. However, I was wondering whether it would be possible to get some more information on the Gazetteer construction process: in particular a codebase or similar repositories, how the mapping was performed programmatically, and the types of manual adjustments of mapping relationships you had to perform.
Thanks.

list index out of range

Hello, I ran into this problem while running the training code. Could you help me with it? Thank you!!
[Image: screenshot of the error]
