
Concept-aware Training

This repository contains the training and evaluation sources for training in-context few-shot learners to utilize concepts in prediction.

Before reproducing the training, note that we make the CoAT-trained models publicly available. If you simply want to reproduce our results, proceed to the Evaluations section below and pick the model of interest.

Training

The training of the concept-aware model can be reproduced by running the following scripts:

git clone {this_repo}
cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
pip install -r training/requirements.txt
pip install -r evaluation/requirements.txt

cd training
chmod 777 download_teaberac_data.sh
./download_teaberac_data.sh
cd ..

CUDA_VISIBLE_DEVICES=0 python training/train_mt5_teabreac+qa_coat.py

The script intentionally keeps all parameters fixed, but if you need to change something, e.g. due to environment restrictions, feel free to adjust the AdaptationArguments or the evaluations directly in the code.
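
For illustration, below is a minimal sketch of the kind of adjustment meant here. It assumes the training script builds its AdaptationArguments from the Adaptor library's adaptor.utils module (an assumption about the import path); the concrete values are placeholders for your environment, not the settings used in the paper.

from adaptor.utils import AdaptationArguments, StoppingStrategy  # assumed import path

training_arguments = AdaptationArguments(
    output_dir="train_dir_coat",                  # where checkpoints are written
    stopping_strategy=StoppingStrategy.ALL_OBJECTIVES_CONVERGED,
    stopping_patience=5,                          # evaluations without improvement before stopping
    per_device_train_batch_size=4,                # lower this on smaller GPUs
    gradient_accumulation_steps=8,                # raise this to keep the effective batch size
    learning_rate=5e-5,
    max_steps=100_000,
    logging_steps=100,
    eval_steps=1000,
    save_steps=1000,
)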

The training scripts include evaluations on SuperGLUE and various TeaBReaC concepts.

Baseline: Random Demonstrations Selection Training

In the sequence above, replace the Python script path with train_mt5_teabreac+qa_random.py:

CUDA_VISIBLE_DEVICES=0 python training/train_mt5_teabreac+qa_random.py

Evaluations

We make the following pre-trained models from the paper publicly available:

  • Tk-CoAT-1B corresponds to authoranonymous321/mt5_large-teabreac-AQA_CoAT
  • Tk-CoAT-3B corresponds to authoranonymous321/mt5_3B-teabreac-AQA_CoAT
  • Tk-Random-1B corresponds to authoranonymous321/mt5_large-teabreac-AQA_random
  • Tk-Random-3B corresponds to authoranonymous321/mt5_3B-teabreac-AQA_random
  • Tk-Info-3B corresponds to authoranonymous321/mt5_3B-teabreac-AQA_informative
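
All the checkpoints above are standard mT5 sequence-to-sequence models hosted on the HuggingFace Hub, so they can be loaded directly with transformers. Below is a minimal loading and inference sketch; the few-shot prompt template is only illustrative, see the evaluation scripts for the exact format used in the paper.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "authoranonymous321/mt5_large-teabreac-AQA_CoAT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# A few-shot prompt: in-context demonstrations followed by the predicted example.
# The "Input: ... Prediction: ..." template below is illustrative only.
prompt = (
    "Input: The cat sat on the mat. Is an animal mentioned? Prediction: yes "
    "Input: The old oak fell over. Is an animal mentioned? Prediction:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))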

Concept-learning ability evaluation

To extract the concepts from explanations as proposed in the paper and run the Concept-learning evaluation on a selected model, run the sensitivity_evaluator.py script:

cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
pip install -r evaluation/requirements.txt
spacy download en_core_web_sm  # For OpenBookQA concept extraction

CUDA_VISIBLE_DEVICES=0 python evaluation/sensitivity_evaluator.py \
    --model_names_or_paths authoranonymous321/mt5_large-teabreac-AQA_CoAT \
    --bootstrap True \
    --metric ROUGE \
    --tasks glue/mnli,openbookqa/additional,hotpot_qa/fullwiki,worldtree

All resources and concept extractions should be resolved automatically.

If you evaluate using --bootstrap True, collect the stdout into a file and analyse the results using this notebook.

Semantic priors evaluation

To evaluate the models' reliance on their semantic representation of labels, run the semantic_priors_evaluator.py script:

cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
pip install -r evaluation/requirements.txt

CUDA_VISIBLE_DEVICES=0 python evaluation/semantic_priors_evaluator.py \
    --model_names_or_paths authoranonymous321/mt5_large-teabreac-AQA_CoAT \
    --bootstrap True \
    --aggregate_results True \
    --metric ROUGE \
    --tasks axb,boolq,cb,wsc,multirc,rte,wic,axg \
    --firstn 100

With --bootstrap True and --aggregate_results False, the results can be visualized using this notebook. To assess the results directly, use --aggregate_results True instead. To evaluate on the full datasets, set --firstn 0.

End tasks evaluation

To reproduce our evaluation on SuperGLUE, run the following:

cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
CUDA_VISIBLE_DEVICES=0 python evaluation/superglue_evaluator.py \
    --model_names_or_paths authoranonymous321/mt5_large-teabreac-AQA_CoAT,allenai/tk-instruct-large-def-pos \
    --metric ROUGE \
    --tasks axb,boolq,cb,wsc,copa,multirc,rte,wic,record,axg

All resources should be resolved automatically.

Citation

If you use Concept-learning Evaluation in scientific work, please cite this work as follows:

@inproceedings{stefanik2023incontext,
    author = {{{\v{S}}tef{\'a}nik}, Michal and {Kadl{\v{c}}{\'\i}k}, Marek},
    title = {Can In-context Learners Learn a Reasoning Concept from Demonstrations?},
    booktitle = {Proceedings of ACL 2023: Natural Language Reasoning and Structured Explanations (NLRSE)},
    publisher = {ACL},
    numpages = {6},
    year = {2023},
    url = {https://arxiv.org/abs/2212.01692},
}

If you'd like to reference Concept-aware Training, please cite the paper that introduces it:

@article{stefanik2023conceptaware,
    title = {Concept-aware Training Improves In-context Learning Ability of Language Models},
    author = {{{\v{S}}tef{\'a}nik}, Michal and {Kadl{\v{c}}{\'\i}k}, Marek},
    year = {2023},
    eprint = {2305.13775},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL},
    url = {https://arxiv.org/abs/2305.13775},
}
