
Concept-aware Training

This repository contains the training and evaluation sources for training in-context few-shot learners to utilize concepts in prediction.

Before reproducing the training, note that we make the CoAT-trained models publicly available. If you simply want to reproduce our results, proceed to the Evaluations section below and pick the model of interest.

Training

The training of the concept-aware model can be reproduced by running the following scripts:

git clone {this_repo}
cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
pip install -r training/requirements.txt
pip install -r evaluation/requirements.txt

cd training
chmod 777 download_teaberac_data.sh
./download_teaberac_data.sh
cd ..

CUDA_VISIBLE_DEVICES=0 python training/train_mt5_teabreac+qa_coat.py

The script intentionally keeps all parameters fixed, but if you need to change something, e.g. due to environment restrictions, feel free to adjust the AdaptationArguments or the evaluations directly in the code.
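
For illustration, below is a minimal sketch of the kind of adjustment meant here. It assumes the training script builds its AdaptationArguments from the Adaptor library's adaptor.utils module (an assumption about the import path); the concrete values are placeholders for your environment, not the settings used in the paper.

from adaptor.utils import AdaptationArguments, StoppingStrategy  # assumed import path

training_arguments = AdaptationArguments(
    output_dir="train_dir_coat",                  # where checkpoints are written
    stopping_strategy=StoppingStrategy.ALL_OBJECTIVES_CONVERGED,
    stopping_patience=5,                          # evaluations without improvement before stopping
    per_device_train_batch_size=4,                # lower this on smaller GPUs
    gradient_accumulation_steps=8,                # raise this to keep the effective batch size
    learning_rate=5e-5,
    max_steps=100_000,
    logging_steps=100,
    eval_steps=1000,
    save_steps=1000,
)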

The training scripts include evaluations on SuperGLUE and various TeaBReaC concepts.

Baseline: Random Demonstrations Selection Training

In the sequence above, replace the Python script path with train_mt5_teabreac+qa_random.py:

CUDA_VISIBLE_DEVICES=0 python training/train_mt5_teabreac+qa_random.py

Evaluations

We make the following pre-trained models from the paper publicly available:

  • Tk-CoAT-1B corresponds to authoranonymous321/mt5_large-teabreac-AQA_CoAT
  • Tk-CoAT-3B corresponds to authoranonymous321/mt5_3B-teabreac-AQA_CoAT
  • Tk-Random-1B corresponds to authoranonymous321/mt5_large-teabreac-AQA_random
  • Tk-Random-3B corresponds to authoranonymous321/mt5_3B-teabreac-AQA_random
  • Tk-Info-3B corresponds to authoranonymous321/mt5_3B-teabreac-AQA_informative
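
All the checkpoints above are standard mT5 sequence-to-sequence models hosted on the HuggingFace Hub, so they can be loaded directly with transformers. Below is a minimal loading and inference sketch; the few-shot prompt template is only illustrative, see the evaluation scripts for the exact format used in the paper.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "authoranonymous321/mt5_large-teabreac-AQA_CoAT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# A few-shot prompt: in-context demonstrations followed by the predicted example.
# The "Input: ... Prediction: ..." template below is illustrative only.
prompt = (
    "Input: The cat sat on the mat. Is an animal mentioned? Prediction: yes "
    "Input: The old oak fell over. Is an animal mentioned? Prediction:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))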

Concept-learning ability evaluation

To extract the concepts from explanations as proposed in the paper and run the Concept-learning evaluation on a selected model, run the sensitivity_evaluator.py script:

cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
pip install -r evaluation/requirements.txt
spacy download en_core_web_sm  # For OpenBookQA concept extraction

CUDA_VISIBLE_DEVICES=0 python evaluation/sensitivity_evaluator.py \
    --model_names_or_paths authoranonymous321/mt5_large-teabreac-AQA_CoAT \
    --bootstrap True \
    --metric ROUGE \
    --tasks glue/mnli,openbookqa/additional,hotpot_qa/fullwiki,worldtree

All resources and concept extractions should be resolved automatically.

If you evaluate using --bootstrap True, collect the stdout into a file and analyse the results using this notebook.

Semantic priors evaluation

To evaluate the models' reliance on their semantic representation of labels, run the semantic_priors_evaluator.py script:

cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
pip install -r evaluation/requirements.txt

CUDA_VISIBLE_DEVICES=0 python evaluation/semantic_priors_evaluator.py \
    --model_names_or_paths authoranonymous321/mt5_large-teabreac-AQA_CoAT \
    --bootstrap True \
    --aggregate_results True \
    --metric ROUGE \
    --tasks axb,boolq,cb,wsc,multirc,rte,wic,axg \
    --firstn 100

With --bootstrap True and --aggregate_results False, the results can be visualized using this notebook. To assess the results directly, use --aggregate_results True instead. To evaluate on the full datasets, set --firstn 0.

End tasks evaluation

To reproduce our evaluation on SuperGLUE, run the following:

cd {this_repo}
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
CUDA_VISIBLE_DEVICES=0 python evaluation/superglue_evaluator.py \
    --model_names_or_paths authoranonymous321/mt5_large-teabreac-AQA_CoAT,allenai/tk-instruct-large-def-pos \
    --metric ROUGE \
    --tasks axb,boolq,cb,wsc,copa,multirc,rte,wic,record,axg

All resources should be resolved automatically.

Citation

If you use Concept-learning Evaluation in scientific work, please cite this work as follows:

@inproceedings{stefanik2023incontext,
    author = {{{\v{S}}tef{\'a}nik}, Michal and {Kadl{\v{c}}{\'\i}k}, Marek},
    title = {Can In-context Learners Learn a Reasoning Concept from Demonstrations?},
    booktitle = {Proceedings of ACL 2023: Natural Language Reasoning and Structured Explanations (NLRSE)},
    publisher = {ACL},
    numpages = {6},
    year = {2023},
    url = {https://arxiv.org/abs/2212.01692},
}

If you'd like to reference Concept-aware Training, please cite the paper that introduces it:

@article{stefanik2023conceptaware,
    title = {Concept-aware Training Improves In-context Learning Ability of Language Models},
    author = {{{\v{S}}tef{\'a}nik}, Michal and {Kadl{\v{c}}{\'\i}k}, Marek},
    year = {2023},
    eprint = {2305.13775},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL},
    url = {https://arxiv.org/abs/2305.13775},
}
