Linguistic-Self-Attention-for-SRL

This work is based on the work by @strubell and this repo.

Original paper: Linguistically-Informed Self-Attention for Semantic Role Labeling (Strubell et al., EMNLP 2018)

Requirements:

  • python >= 3.6.10

Environment setup:

pip install -r requirements.txt

CoNLL-2012 data setup:

  • Get the CoNLL-2012 dataset.

  • Clone the IESL preprocessing repo.

  • Run bin/pp12.sh in the IESL repo directory.

  • Run bin/prepare_data.sh [conll12 preprocessed files directory]

  • If you want to run the model with English GloVe embeddings, download them:

wget -P embeddings http://nlp.stanford.edu/data/glove.6B.zip
unzip -j embeddings/glove.6B.zip glove.6B.100d.txt -d embeddings

Russian Framebank data setup:

Running:

python src/train.py \
--data [your data path config] \
--config [your global config] \
--save_dir [directory to save models, vocabs and metrics] \
--eval_every [int] \
--save_every [int]
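
For example, a run might look like this (every path and value below is a hypothetical placeholder, not a file shipped with the repo):

# hypothetical example; substitute your own config files and intervals
python src/train.py \
--data config/conll12_data_paths.json \
--config config/global.json \
--save_dir saved_models/conll12 \
--eval_every 500 \
--save_every 500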

The data path config is a .json file of the following form:

{
    "train": ["file1", "file2", ...],
    "dev": ["file1", "file2", ...],
    "test": ["file1", "file2", ...],  # not required
    "transition_stats": "file"  # required if crf/viterbi decoding is used
}
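
For example, following the schema above (all paths are hypothetical placeholders):

{
    "train": ["data/conll12/train.conll"],  # hypothetical paths, for illustration only
    "dev": ["data/conll12/dev.conll"],
    "test": ["data/conll12/test.conll"]
}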

The global config is a .json file of the following form:

{
  "data_configs": ["config_file1", "config_file2", ...],
  "model_configs": ["config_file1", "config_file2", ...],
  "task_configs": ["config_file1", "config_file2", ...],
  "layer_configs": ["config_file1", "config_file2", ...],
  "attention_configs": ["config_file1", "config_file2", ...],
}

Find out more about config files in config/README.md.

Code reading starter pack

  1. model.py contains the main model logic, as usual
  2. Due to Keras limitations for multi-output models, all metrics and losses are computed inside the model. At inference time, losses and metrics are returned empty (not tested yet); see the sketch after this list.
  3. transformer_layer.py adapts the multi-head attention encoder implementation from OpenNMT
  4. output_fns.py contains all output layers. All current model outputs are passed into each layer, and BaseFunction proxies the parameters into the layer-specific format.
  5. attention_fns.py implements outputs that are reused in subsequent transformer layers
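
The sketch below illustrates item 2; it is a minimal, hypothetical example of the pattern (computing the loss and metric inside the model via add_loss/add_metric), not the repo's actual classes or names:

import tensorflow as tf

class SrlHead(tf.keras.layers.Layer):
    """Hypothetical output head that computes its own loss and metric."""

    def __init__(self, num_labels, **kwargs):
        super().__init__(**kwargs)
        self.proj = tf.keras.layers.Dense(num_labels)

    def call(self, hidden_states, labels=None):
        logits = self.proj(hidden_states)
        if labels is not None:  # training/eval: gold labels are fed as model inputs
            loss = tf.reduce_mean(
                tf.nn.sparse_softmax_cross_entropy_with_logits(
                    labels=labels, logits=logits))
            # register loss/metric inside the model instead of in compile()
            self.add_loss(loss)
            self.add_metric(loss, name="srl_loss")
        return logits

With this pattern, compile() needs no per-output loss mapping, and at inference time (no labels) nothing is registered, which matches the "empty losses and metrics" behaviour described above.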

Notes

  • We use the OpenNMT implementation of the multi-head self-attention encoder
  • During training, long sentences (more than 100 tokens) are skipped to save GPU memory
  • Training the English model on a Tesla-1080 takes up to 6 hours.
  • Training the Russian model in Colab takes up to 2 hours.
  • Some output layers currently support only one predicate per sentence.
  • (some kind of inference script for FrameBank might be coming up soon)

