This work is based on the work by @strubell and the corresponding repo.
- python >= 3.6.10
- `pip install -r requirements.txt`
- Get the CoNLL-2012 dataset.
- Clone this repo.
- Run `bin/pp12.sh` in the IESL repo directory.
- Run `bin/prepare_data.sh [conll12 preprocessed files directory]`
- If you want to run the model on English GloVe embeddings, you may download them:
  ```
  wget -P embeddings http://nlp.stanford.edu/data/glove.6B.zip
  unzip -j embeddings/glove.6B.zip glove.6B.100d.txt -d embeddings
  ```
- Follow the preprocessing from isanlp_srl_framebank.
- Put the generated files `features.pckl` and `ling_data.pckl` into the `data` directory.
- Run `bin/framebank_preprocess.py`.
- If you want to run the model on FastText embeddings, you may download them at https://fasttext.cc/docs/en/crawl-vectors.html (see the example below).
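For example, the Russian crawl vectors can be fetched directly; the exact file name should be checked on the page above, `cc.ru.300.vec.gz` is assumed here:

```
wget -P embeddings https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.ru.300.vec.gz
gunzip embeddings/cc.ru.300.vec.gz
```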
Train a model with:

```
python src/train.py \
  --data [your data path config] \
  --config [your global config] \
  --save_dir [directory to save models, vocabs and metrics] \
  --eval_every [int] \
  --save_every [int]
```
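A concrete invocation might look like this (all paths below are hypothetical; substitute your own configs and output directory):

```
python src/train.py \
  --data config/data_paths.json \
  --config config/global.json \
  --save_dir runs/conll12 \
  --eval_every 500 \
  --save_every 500
```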
The data path config is a JSON file of the following form:

```
{
  "train": ["file1", "file2", ...],
  "dev": ["file1", "file2", ...],
  "test": ["file1", "file2", ...],    # not required
  "transition_stats": "file"          # required if CRF/Viterbi decoding is used
}
```
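For example, a minimal data config could be (the file names here are hypothetical):

```
{
  "train": ["data/conll2012/train.tsv"],
  "dev": ["data/conll2012/dev.tsv"]
}
```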
Global config:

```
{
  "data_configs": ["config_file1", "config_file2", ...],
  "model_configs": ["config_file1", "config_file2", ...],
  "task_configs": ["config_file1", "config_file2", ...],
  "layer_configs": ["config_file1", "config_file2", ...],
  "attention_configs": ["config_file1", "config_file2", ...]
}
```
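For instance, an assembled global config might look like this (the file names here are hypothetical; see config/README.md, referenced below, for the real options):

```
{
  "data_configs": ["config/data_configs/conll2012.json"],
  "model_configs": ["config/model_configs/glove_basic.json"],
  "task_configs": ["config/task_configs/srl.json"],
  "layer_configs": ["config/layer_configs/lisa_layers.json"],
  "attention_configs": ["config/attention_configs/parse_attention.json"]
}
```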
Find out more about config files in config/README.md.
- model.py contains the main model logic.
- Due to Keras limitations for multi-output models, all metrics and losses are computed inside the model; at inference time the loss and metrics are returned empty (not tested yet). A sketch of this pattern is given after this list.
- transformer_layer.py adapts the multi-head attention encoder implementation from OpenNMT.
- output_fns.py contains all output layers. All current model outputs are passed into the layer, and BaseFunction proxies the parameters into the layer-specific format (see the schematic sketch after this list).
- attention_fns.py implements outputs that are fed into subsequent transformer layers.
- We use the OpenNMT implementation of the multi-head self-attention encoder (a reference sketch of the mechanism is given after this list).
- During training, long sentences (more than 100 tokens) are skipped to save GPU memory (see the filtering sketch after this list).
- Training an English model on a Tesla-1080 takes up to 6 hours.
- Training a Russian model in Colab takes up to 2 hours.
- Some output layers currently support only one predicate per sentence.
- (an inference script for FrameBank may be added soon)
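The "losses and metrics inside the model" pattern mentioned above, as a minimal sketch. This illustrates the generic tf.keras mechanism (assuming TF 2.x) with hypothetical layer and tensor names, not the repo's actual code:

```python
import tensorflow as tf

class SrlHead(tf.keras.layers.Layer):
    """Hypothetical output head that registers its own loss and metric,
    so model.fit() needs no externally attached losses."""

    def call(self, inputs):
        logits, labels = inputs                      # labels enter as an extra model input
        labels = tf.cast(labels, tf.int32)
        loss = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(
                labels=labels, logits=logits))
        self.add_loss(loss)                          # collected automatically during training
        predictions = tf.argmax(logits, axis=-1, output_type=tf.int32)
        accuracy = tf.reduce_mean(
            tf.cast(tf.equal(predictions, labels), tf.float32))
        self.add_metric(accuracy, name="srl_accuracy")  # reported alongside the loss
        return tf.nn.softmax(logits)                 # only predictions leave the model
```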
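The output-function dispatch described for output_fns.py, shown schematically. All class and argument names here are hypothetical and only illustrate the proxying idea:

```python
class BaseOutputFunction:
    """Subclasses declare which of the model's intermediate outputs they need;
    the base class proxies the full output dict into that signature."""

    required_inputs = []

    def __call__(self, model_outputs):
        # model_outputs: dict of every tensor the model has produced so far
        kwargs = {name: model_outputs[name] for name in self.required_inputs}
        return self.apply(**kwargs)

    def apply(self, **kwargs):
        raise NotImplementedError


class SoftmaxClassifier(BaseOutputFunction):
    required_inputs = ["hidden_states", "mask"]

    def apply(self, hidden_states, mask):
        ...  # project, mask, and normalize
```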
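For reference, the core computation of one multi-head self-attention encoder block, sketched with standard tf.keras layers (TF >= 2.4). The OpenNMT code used in this repo differs in details such as masking, dropout, and position encodings:

```python
import tensorflow as tf

def encoder_block(x, num_heads=8, model_dim=512, ffn_dim=2048):
    """One standard transformer encoder block; x has shape (batch, seq, model_dim)."""
    attn = tf.keras.layers.MultiHeadAttention(
        num_heads=num_heads, key_dim=model_dim // num_heads)(x, x)  # self-attention
    x = tf.keras.layers.LayerNormalization()(x + attn)              # residual + norm
    ffn = tf.keras.layers.Dense(ffn_dim, activation="relu")(x)      # position-wise FFN
    ffn = tf.keras.layers.Dense(model_dim)(ffn)
    return tf.keras.layers.LayerNormalization()(x + ffn)            # residual + norm
```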
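The sentence-length filtering amounts to something like the sketch below; the actual threshold and data representation live in the training code:

```python
MAX_TRAIN_LEN = 100  # sentences longer than this are skipped during training

def filter_long_sentences(sentences, max_len=MAX_TRAIN_LEN):
    """Keep only sentences with at most max_len tokens (training only)."""
    return [sent for sent in sentences if len(sent) <= max_len]
```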