Linguistic-Self-Attention-for-SRL

This work is based on the work by @strubell and this repo.

Original paper: Linguistically-Informed Self-Attention for Semantic Role Labeling (Strubell et al., EMNLP 2018)

Requirements:

  • python >= 3.6.10

Environment setup:

pip install -r requirements.txt

CoNLL-2012 data setup:

  • Get the CoNLL-2012 dataset.

  • Clone the IESL preprocessing repo.

  • Run bin/pp12.sh in the IESL repo directory.

  • Run bin/prepare_data.sh [conll12 preprocessed files directory]

  • If you want to run the model with English GloVe embeddings, download them:

wget -P embeddings http://nlp.stanford.edu/data/glove.6B.zip
unzip -j embeddings/glove.6B.zip glove.6B.100d.txt -d embeddings

Russian Framebank data setup:

Running:

python src/train.py \
--data [your data path config] \
--config [your global config] \
--save_dir [directory to save models, vocabs and metrics] \
--eval_every [int] \
--save_every [int]
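
For example, a run might look like this (every path and value below is a hypothetical placeholder, not a file shipped with the repo):

# hypothetical example; substitute your own config files and intervals
python src/train.py \
--data config/conll12_data_paths.json \
--config config/global.json \
--save_dir saved_models/conll12 \
--eval_every 500 \
--save_every 500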

The data path config is a .json file of the following form:

{
    "train": ["file1", "file2", ...],
    "dev": ["file1", "file2", ...],
    "test": ["file1", "file2", ...],  # not required
    "transition_stats": "file"  # required if crf/viterbi decoding is used
}
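
For example, following the schema above (all paths are hypothetical placeholders):

{
    "train": ["data/conll12/train.conll"],  # hypothetical paths, for illustration only
    "dev": ["data/conll12/dev.conll"],
    "test": ["data/conll12/test.conll"]
}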

The global config is a .json file of the following form:

{
  "data_configs": ["config_file1", "config_file2", ...],
  "model_configs": ["config_file1", "config_file2", ...],
  "task_configs": ["config_file1", "config_file2", ...],
  "layer_configs": ["config_file1", "config_file2", ...],
  "attention_configs": ["config_file1", "config_file2", ...],
}

Find out more about config files in config/README.md.

Code reading starter pack

  1. model.py contains the main model logic, as usual
  2. Due to Keras limitations for multi-output models, all metrics and losses are computed inside the model. At inference time, losses and metrics are returned empty (not tested yet); see the sketch after this list.
  3. transformer_layer.py adapts the multi-head attention encoder implementation from OpenNMT
  4. output_fns.py contains all output layers. All current model outputs are passed into each layer, and BaseFunction proxies the parameters into the layer-specific format.
  5. attention_fns.py implements outputs that are reused in subsequent transformer layers
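
The sketch below illustrates item 2; it is a minimal, hypothetical example of the pattern (computing the loss and metric inside the model via add_loss/add_metric), not the repo's actual classes or names:

import tensorflow as tf

class SrlHead(tf.keras.layers.Layer):
    """Hypothetical output head that computes its own loss and metric."""

    def __init__(self, num_labels, **kwargs):
        super().__init__(**kwargs)
        self.proj = tf.keras.layers.Dense(num_labels)

    def call(self, hidden_states, labels=None):
        logits = self.proj(hidden_states)
        if labels is not None:  # training/eval: gold labels are fed as model inputs
            loss = tf.reduce_mean(
                tf.nn.sparse_softmax_cross_entropy_with_logits(
                    labels=labels, logits=logits))
            # register loss/metric inside the model instead of in compile()
            self.add_loss(loss)
            self.add_metric(loss, name="srl_loss")
        return logits

With this pattern, compile() needs no per-output loss mapping, and at inference time (no labels) nothing is registered, which matches the "empty losses and metrics" behaviour described above.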

Notes

  • We use the OpenNMT implementation of the multi-head self-attention encoder
  • During training, long sentences (more than 100 tokens) are skipped to save GPU memory
  • Training the English model on a Tesla-1080 takes up to 6 hours.
  • Training the Russian model in Colab takes up to 2 hours.
  • Some output layers currently support only one predicate per sentence.
  • (some kind of inference script for FrameBank might be coming up soon)

