ash-parser

This was originally written for a class project.

ash-parser implements a Chen and Manning (2014) style neural network dependency parser in Python and TensorFlow. Many elements mimic SyntaxNet.
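To make the "Chen and Manning (2014) style" description concrete, here is a minimal pure-Python sketch of how such a network scores transitions: embeddings from each feature group are concatenated, passed through one hidden layer, and turned into a probability per transition with a softmax. This is an illustration only, not code from ash-parser. All names (`embed`, `score_transitions`, the toy tables and weights) are hypothetical, and the cube activation follows the 2014 paper; the activation this project actually uses is not stated here.

```python
import math

def embed(feature_ids, table):
    """Look up each feature id and concatenate the embedding vectors."""
    return [value for fid in feature_ids for value in table[fid]]

def affine(weights, bias, x):
    """Compute weights @ x + bias with plain lists."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def score_transitions(groups, hidden_w, hidden_b, out_w, out_b,
                      activation=lambda z: z ** 3):
    """groups: list of (feature_ids, embedding_table), one per feature group.

    The default cube activation is an assumption taken from the paper.
    """
    x = [v for ids, table in groups for v in embed(ids, table)]
    hidden = [activation(z) for z in affine(hidden_w, hidden_b, x)]
    return softmax(affine(out_w, out_b, hidden))

# Toy example: two word features (2-dim embeddings), one tag feature (1-dim).
word_table = {0: [0.1, 0.2], 1: [0.3, -0.1]}
tag_table = {0: [0.5], 1: [-0.5]}
groups = [([0, 1], word_table), ([1], tag_table)]
hidden_w = [[0.2, -0.1, 0.4, 0.0, 0.3], [0.1, 0.1, -0.2, 0.5, -0.3]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]  # three candidate transitions
out_b = [0.0, 0.0, 0.0]
probs = score_transitions(groups, hidden_w, hidden_b, out_w, out_b)
```

At parse time a greedy parser would pick the highest-probability transition that is legal in the current configuration.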

I analyze SyntaxNet's architecture here.

A parsing-config file must be created in the model directory before execution.

Run training_test.sh for an example of how to train a model. Evaluation during training also works, but there is no API yet for tagging new input or serving a model.

External dependencies

  • NumPy
  • TensorFlow 1.0

Similarities to SyntaxNet

  • Same embedding system (configurable per-feature group deep embedding)
  • Same optimizer (Momentum with exponential moving average)
  • Lexicon builder is identical for words, tags, and labels
  • Map files output by SyntaxNet and AshParser should be identical
  • Evaluation metric is identical (the score SyntaxNet reports corresponds to AshParser's UAS)
  • Feature system is almost identical (except perhaps in some very rare corner cases)
  • Because the architecture is the same, accuracy should be very close to greedy SyntaxNet
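For readers unfamiliar with the attachment scores, the sketch below shows how they are conventionally computed. UAS (Unlabeled Attachment Score) is the percentage of tokens whose predicted head is correct; LAS (Labeled Attachment Score) additionally requires the dependency label to match. The function name and data layout are hypothetical, and every token is counted here; whether ash-parser excludes punctuation or other tokens from scoring is not stated in this README.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as percentages.

    gold, predicted: lists of (head_index, label) tuples, one per token.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    # Head matches count toward UAS; head-and-label matches count toward LAS.
    head_hits = sum(1 for (gh, _), (ph, _) in zip(gold, predicted) if gh == ph)
    full_hits = sum(1 for g, p in zip(gold, predicted) if g == p)
    total = len(gold)
    return 100.0 * head_hits / total, 100.0 * full_hits / total

# Toy sentence with four tokens; head 0 is the artificial root.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "advmod")]
predicted = [(2, "nsubj"), (0, "root"), (2, "iobj"), (3, "advmod")]
uas, las = attachment_scores(gold, predicted)  # uas == 75.0, las == 50.0
```

In the toy sentence three of four heads are right, but one of those three carries the wrong label, so LAS is lower than UAS.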

Differences from SyntaxNet

  • The Arc-Eager transition system is also supported
  • No context file with redundant or boilerplate information is needed
  • GPU support: the training phase can complete in minutes
  • Pure Python 3 implementation; no need for Bazel
  • LAS (Labeled Attachment Score) is printed during evaluation
  • Feature bags are precalculated and cached, which makes it easier to train multiple models with the same token features but different hyperparameters
  • No support for structured (beam) parsing, which should cost roughly 1-2% in accuracy. An LSTM or something simpler and faster is being considered instead for the future
  • Feature groups are created automatically from the tag, word, and label features rather than by grouping features with semicolons in a context file
  • Only the transition parser is supported, not the POS tagger, morphological analyzer, or tokenizer
  • ngrams, punctuation_amount, morph tags, and other features are not yet implemented
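As background for the Arc-Eager transition system mentioned in the list above, here is a small unlabeled sketch of its four transitions together with a textbook static oracle for projective trees. It is not taken from ash-parser: the function names, the list-based configuration, and the oracle's REDUCE condition are assumptions made for illustration, and arc labels are omitted for brevity.

```python
SHIFT, LEFT_ARC, RIGHT_ARC, REDUCE = "SHIFT", "LEFT-ARC", "RIGHT-ARC", "REDUCE"

def apply_transition(stack, buffer, heads, action):
    """Apply one arc-eager transition in place; token 0 is the artificial ROOT."""
    if action == SHIFT:
        stack.append(buffer.pop(0))
    elif action == LEFT_ARC:
        # Buffer front becomes the head of the stack top, which is popped.
        assert stack[-1] != 0 and stack[-1] not in heads
        heads[stack.pop()] = buffer[0]
    elif action == RIGHT_ARC:
        # Stack top becomes the head of the buffer front, which is pushed.
        dependent = buffer.pop(0)
        heads[dependent] = stack[-1]
        stack.append(dependent)
    elif action == REDUCE:
        # Only a token that already has a head may be popped.
        assert stack[-1] in heads
        stack.pop()
    else:
        raise ValueError("unknown action: %r" % (action,))

def static_oracle(stack, buffer, heads, gold_heads):
    """Choose the next transition for a projective gold tree (assumed oracle)."""
    s, b = stack[-1], buffer[0]
    if s != 0 and gold_heads[s] == b:
        return LEFT_ARC
    if gold_heads[b] == s:
        return RIGHT_ARC
    if s in heads and all(gold_heads[w] != s for w in buffer):
        return REDUCE
    return SHIFT

def derive(gold_heads):
    """Replay the oracle; gold_heads[i] is the head of token i, index 0 unused."""
    stack, buffer = [0], list(range(1, len(gold_heads)))
    heads, actions = {}, []
    while buffer:
        action = static_oracle(stack, buffer, heads, gold_heads)
        apply_transition(stack, buffer, heads, action)
        actions.append(action)
    return actions, heads

# Toy sentence with four tokens: token 2 is the root, the rest attach to it.
actions, heads = derive([None, 2, 0, 2, 2])
```

Unlike arc-standard, arc-eager attaches a right dependent as soon as its head is on the stack, which is why a separate REDUCE transition is needed to pop tokens that already have a head.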

Contributors

  • xtknight
