syntaxnet_wrapper's Introduction

A Python Wrapper for Google SyntaxNet

Installation

Prerequisites

Install OpenJDK8.

add-apt-repository -y ppa:openjdk-r/ppa
apt-get -y update
apt-get -y install openjdk-8-jdk

Install bazel and include bazel in $PATH.

Note: Only bazel 0.4.3 is known to work; bazel 0.4.4 may cause errors.

wget https://github.com/bazelbuild/bazel/releases/download/0.4.3/bazel-0.4.3-installer-linux-x86_64.sh
chmod +x bazel-0.4.3-installer-linux-x86_64.sh
./bazel-0.4.3-installer-linux-x86_64.sh --user
rm bazel-0.4.3-installer-linux-x86_64.sh
export PATH="$PATH:$HOME/bin"
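Because only one bazel release is known to work, it can help to confirm which version is on $PATH before building. The following is a minimal sketch, not part of the wrapper: it assumes `bazel version` prints a `Build label: <version>` line, and the helper names are hypothetical.

```python
import re
import subprocess

REQUIRED_BAZEL = "0.4.3"  # the version the note above says is known to work


def parse_build_label(version_output):
    """Extract the version from the 'Build label:' line of `bazel version` output."""
    match = re.search(r"^Build label:\s*(\S+)", version_output, re.MULTILINE)
    return match.group(1) if match else None


def bazel_is_supported(version_output, required=REQUIRED_BAZEL):
    """Return True only if the reported build label equals the required version."""
    return parse_build_label(version_output) == required


def installed_bazel_output():
    """Run `bazel version` and return its output (requires bazel on $PATH)."""
    return subprocess.check_output(["bazel", "version"]).decode("utf-8")
```

Calling `bazel_is_supported(installed_bazel_output())` before starting the installation gives an early warning if a different bazel release is picked up.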

Install system package dependencies.

apt-get -y install swig unzip

Install Python packages

Note: The current version of syntaxnet must be used with tensorflow r1.0.

pip install tensorflow protobuf asciitree mock

Start Installing

pip install git+ssh://git@github.com/livingbio/syntaxnet_wrapper.git#egg=syntaxnet_wrapper

If installation fails...

Run test.sh. You should see the following output:

1       Bob     _       PROPN   NNP     Number=Sing|fPOS=PROPN++NNP     2       nsubj   _       _
2       brought _       VERB    VBD     Mood=Ind|Tense=Past|VerbForm=Fin|fPOS=VERB++VBD 0       ROOT    _       _
3       the     _       DET     DT      Definite=Def|PronType=Art|fPOS=DET++DT  4       det     _       _
4       pizza   _       NOUN    NN      Number=Sing|fPOS=NOUN++NN       2       dobj    _       _
5       to      _       ADP     IN      fPOS=ADP++IN    6       case    _       _
6       Alice.  _       PROPN   NNP     Number=Sing|fPOS=PROPN++NNP     2       nmod    _       _

1       球      _       PROPN   NNP     fPOS=PROPN++NNP 4       nsubj   _       _
2       從      _       ADP     IN      fPOS=ADP++IN    3       case    _       _
3       天上    _       NOUN    NN      fPOS=NOUN++NN   4       nmod    _       _
4       掉      _       VERB    VV      fPOS=VERB++VV   0       ROOT    _       _
5       下來    _       VERB    VV      fPOS=VERB++VV   4       mark    _       _

球 從天 上 掉 下 來

If the output is correct, the problem is in the wrapper. If the output is wrong, the compilation of syntaxnet may have failed.

Usage

from syntaxnet_wrapper import tagger, parser

print tagger['en'].query('this is a good day', returnRaw=True)
# 1       this    _       DET     DT      _       0       _       _       _
# 2       is      _       VERB    VBZ     _       0       _       _       _
# 3       a       _       DET     DT      _       0       _       _       _
# 4       good    _       ADJ     JJ      _       0       _       _       _
# 5       day     _       NOUN    NN      _       0       _       _       _
tagger['en'].query('this is a good day')  # by default, returns the split text

print parser['en'].query('Alice drove down the street in her car', returnRaw=True)
# 1       Alice   _       NOUN    NNP     _       2       nsubj   _       _
# 2       drove   _       VERB    VBD     _       0       ROOT    _       _
# 3       down    _       ADP     IN      _       2       prep    _       _
# 4       the     _       DET     DT      _       5       det     _       _
# 5       street  _       NOUN    NN      _       3       pobj    _       _
# 6       in      _       ADP     IN      _       2       prep    _       _
# 7       her     _       PRON    PRP$    _       8       poss    _       _
# 8       car     _       NOUN    NN      _       6       pobj    _       _

# use Chinese model
print tagger['zh'].query(u'今天 天氣 很 好', returnRaw=True)
# 1       今天    _       NOUN    NN      fPOS=NOUN++NN   0       _       _       _
# 2       天氣    _       NOUN    NN      fPOS=NOUN++NN   0       _       _       _
# 3       很      _       ADV     RB      fPOS=ADV++RB    0       _       _       _
# 4       好      _       ADJ     JJ      fPOS=ADJ++JJ    0       _       _       _

print parser['zh'].query(u'今天 天氣 很 好', returnRaw=True)
# 1       今天    _       NOUN    NN      fPOS=NOUN++NN   4       nmod:tmod       _       _
# 2       天氣    _       NOUN    NN      fPOS=NOUN++NN   4       nsubj   _       _
# 3       很      _       ADV     RB      fPOS=ADV++RB    4       advmod  _       _
# 4       好      _       ADJ     JJ      fPOS=ADJ++JJ    0       ROOT    _       _
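The raw output above is a ten-column CoNLL table, so it is often convenient to turn it into Python objects. The sketch below is one possible way to do that and is not part of the wrapper: `Token`, `parse_conll`, `root_of`, and `dependents` are hypothetical names, and it assumes the columns are whitespace-separated and that no field contains a space.

```python
from collections import namedtuple

# Column names follow the usual CoNLL layout; the names are this sketch's choice.
Token = namedtuple("Token", "id form lemma upos xpos feats head deprel deps misc")


def parse_conll(raw):
    """Split raw CoNLL text into sentences, each a list of Token tuples."""
    sentences, current = [], []
    for line in raw.splitlines():
        line = line.strip()
        if not line:  # a blank line ends the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        cols = line.split()
        if len(cols) != 10:
            raise ValueError("expected 10 columns, got %d: %r" % (len(cols), line))
        cols[0], cols[6] = int(cols[0]), int(cols[6])  # token id and head index
        current.append(Token(*cols))
    if current:
        sentences.append(current)
    return sentences


def root_of(sentence):
    """Return the token whose head is 0 (the ROOT), or None for tagger-only output."""
    return next((t for t in sentence if t.head == 0 and t.deprel == "ROOT"), None)


def dependents(sentence, head_id):
    """Return the tokens directly attached to the token with the given id."""
    return [t for t in sentence if t.head == head_id]


SAMPLE = """\
1       Alice   _       NOUN    NNP     _       2       nsubj   _       _
2       drove   _       VERB    VBD     _       0       ROOT    _       _
3       down    _       ADP     IN      _       2       prep    _       _
"""

tokens = parse_conll(SAMPLE)[0]
print(root_of(tokens).form)
print([t.form for t in dependents(tokens, 2)])
```

With parser output, `root_of` gives the main verb and `dependents` walks one level of the tree; with tagger output every head is 0 and the relation is `_`, so `root_of` returns None.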

Language Selection

The default model is 'English-Parsey', which Google announced in May 2016. The other models, including 'English', are trained on Universal Dependencies treebanks and were announced by Google in August 2016.

from syntaxnet_wrapper import language_code_to_model_name
language_code_to_model_name
# {'ar': 'Arabic',
#  'bg': 'Bulgarian',
#  'ca': 'Catalan',
#  'cs': 'Czech',
#  'da': 'Danish',
#  'de': 'German',
#  'el': 'Greek',
#  'en': 'English-Parsey',
#  'en-uni': 'English',
#  'es': 'Spanish',
#  'et': 'Estonian',
#  'eu': 'Basque',
#  'fa': 'Persian',
#  'fi': 'Finnish',
#  'fr': 'French',
#  'ga': 'Irish',
#  'gl': 'Galician',
#  'hi': 'Hindi',
#  'hr': 'Croatian',
#  'hu': 'Hungarian',
#  'id': 'Indonesian',
#  'it': 'Italian',
#  'iw': 'Hebrew',
#  'kk': 'Kazakh',
#  'la': 'Latin',
#  'lv': 'Latvian',
#  'nl': 'Dutch',
#  'no': 'Norwegian',
#  'pl': 'Polish',
#  'pt': 'Portuguese',
#  'ro': 'Romanian',
#  'ru': 'Russian',
#  'sl': 'Slovenian',
#  'sv': 'Swedish',
#  'ta': 'Tamil',
#  'tr': 'Turkish',
#  'zh': 'Chinese',
#  'zh-cn': 'Chinese',
#  'zh-tw': 'Chinese'}
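The mapping above can be used to check a language code before querying. The sketch below is illustrative only: it copies a few entries of the table into a local dict, and `resolve_model` with its fallback to the base language is a hypothetical helper, not something the wrapper is known to provide.

```python
# A small excerpt of the table above, copied so the sketch runs on its own.
MODEL_NAMES = {
    "en": "English-Parsey",
    "en-uni": "English",
    "fr": "French",
    "zh": "Chinese",
    "zh-cn": "Chinese",
    "zh-tw": "Chinese",
}


def resolve_model(code, table=MODEL_NAMES):
    """Map a language code to a model name, tolerating case and '_' separators."""
    key = code.strip().lower().replace("_", "-")
    if key in table:
        return table[key]
    base = key.split("-")[0]  # assumed fallback: 'fr-ca' -> 'fr'
    if base in table:
        return table[base]
    raise KeyError("no model for language code %r" % code)


print(resolve_model("en"))
print(resolve_model("en-uni"))
print(resolve_model("zh_TW"))
```

Note that 'en' selects the Parsey model while 'en-uni' selects the Universal Dependencies model, so the two produce different tag and relation sets for the same sentence.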

syntaxnet_wrapper's Issues

Licence

Thank you very much for this library!

Have you considered making this project open source by putting it under an open source licence? It would ease its reuse.

Error: referenced before assignment

from syntaxnet_wrapper import tagger, parser

print tagger['en'].query('this is a good day', returnRaw=True)

conll_text referenced before assignment

This is probably a naming error; fixing it directly.

research folder

It appears to me that since this was written, syntaxnet has a new "research" subfolder and this library needs to be modified to reflect that as the file paths in the code are no longer correct.

fatal: Could not read from remote repository.

pip install git+ssh://git@github.com/livingbio/syntaxnet_wrapper.git#egg=syntaxnet_wrapper
Collecting syntaxnet_wrapper from git+ssh://git@github.com/livingbio/syntaxnet_wrapper.git#egg=syntaxnet_wrapper
Cloning ssh://git@github.com/livingbio/syntaxnet_wrapper.git to /tmp/pip-build-lXMCyv/syntaxnet-wrapper
Warning: Permanently added the RSA host key for IP address '192.30.253.113' to the list of known hosts.
Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
Command "git clone -q ssh://git@github.com/livingbio/syntaxnet_wrapper.git /tmp/pip-build-lXMCyv/syntaxnet-wrapper" failed with error code 128 in None
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
florin@florin-MachineLearning:~$

Can't install

When I try to install it, I get this error:
fatal: Could not read from remote repository.

Parser and tagger not working

Installing collected packages: virtualenv, syntaxnet-wrapper
  Running setup.py install for syntaxnet-wrapper ... done
Successfully installed syntaxnet-wrapper-0.4.1 virtualenv-15.1.0
root@b9190d16cf5f:/# python
Python 2.7.12+ (default, Sep 17 2016, 12:08:02)
[GCC 6.2.0 20160914] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from syntaxnet_wrapper import tagger, parser
>>> print tagger['en'].query('this is a good day', returnRaw=True)
None
>>> print parser['en'].query('Alice drove down the street in her car', returnRaw=True)
None
>>>

Python usage returns None

I have tried re-installing several times, but every time, test.sh gives expected results, but python usage returns None.

In the log, I am seeing compilation WARNINGs, otherwise everything is successful. It has executed 25 tests and all have passed.

Python version: 2.7.12
Platform: Linux-4.4.0-83-generic-x86_64-with-Ubuntu-16.04-xenial
Bazel version: 0.4.3
Tensorflow version: 1.0.0
Installation type: Non-Docker (Inside a virtual python environment; Global environment doesn't have Syntaxnet, Tensorflow, etc.)

Breakdown of the 661 warnings (message, count):
warning: ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings] 215
warning: comparison between signed and unsigned integer expressions [-Wsign-compare] 173
defined but not used [-Wunused-function] 3
defined but not used [-Wunused-variable] 115
may be used uninitialized in this function [-Wmaybe-uninitialized] 150
warning: control reaches end of non-void function [-Wreturn-type] 1
warning: format '%lld' expects argument of type 'long long int', but argument 2 has type 'google::protobuf::int64 {aka long int}' [-Wformat=] 3
warning: unrecognized command line option '-Wno-self-assign' 1

syntaxnet process process

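Currently the wrapper talks to syntaxnet through a subprocess for all I/O.
This causes some problems at the moment, including the process being started and stopped repeatedly.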

Loading of neural network/model

Does it load the model (the Parsey neural net) on every query, or does it keep the model in memory to reduce query time?

Returns None

Hi,
After running print tagger['en'].query('this is a good day', returnRaw=True),
the output is None.
Why?

Change isReady

Fix the isReady logic.
Readiness is currently determined from the tensorflow log, which is risky.
Change it so that a head prefix is emitted once loading has completed.

## input content:
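The issues above describe a subprocess that does the I/O, a model that should ideally stay loaded between queries, and a readiness check based on a marker line rather than on log output. The sketch below shows that general pattern with a stand-in child process; it is not the wrapper's implementation. `READY_MARKER`, `start_worker`, `query`, and `stop` are hypothetical, and the child merely upper-cases its input instead of running syntaxnet.

```python
import subprocess
import sys

READY_MARKER = "## ready"  # hypothetical marker; the wrapper's real prefix may differ

# Stand-in worker: announces readiness once, then answers one line per input line.
CHILD_CODE = r"""
import sys
sys.stdout.write(sys.argv[1] + "\n")   # emitted after the (pretend) model load
sys.stdout.flush()
for line in sys.stdin:
    sys.stdout.write(line.upper())
    sys.stdout.flush()
"""


def start_worker():
    """Start the child once and block until it prints the readiness marker."""
    proc = subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD_CODE, READY_MARKER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    for line in iter(proc.stdout.readline, ""):
        if line.strip() == READY_MARKER:
            return proc
    raise RuntimeError("worker exited before signalling readiness")


def query(proc, text):
    """Send one newline-terminated line and read back one line of output."""
    proc.stdin.write(text.rstrip("\n") + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")


def stop(proc):
    """Close the pipes and wait for the child to exit; returns its exit code."""
    proc.stdin.close()
    code = proc.wait()
    proc.stdout.close()
    return code


worker = start_worker()
print(query(worker, "this is a good day"))
print(query(worker, "second query, same process"))
stop(worker)
```

Because the child is started once and reused, any expensive loading happens a single time, and waiting for an explicit marker avoids depending on the wording of log messages.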

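parser fails

After cross-checking, the cause is that the input has to end with a newline.
Or, more precisely, every line has to end with a newline.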
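A caller can guard against this by normalising the text before it is sent. The helper below is a minimal sketch with a hypothetical name; dropping blank lines and trimming surrounding whitespace are assumptions of this sketch, not documented behaviour of the wrapper.

```python
def normalize_input(text):
    """Return text in which every non-empty line ends with exactly one newline."""
    lines = (line.strip() for line in text.splitlines())
    return "".join(line + "\n" for line in lines if line)


print(repr(normalize_input("this is a good day")))
print(repr(normalize_input("first sentence\r\nsecond sentence")))
```

The result can then be passed on unchanged, since a missing final newline and Windows-style line endings have both been taken care of.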
