
SLING - A natural language frame semantics parser

The SLING Project has moved

The SLING project has moved to https://github.com/ringgaard/sling.

Please refer to the above repo for the latest developments on SLING.

Background

The aim of the SLING project is to learn to read and understand Wikipedia articles in many languages for the purpose of knowledge base completion, e.g. adding facts mentioned in Wikipedia (and other sources) to the Wikidata knowledge base. We use frame semantics as a common representation for both knowledge representation and document annotation. The SLING parser can be trained to produce frame semantic representations of text directly without any explicit intervening linguistic representation.

The SLING project is still a work in progress. We do not yet have a full system that can extract facts from arbitrary text, but we have built a number of the subsystems needed for such a system. The SLING frame store is our basic framework for building and manipulating frame semantic graph structures. The Wiki flow pipeline can take a raw dump of Wikidata and convert this into one big frame graph. This can be loaded into memory so we can do fast graph traversal for inference and reasoning over the knowledge base. The Wiki flow pipeline can also take raw Wikipedia dumps and convert these into a set of documents with structured annotations extracted from the Wiki markup. This also produces phrase tables that are used for mapping names to entities. There is a SLING Python API for accessing all this information and we also have a bot for uploading extracted facts to Wikidata.
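The frame-graph idea above can be sketched in plain Python. This is an illustration only, not the SLING API; the `/toy/` names and helper functions are invented for the example:

```python
# Illustration only (not the SLING API): a frame is a bag of slots whose
# values are literals or other frames, so a knowledge base becomes a graph
# that supports fast traversal. The /toy/ names below are invented.

def make_frame(**slots):
    return dict(slots)

acme = make_frame(isa="/toy/organization", name="ACME")
john = make_frame(isa="/toy/person", name="John", employer=acme)

def traverse(frame, path):
    """Follow a chain of slot names through the frame graph."""
    for slot in path:
        frame = frame[slot]
    return frame

# Hop from the person frame to the employer's name.
print(traverse(john, ["employer", "name"]))  # → ACME
```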

The SLING Parser

The SLING parser is used for annotating text with frame semantic annotations. It is a general transition-based frame semantic parser using bi-directional LSTMs for input encoding and a Transition Based Recurrent Unit (TBRU) for output decoding. It is a jointly trained model that uses only the text tokens as input, and the transition system has been designed to output frame graphs directly without any intervening symbolic representation.

SLING neural network architecture.

The SLING framework includes an efficient and scalable frame store implementation as well as a neural network JIT compiler for fast training and parsing.

A more detailed description of the SLING parser can be found in this paper:

More information ...

Credits

Original authors of the code in this package include:

  • Michael Ringgaard
  • Rahul Gupta
  • Anders Sandholm


sling's Issues

Can a single span invoke 2 different frames? (CASPAR)

Can a single span invoke 2 different frames? I'm confused. Here "price" invokes both #11 /saft/other and #14 /pb/price-01? This doesn't look right to me.

{=#1
  :/s/document
  /s/document/text: "Tell me price of the car"
  /s/document/tokens: [{=#2
    :/s/token
    /s/token/index: 0
    /s/token/text: "Tell"
    /s/token/start: 0
    /s/token/length: 4
    /s/token/break: 0
  }, {=#3
    :/s/token
    /s/token/index: 1
    /s/token/text: "me"
    /s/token/start: 5
    /s/token/length: 2
  }, {=#4
    :/s/token
    /s/token/index: 2
    /s/token/text: "price"
    /s/token/start: 8
    /s/token/length: 5
  }, {=#5
    :/s/token
    /s/token/index: 3
    /s/token/text: "of"
    /s/token/start: 14
    /s/token/length: 2
  }, {=#6
    :/s/token
    /s/token/index: 4
    /s/token/text: "the"
    /s/token/start: 17
    /s/token/length: 3
  }, {=#7
    :/s/token
    /s/token/index: 5
    /s/token/text: "car"
    /s/token/start: 21
    /s/token/length: 3
  }]
  /s/document/mention: {=#8
    :/s/phrase
    /s/phrase/begin: 0
    /s/phrase/evokes: {=#9
      :/pb/tell-01
      /pb/arg2: {=#10
        :/saft/other
      }
      /pb/arg1: {=#11
        :/saft/other
      }
    }
  }
  /s/document/mention: {=#12
    :/s/phrase
    /s/phrase/begin: 1
    /s/phrase/evokes: #10
  }
  /s/document/mention: {=#13
    :/s/phrase
    /s/phrase/begin: 2
    /s/phrase/evokes: #11
    /s/phrase/evokes: {=#14
      :/pb/price-01
      /pb/arg1: {=#15
        :/saft/consumer_good
      }
    }
  }
  /s/document/mention: {=#16
    :/s/phrase
    /s/phrase/begin: 5
    /s/phrase/evokes: #15
  }
}
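For what it's worth, the text format above already permits this: mention #13 simply carries two /s/phrase/evokes slots. A plain-Python sketch of that structure (illustration only, not the SLING API):

```python
# Illustration only: a mention records a span plus a LIST of evoked frames,
# so a single span such as "price" can evoke both an entity frame and a
# predicate frame, exactly as mention #13 above does.

mention = {
    "begin": 2,
    "length": 1,
    "evokes": [
        {"isa": "/saft/other"},    # entity reading of "price"
        {"isa": "/pb/price-01"},   # predicate reading of "price"
    ],
}

print(len(mention["evokes"]))  # → 2
```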

cannot define or redeclare 'registry_' because namespace 'syntaxnet' does not enclose namespace 'Component<syntaxnet::dragnn::Component>'

I'm trying to run the training script from the README:

./sling/nlp/parser/tools/train.sh --report_every=500 --train_steps=1000

However, I'm getting the following error:

Writing command to local/sempar/out/command
INFO: Found 1 target...
ERROR: sling/third_party/syntaxnet/BUILD:26:1: C++ compilation of rule '//third_party/syntaxnet:syntaxnet' failed (Exit 1).
third_party/syntaxnet/dragnn/core/component_registry.cc:21:1: error: cannot define or redeclare 'registry_' here because namespace 'syntaxnet' does not enclose namespace 'Component<syntaxnet::dragnn::Component>'
REGISTER_COMPONENT_REGISTRY("DRAGNN Component", dragnn::Component);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./sling/base/registry.h:253:52: note: expanded from macro 'REGISTER_COMPONENT_REGISTRY'
  classname::Registry sling::Component<classname>::registry_ = { \
                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
1 error generated.
Target //sling/nlp/parser/trainer:generate-master-spec failed to build

Do you see anything obvious I could fix to make it work?
Thanks.

One document per sentence

Hi,

While creating a training set, a text document with multiple sentences could in principle be converted into either a single SLING document or multiple ones.
E.g.
Text: John went to Japan. He recommends the travel.
Tag: Recommend(what=travel, where=japan)

How should the two separate SLING documents be created in such a case?
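One way to see the trade-off is a plain-Python sketch (illustration only, not the SLING API; the dict layout is invented): splitting per sentence separates the mention of "Japan" from the Recommend tag that needs it.

```python
# Illustration only: the same text packaged as one record vs. one record per
# sentence. A cross-sentence tag like Recommend(what=travel, where=japan)
# needs its argument mentions in the same document, which arguably favors
# the single-document variant here.

import re

text = "John went to Japan. He recommends the travel."

single = [{"text": text}]                    # one document for the whole text
per_sentence = [{"text": s}                  # one document per sentence
                for s in re.split(r"(?<=\.)\s+", text)]

print(len(single), len(per_sentence))  # → 1 2
```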

Cell output references can be reseated

It is possible to compile a Flow so a cell output becomes a reference to a piece of memory managed by Myelin.

For example, consider a cell that contains a single Gather op with a ref=true output. The Gather copies the address of the selected row to the output instead of copying its content. Unfortunately, since the Gather's output is also the cell's output, that re-seats the output reference without writing to the output array.

The same issue also impacts the DragnnLookupSingle op, or any op that outputs a reference to internal data. AFAICT, the issue would also occur when using a connector/channel (instead of a manually marked ref=true cell output), because connectors are implemented as ref=true cell inputs and outputs.

I am currently working around this by checking if the address of the output ref has changed after running the cell, and copying content if the address changed. Michael's suggested fix is to make SingleGather and DragnnLookupSingle check whether the output is already a reference and return false from Supports() in that case. Then one of the other Gather/Lookup kernels that copy content will be used.

Building SLING on OSX

We currently only support Linux out of the box. We do try to make SLING build with both GCC and Clang, but some changes are needed to get it to build under OSX. We intend to use this issue to track and document problems with building SLING on OSX and their solutions. While we do not currently intend to check these changes into the main code base, this issue can serve as a guide for those who want to use SLING on OSX anyway.

bazel build conll-to-sling missing dependencies

Why do I get missing dependency errors on files in sling/nlp/parser after I have put that path in the package_path?

bazel build -c opt local/conll2003:conll-to-sling --package_path=%workspace%:sling/nlp/parser/:sling --verbose_failures
INFO: Analysed target //local/conll2003:conll-to-sling (0 packages loaded).
INFO: Found 1 target...
ERROR: /home/jack.peng/workspace/semantic/sling/sling/nlp/parser/BUILD:16:1: undeclared inclusion(s) in rule '//nlp/parser:parser-state':
this rule is missing dependency declarations for the following files included by 'nlp/parser/parser-state.cc':
'sling/nlp/parser/parser-state.h'
'sling/nlp/parser/parser-action.h'
Target //local/conll2003:conll-to-sling failed to build
INFO: Elapsed time: 1.406s, Critical Path: 1.24s
FAILED: Build did NOT complete successfully

Link local/conll2003/README.md not available

In the main README.md, there is the statement:

    See local/conll2003/README.md for instructions on how to train a parser.

However, the link to local/conll2003/README.md does not work; it points nowhere.

Squeeze op: Check failed: op->indegree() == 2 (1 vs. 2)

I used Myelin to load a checkpoint file and create a flow file.
But when I call
nn.Compile("/tmp/filename.flow", library);
it reports the following:

F1127 18:56:40.688472 12985 precompute.cc:175] Check failed: op->indegree() == 2 (1 vs. 2)
*** Check failure stack trace: ***
@           0x46a5ba  google::LogMessage::Fail()
@           0x46b20a  google::LogMessage::SendToLog()
@           0x46a2b2  google::LogMessage::Flush()
@           0x46ba79  google::LogMessageFatal::~LogMessageFatal()
@           0x41506a  sling::myelin::ConstantFolding::Transform()
@           0x453cdd  sling::myelin::Flow::Transform()
@           0x45cbbd  sling::myelin::Flow::Analyze()
@           0x44fe5f  sling::myelin::Network::Compile()
@           0x405663  main
@     0x7fed45e51830  __libc_start_main
@           0x408df9  _start
Aborted (core dumped)

I inserted a LOG statement at precompute.cc:175, and it shows that the Squeeze op's indegree is 1, not 2.

But in my training model, I only use
tf.squeeze(X, axis=[1], name='pred')
and
tf.squeeze(X, axis=1)

Why does this error occur?
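For reference, tf.squeeze only removes size-1 dimensions. Here is a plain-Python sketch of what it computes on shapes (illustration only; this says nothing about why Myelin's constant folding expects the op to have two graph inputs):

```python
# Shape-level sketch of tf.squeeze semantics (illustration only): with axis
# given, only those positions are dropped, and each must have size 1.

def squeeze_shape(shape, axis=None):
    if axis is None:
        return tuple(d for d in shape if d != 1)
    axes = set(axis) if isinstance(axis, (list, tuple)) else {axis}
    for a in axes:
        if shape[a] != 1:
            raise ValueError("cannot squeeze non-1 dimension %d" % a)
    return tuple(d for i, d in enumerate(shape) if i not in axes)

print(squeeze_shape((5, 1, 200), axis=1))  # → (5, 200)
print(squeeze_shape((5, 1, 1)))            # → (5,)
```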

bazel-build /sling/nlp/parser/trainer:generate-master-spec not working

I am using Python 2.7 in a virtualenv.
In the step of running the training script, I got this error:
C++ compilation of rule '//third_party/syntaxnet:syntaxnet' failed (Exit 1)
In file included from third_party/syntaxnet/dragnn/core/interfaces/component.h:21:0,
from third_party/syntaxnet/dragnn/core/component_registry.h:19,
from third_party/syntaxnet/dragnn/core/component_registry.cc:16:
third_party/syntaxnet/dragnn/core/input_batch_cache.h:24:46: fatal error: tensorflow/core/platform/logging.h: No such file or directory
compilation terminated.
I have tried to add the TENSORFLOW path variable to my bashrc file and to my virtualenv with these commands:
export TENSORFLOW="/home/emna/anaconda2/lib/python2.7/site-packages/tensorflow/include/:$VIRTUAL_ENV"
export TENSORFLOW="/home/emna/anaconda2/lib/python2.7/site-packages/tensorflow/include/tensorflow"
But bazel build still doesn't find the TensorFlow files on my machine, although they actually exist in "/home/emna/anaconda2/lib/python2.7/site-packages/tensorflow/include".
One more thing: my virtualenv path is "/root/PycharmProjects/textmining/....".
Any suggestions, please?

Custom tokenization

For a text like "I recommend to buy this house between $100/$115 psf", the string can be tokenized in different ways (e.g. $ 100 / $ 115 or $ 1 0 0 / $ 1 1 5).

How can I customize tokenization during training and inference?
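A minimal customizable tokenizer can be sketched in plain Python (illustration only; SLING's own DocumentTokenizer is a separate C++ component): the token rules are a regex, and each token records text, start, and length in the same spirit as the /s/token frames shown elsewhere in this page.

```python
# Illustration only: regex-driven tokenization where the pattern decides
# whether "$100" stays together or splits into "$" and "100".

import re

def tokenize(text, pattern=r"\$|\d+|/|[A-Za-z]+"):
    return [
        {"text": m.group(), "start": m.start(), "length": len(m.group())}
        for m in re.finditer(pattern, text)
    ]

toks = tokenize("$100/$115 psf")
print([t["text"] for t in toks])  # → ['$', '100', '/', '$', '115', 'psf']
```

Swapping in a different pattern (say, one that matches single digits) yields the alternative segmentation mentioned above.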

Parser fails to identify coreference

Hi guys

Sorry to be on an issue-opening spree, but I've been having a bit more of a play, and SLING with the pre-trained sempar.flow file doesn't seem to recognise coreference at all. If you try to parse the sentence 'The dog who I loved died yesterday', it gets that there is a loving frame and a dying frame, but it thinks that the arg-2 of the loving frame is 'who' and the arg-1 of the dying frame is 'dog', and these are assigned different mentions by the parser.

Is this intended behaviour?

Kris

Frame ID reset during looping

Is there a way to extract the frame IDs at the document level? At any other level, the frame IDs start from 1, which makes it difficult to link them back to the text.

Check failed: index != -1 (-1 vs. -1)

Hi again.

Thanks for the help you've been giving me so far. I've been using SLING successfully for training my own corpus with your help and prompt responses, but now there's a new error that pops up when I'm loading up a new corpus to train after making some amendments to my schema.

I would like to enquire about this error that pops up when training the corpus. Do you have an idea as to what the issue could be?

Here's the error message.

INFO:tensorflow:Initial cost at step 0: 0.963456
INFO:tensorflow:cost at step 100: 0.092812
[2018-03-22 09:59:23.978805: F sling/nlp/parser/trainer/transition-state.cc:83] Check failed: index != -1 (-1 vs. -1)EVOKE:len=9:/tr/entity/asset/index
./sling/nlp/parser/tools/train.sh: line 242: 20807 Aborted                 (core dumped) python sling/nlp/parser/tools/train.py --master_spec="${OUTPUT_FOLDER}/master_spec" --hyperparams="${HYPERPARAMS}" --output_folder=${OUTPUT_FOLDER} --flow=${OUTPUT_FOLDER}/sempar.flow --commons=${COMMONS} --train_corpus=${TRAIN_FILEPATTERN} --dev_corpus=${DEV_FILEPATTERN} --batch_size=${BATCH_SIZE} --report_every=${REPORT_EVERY} --train_steps=${TRAIN_STEPS} --logtostderr

Note: "/tr/entity/asset/index" is an entry in the schema.

Thanks again for your help.

Python Example

Hello, can you please provide an example of how to manually create (in Python) the "John loves Mary" document without loading the love-frame from any store?

Python3 Support?

I've been using this (amazing) project a lot lately, and one thing I'd love is support with Python3. I'm interested in implementing support for this. One thing that would be helpful for me to know is the list of what's needed to do so, if you'd be kind enough to provide it.

From what I can tell, the list would include:

  • The standard minor modifications to every python file such that they work with both 2.7 and 3+.
  • Modifying the TensorFlow link to point to a python3 version.

I'm assuming this list is likely much more extensive than just the two points above. Could you tell me the remaining steps needed to get support for python3? I'd of course be happy to submit a PR if I get a working implementation. Thanks!
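The "standard minor modifications" in the first bullet typically look like this sketch (a generic Python 2/3 compatibility pattern, not taken from the SLING codebase):

```python
# __future__ imports make Python 2.7 behave like Python 3 for print and
# division, so one codebase runs unchanged under both interpreters.

from __future__ import print_function, division

def halve(n):
    return n / 2          # true division under both 2.7 and 3.x

print(halve(5))           # → 2.5
```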

Word2Vec format

I trained word2vec using gensim and saved it in the original binary format (using save_word2vec_format) with 200 dimensions. However, when I pass that in as the custom embedding, the code fails a check at sling/util/embeddings.cc line 47.

Can I use gensim, or should I use the original word2vec code?
Also, I could not understand why that line checks for equality with '\n'.
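For context, the original word2vec binary format begins with an ASCII header "&lt;vocab_size&gt; &lt;dim&gt;\n", followed by entries of the form "&lt;word&gt; " plus dim little-endian float32s and a trailing newline. A reader that checks for '\n' is plausibly validating that header, which would explain why a file written with a different layout fails the check. A stdlib sketch of the writer (illustration only, not SLING's embeddings.cc):

```python
# Write the word2vec binary layout: header line, then word + space +
# packed float32 vector + newline per entry.

import io
import struct

def write_w2v(fileobj, vectors):
    dim = len(next(iter(vectors.values())))
    fileobj.write(("%d %d\n" % (len(vectors), dim)).encode("ascii"))
    for word, vec in vectors.items():
        fileobj.write(word.encode("utf-8") + b" ")
        fileobj.write(struct.pack("<%df" % dim, *vec))
        fileobj.write(b"\n")

buf = io.BytesIO()
write_w2v(buf, {"car": [0.5, -1.0]})
header = buf.getvalue().split(b"\n", 1)[0]
print(header)  # → b'1 2'
```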

Trying out with error

I ran the example from README.md via Docker on my local machine and got an error.

Step 11/11 : RUN python trying_out.py
 ---> Running in a28255db8e26
[2018-05-29 04:06:00.219432: F sling/myelin/generator/expression.cc:999] Unsupported operation (sling/myelin/generator/vector-flt-sse.cc line 221)
Aborted (core dumped)
The command '/bin/sh -c python trying_out.py' returned a non-zero code: 134

Dockerfile:

FROM ubuntu:16.04
MAINTAINER sih4sing5hong5

RUN apt-get update # 0529
RUN apt-get install -y python python-pip wget

RUN pip install --upgrade pip
RUN pip install http://www.jbox.dk/sling/sling-1.0.0-cp27-none-linux_x86_64.whl

RUN mkdir /usr/local/sling
WORKDIR /usr/local/sling
RUN wget http://www.jbox.dk/sling/sempar.flow

COPY trying_out.py trying_out.py
RUN python trying_out.py

trying_out.py

import sling

parser = sling.Parser("sempar.flow")

text = raw_input("text: ")
doc = parser.parse(text)
print doc.frame.data(pretty=True)
for m in doc.mentions:
  print "mention", doc.phrase(m.begin, m.end)

Repository: https://github.com/sih4sing5hong5/sling_trying_out

Check failed: input_->ReadVarint64(&tag)

There's one weird problem.
I tried to train a model with my own zipped corpora, but I got this message when running the script:

INFO: Build completed successfully, 1 total action
F1123 17:51:04.615497 15134 decoder.cc:45] Check failed: input_->ReadVarint64(&tag)
*** Check failure stack trace: ***
@           0x4dc49a  google::LogMessage::Fail()
@           0x4dd1aa  google::LogMessage::SendToLog()
@           0x4dc1c5  google::LogMessage::Flush()
@           0x4dda99  google::LogMessageFatal::~LogMessageFatal()
@           0x426b29  sling::Decoder::DecodeObject()
@           0x426f35  sling::Decoder::Decode()
@           0x426050  sling::nlp::DocumentSource::Next()
@           0x40b2e8  OutputActionTable()
@           0x407426  main
@     0x7fe1a36fd830  __libc_start_main
@           0x409669  _start
The weird part is that after I unzipped and re-zipped the CoNLL corpora and ran the script with the re-zipped corpora, the same message showed up.

Thanks for any help!

compilation issues of caspar

This is macOS 10.13.4.
Compilation proceeds up to this point:
ERROR: sling/sling/myelin/kernel/BUILD:34:1: C++ compilation of rule '//sling/myelin/kernel:sse' failed (Exit 1)
In file included from sling/myelin/kernel/sse.cc:15:
In file included from ./sling/myelin/kernel/sse.h:18:
In file included from ./sling/myelin/compute.h:23:
./sling/myelin/flow.h:104:1: error: redefinition of 'Traits'
TYPE_TRAIT(int64, DT_INT64);
^
./sling/myelin/flow.h:89:39: note: expanded from macro 'TYPE_TRAIT'
template<> inline const TypeTraits &Traits() {
^
./sling/myelin/flow.h:103:1: note: previous definition is here
TYPE_TRAIT(int64_t, DT_INT64);
^
./sling/myelin/flow.h:89:39: note: expanded from macro 'TYPE_TRAIT'
template<> inline const TypeTraits &Traits() {
^
./sling/myelin/flow.h:104:1: error: redefinition of 'Traits'
TYPE_TRAIT(int64, DT_INT64);
^
./sling/myelin/flow.h:92:39: note: expanded from macro 'TYPE_TRAIT'
template<> inline const TypeTraits &Traits<type *>() {
^
./sling/myelin/flow.h:103:1: note: previous definition is here
TYPE_TRAIT(int64_t, DT_INT64);
^
./sling/myelin/flow.h:92:39: note: expanded from macro 'TYPE_TRAIT'
template<> inline const TypeTraits &Traits<type *>() {
^
2 errors generated.
INFO: Elapsed time: 29.437s, Critical Path: 10.66s
INFO: 49 processes, local.
FAILED: Build did NOT complete successfully

Output of git branch:

  • caspar
    master

No file system: local/sempar/commons

Hi,

I'm trying to run the following command from the README:

./sling/nlp/parser/tools/train.sh --report_every=500 --train_steps=1000

I'm getting the following error from sling/nlp/parser/tools/train.py:

INFO:tensorflow:Determining the training schedule...
INFO:tensorflow:Training schedule defined!
INFO:tensorflow:Starting training...
2017-12-18 13:08:53.942650: I third_party/syntaxnet/dragnn/core/ops/dragnn_op_kernels.cc:77] Creating new ComputeSessionPool in container handle: shared
[2017-12-18 13:08:53.945291: E ./sling/base/status.h:63] ERROR 1 : No file system: local/sempar/commons
[2017-12-18 13:08:53.945307: F sling/stream/file.cc:25] Check failed: File::Open(filename, "r", &file_) 
./sling/nlp/parser/tools/train.sh: line 259: 30661 Abort trap: 6           python sling/nlp/parser/tools/train.py --master_spec="${OUTPUT_FOLDER}/master_spec" --hyperparams="${HYPERPARAMS}" --output_folder=${OUTPUT_FOLDER} --flow=${OUTPUT_FOLDER}/sempar.flow --commons=${COMMONS} --train_corpus=${TRAIN_FILEPATTERN} --dev_corpus=${DEV_FILEPATTERN} --batch_size=${BATCH_SIZE} --report_every=${REPORT_EVERY} --train_steps=${TRAIN_STEPS} --logtostderr

It is failing when trying to run

# Make sure to re-initialize all underlying state.
sess.run(tf.global_variables_initializer())

The file local/sempar/commons does exist in the file system.

Also, the Python script has no problem reading train_corpus and dev_corpus from the same folder.

# Read train and dev corpora.
print "Reading corpora..."
train_corpus = read_corpus(FLAGS.train_corpus)
dev_corpus = read_corpus(FLAGS.dev_corpus)

Is there some configuration I'm missing? Or are the paths to the files resolved differently somehow?

os: Mac OS Sierra
gcc: 4.2.1
python: 2.7.1

Thanks.

Parsing SLING output in Python

Hi guys

One final question: is there a plan in the works to make the output of SLING easier to parse in Python? As it stands, the output isn't JSON-compatible (attribute names and values need to be double-quoted), and I've tried to write a regex-based parser for SLING output, but it's a bit trickier than I'd like.

Apologies for the spam!
Kris
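The quoting problem described above can be sketched for the simplest case (illustration only; this handles flat "path: value" slots, not frame ids like =#1, type slots, or nested frames, and it would also wrongly quote paths inside string values):

```python
# Rough sketch: wrap bare slash-paths in double quotes so simple SLING-like
# key/value text becomes parseable with json.loads.

import json
import re

def quote_paths(text):
    return re.sub(r'(/[A-Za-z0-9_/\-]+)', r'"\1"', text)

raw = '{/s/phrase/begin: 2, /s/phrase/length: 1}'
print(json.loads(quote_paths(raw)))
# → {'/s/phrase/begin': 2, '/s/phrase/length': 1}
```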

Why c++ and not python ?

A lot of the code for SLING is in C++. Even the TensorFlow training is done in C++.

With all respect to C++ and the efficiency it gives, is there any particular reason for not using Python instead?

The reason I ask is that if I want to use this model and, say, need to modify some parts of the neural network or the hyperparameters, it would be much simpler with Python.

Request: allow action sequences that don't connect frames

I just solved an issue I'd been having that was resulting in the following error from trainer_lib.run_training (dragnn), via a call to train.sh with my own corpus:

InvalidArgumentError (see above for traceback): indices[0] = 0 is not in [0, 0)
	 [[Node: train-ff/Adam/update_ff/fixed_embedding_matrix_1/ScatterAdd = ScatterAdd[T=DT_FLOAT, Tindices=DT_INT64, _class=["loc:@ff/fixed_embedding_matrix_1"], use_locking=true, _device="/job:localhost/replica:0/task:0/cpu:0"](ff/fixed_embedding_matrix_1/Adam, train-ff/Adam/update_ff/fixed_embedding_matrix_1/Unique, train-ff/Adam/update_ff/fixed_embedding_matrix_1/mul_1, ^train-ff/Adam/update_ff/fixed_embedding_matrix_1/Assign)]]

This puzzled me, because I had run train.sh without issues on a toy training dataset I made in the same way. After a couple days of scratching my head, I found the solution and, in my opinion, this seems like a bug.

Root cause of my problem: I had SLING documents in my corpus that were essentially just named entity extraction labels (in addition to the doc tokens etc) -- documents with frames that only referenced some portion of the text, but did not reference other frames. For example:

Builder personBuilder(doc->store());
personBuilder.AddIsA("/my/schema/person");
Frame personFrame = personBuilder.Create();
doc->AddSpan(0, 1)->Evoke(personFrame);
doc->Update();

Solution: I just need to have at least one frame that has a slot whose value is another frame. I've tried many variations of this and that's always what fixes the error. To continue with the example above, if I then make another frame with a slot value referencing the personFrame, all runs fine:

Builder personBuilder(doc->store());
personBuilder.AddIsA("/my/schema/person");
Frame personFrame = personBuilder.Create();
doc->AddSpan(0, 1)->Evoke(personFrame);

Builder someBuilder(doc->store());
someBuilder.AddIsA("/my/schema/anotherthing");
someBuilder.Add("/my/schema/anotherthing/source", personFrame);
Frame someFrame = someBuilder.Create();
doc->AddSpan(1, 2)->Evoke(someFrame);

doc->Update();

So, my question is: is this intentional? I didn't see anything indicating that one must have at least one connection from one frame to another. Interested in hearing your thoughts. Thanks!
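The constraint observed above can be sketched as a plain-Python check (illustration only, not the SLING API): training apparently requires at least one frame whose slot value is another frame, i.e. the evoked frames must not all be isolated nodes.

```python
# Illustration: frames as dicts; a "link" is any slot whose value is
# another frame rather than a literal.

def has_frame_link(frames):
    return any(isinstance(v, dict) for f in frames for v in f.values())

person = {"isa": "/my/schema/person"}
another = {"isa": "/my/schema/anotherthing", "source": person}

print(has_frame_link([person]))           # → False
print(has_frame_link([person, another]))  # → True
```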


More complete error output of train.sh (again, due to the trainer_lib.run_training call in train.py):

INFO:tensorflow:Training schedule defined!
INFO:tensorflow:Starting training...
2017-12-04 13:40:38.829931: I third_party/syntaxnet/dragnn/core/ops/dragnn_op_kernels.cc:78] Creating new ComputeSessionPool in container handle: shared
I1204 13:40:38.867660 30572 sempar-component.cc:58] lr_lstm: loaded 4753 words
I1204 13:40:38.867676 30572 sempar-component.cc:59] lr_lstm: loaded 0 prefixes
I1204 13:40:38.867679 30572 sempar-component.cc:60] lr_lstm: loaded 1663 suffixes
I1204 13:40:38.867681 30572 sempar-component.cc:62] Lexicon OOV: 0
I1204 13:40:38.867702 30572 sempar-component.cc:63] Lexicon normalize digits: 1
I1204 13:40:38.902065 30572 sempar-component.cc:58] rl_lstm: loaded 4753 words
I1204 13:40:38.902081 30572 sempar-component.cc:59] rl_lstm: loaded 0 prefixes
I1204 13:40:38.902101 30572 sempar-component.cc:60] rl_lstm: loaded 1663 suffixes
I1204 13:40:38.902102 30572 sempar-component.cc:62] Lexicon OOV: 0
I1204 13:40:38.902104 30572 sempar-component.cc:63] Lexicon normalize digits: 1
I1204 13:40:38.936980 30572 sempar-component.cc:58] ff: loaded 0 words
I1204 13:40:38.936996 30572 sempar-component.cc:59] ff: loaded 0 prefixes
I1204 13:40:38.937017 30572 sempar-component.cc:60] ff: loaded 0 suffixes
Traceback (most recent call last):
  File "nlp/parser/tools/train.py", line 245, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "nlp/parser/tools/train.py", line 232, in main
    checkpoint_filename=checkpoint_path)
  File "third_party/syntaxnet/dragnn/python/trainer_lib.py", line 119, in run_training
    sess, trainers[target_idx], train_corpus, batch_size)
  File "third_party/syntaxnet/dragnn/python/trainer_lib.py", line 60, in run_training_step
    feed_dict={trainer['input_batch']: batch})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0] = 0 is not in [0, 0)
	 [[Node: train-ff/Adam/update_ff/fixed_embedding_matrix_1/ScatterAdd = ScatterAdd[T=DT_FLOAT, Tindices=DT_INT64, _class=["loc:@ff/fixed_embedding_matrix_1"], use_locking=true, _device="/job:localhost/replica:0/task:0/cpu:0"](ff/fixed_embedding_matrix_1/Adam, train-ff/Adam/update_ff/fixed_embedding_matrix_1/Unique, train-ff/Adam/update_ff/fixed_embedding_matrix_1/mul_1, ^train-ff/Adam/update_ff/fixed_embedding_matrix_1/Assign)]]

Caused by op u'train-ff/Adam/update_ff/fixed_embedding_matrix_1/ScatterAdd', defined at:
  File "nlp/parser/tools/train.py", line 245, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "nlp/parser/tools/train.py", line 176, in main
    trainers += [builder.add_training_from_config(target)]
  File "third_party/syntaxnet/dragnn/python/graph_builder.py", line 482, in add_training_from_config
    **kwargs)
  File "third_party/syntaxnet/dragnn/python/graph_builder.py", line 350, in build_training
    clipped_gradients, global_step=self.master_vars['step'])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 456, in apply_gradients
    update_ops.append(processor.update_op(self, grad))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 102, in update_op
    return optimizer._apply_sparse_duplicate_indices(g, self._v)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 654, in _apply_sparse_duplicate_indices
    return self._apply_sparse(gradient_no_duplicate_indices, var)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/adam.py", line 197, in _apply_sparse
    lambda x, i, v: state_ops.scatter_add(  # pylint: disable=g-long-lambda
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/adam.py", line 181, in _apply_sparse_shared
    m_t = scatter_add(m, indices, m_scaled_g_values)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/adam.py", line 198, in <lambda>
    x, i, v, use_locking=self._use_locking))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 210, in scatter_add
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): indices[0] = 0 is not in [0, 0)
	 [[Node: train-ff/Adam/update_ff/fixed_embedding_matrix_1/ScatterAdd = ScatterAdd[T=DT_FLOAT, Tindices=DT_INT64, _class=["loc:@ff/fixed_embedding_matrix_1"], use_locking=true, _device="/job:localhost/replica:0/task:0/cpu:0"](ff/fixed_embedding_matrix_1/Adam, train-ff/Adam/update_ff/fixed_embedding_matrix_1/Unique, train-ff/Adam/update_ff/fixed_embedding_matrix_1/mul_1, ^train-ff/Adam/update_ff/fixed_embedding_matrix_1/Assign)]]

2017-12-04 13:40:39.153166: I third_party/syntaxnet/dragnn/core/compute_session_pool.cc:55] Destroying pool: total number of sessions created = 1
2017-12-04 13:40:39.153191: W third_party/syntaxnet/dragnn/core/compute_session_pool.cc:58] Destroying pool: number of unreturned sessions = 1

Frame definition not used during SLING runtime.

In the following program, I tried to analyse the semantics of the sentence "Ted Barter likes software engineering". But the output contains the lines:

    /s/document/mention: {=#7   :/s/phrase    /s/phrase/begin: 0    
            /s/phrase/length: 2    /s/phrase/evokes: {=#8     :/saft/person   } 
    }

I expect not "/saft/person" but "/toy/person", as defined at line 23 in the program. How are frames used in SLING? What needs to be done to get "/toy/person" for "Ted Barter" in the output? The code listing follows, as well as the output:

PROGRAM

#include <string>
#include <stdlib.h>
#include <iostream>

#include "sling/frame/object.h"
#include "sling/frame/store.h"
#include "sling/frame/serialization.h"
#include "sling/nlp/document/document.h"
#include "sling/nlp/document/document-tokenizer.h"
#include "sling/nlp/parser/parser.h"

int main() {


	// Load parser model.
	sling::Store commons;
	sling::nlp::Parser parser;
	parser.Load(&commons, "/tmp/sempar.flow");


	sling::Builder homer(&commons);

	homer.AddId("/en/barter");
	homer.AddIsA("/toy/person");
	homer.Add("name", "Ted Barter");
	homer.Add("/toy/person/age", 59);
	homer.AddLink("/toy/person/place", "/en/springfield");
	homer.AddLink("/toy/person/spouse", "/en/julia");
	homer.AddLink("/toy/person/child", "/en/bart");
	homer.AddLink("/toy/person/child", "/en/lisa");
	homer.AddLink("/toy/person/child", "/en/maggie");
	homer.AddLink("/toy/person/employer", "/en/snpp");

	sling::Frame barterPerson = homer.Create(); 



	commons.Freeze();

	// Create document tokenizer.
	sling::nlp::DocumentTokenizer tokenizer;

	// Create frame store for document.
	sling::Store store(&commons);



	sling::nlp::Document document(&store);

	// Tokenize text.
	std::string text = "Ted Barter likes software engineering";
	tokenizer.Tokenize(&document, text);

	// Parse document.
	parser.Parse(&document);
	document.Update();

	// Output document annotations.
	std::cout << sling::ToText(document.top(), 2);

}

OUTPUT

{=#1 
  :/s/document
  /s/document/text: "Ted Barter likes software engineering"
  /s/document/tokens: [{=#2 
	:/s/token
	/s/token/index: 0
	/s/token/text: "Ted"
	/s/token/start: 0
	/s/token/length: 3
	/s/token/break: 0
  }, {=#3 
	:/s/token
	/s/token/index: 1
	/s/token/text: "Barter"
	/s/token/start: 4
	/s/token/length: 6
  }, {=#4 
	:/s/token
	/s/token/index: 2
	/s/token/text: "likes"
	/s/token/start: 11
	/s/token/length: 5
  }, {=#5 
	:/s/token
	/s/token/index: 3
	/s/token/text: "software"
	/s/token/start: 17
	/s/token/length: 8
  }, {=#6 
	:/s/token
	/s/token/index: 4
	/s/token/text: "engineering"
	/s/token/start: 26
	/s/token/length: 11
  }]
  /s/document/mention: {=#7 
	:/s/phrase
	/s/phrase/begin: 0
	/s/phrase/length: 2
	/s/phrase/evokes: {=#8 
	  :/saft/person
	}
  }
  /s/document/mention: {=#9 
	:/s/phrase
	/s/phrase/begin: 2
	/s/phrase/evokes: {=#10 
	  :/pb/like-02
	  /pb/arg0: #8
	  /pb/arg1: {=#11 
		:/pb/engineer-01
		/pb/arg1: {=#12 
		  :/saft/other
		}
	  }
	}
  }
  /s/document/mention: {=#13 
	:/s/phrase
	/s/phrase/begin: 3
	/s/phrase/length: 2
	/s/phrase/evokes: #12
  }
  /s/document/mention: {=#14 
	:/s/phrase
	/s/phrase/begin: 4
	/s/phrase/evokes: #11
  }
}
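As an aside for readers decoding this frame notation: a mention's /s/phrase/begin and /s/phrase/length index into the token list, and each token carries character offsets into the document text. A minimal Python sketch of recovering a mention's surface text (plain dicts mirroring the printed frames; this is an illustration, not the SLING API):

```python
# Recover a mention's surface text from token offsets (field names mirror
# the /s/token and /s/phrase frames printed above).
def mention_surface(text, tokens, begin, length=1):
    first = tokens[begin]
    last = tokens[begin + length - 1]
    return text[first["start"]:last["start"] + last["length"]]

text = "Ted Barter likes software engineering"
tokens = [
    {"start": 0, "length": 3},    # Ted
    {"start": 4, "length": 6},    # Barter
    {"start": 11, "length": 5},   # likes
    {"start": 17, "length": 8},   # software
    {"start": 26, "length": 11},  # engineering
]
print(mention_surface(text, tokens, 0, 2))  # mention #7 covers "Ted Barter"
```

So mention #7 (begin 0, length 2) covers "Ted Barter", whatever type its evoked frame carries.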

Docker - which versions of C++, Linux, Python, TensorFlow, etc.?

I recently ran into #115 and am still struggling with it. Hence my question about Docker, to make it easier to experiment with training SLING.

I can help create a Docker image if you let me know which versions of the following dependencies are required:

  1. linux
  2. python
  3. bazel
  4. gcc
  5. g++
  6. tensorflow
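For what it's worth, another issue in this thread (the segmentation-fault report below) cites a combination that at least trained the conll dataset: Ubuntu 16.04.3, Python 2.7.12, gcc 5.4.0, TensorFlow 1.4.0. A hedged Dockerfile sketch based only on those reports (the bazel version is not stated anywhere in this thread, so it is left open):

```dockerfile
# Versions taken from a training report elsewhere in this thread; unverified.
FROM ubuntu:16.04                 # reported working: 16.04.3
RUN apt-get update && apt-get install -y \
    python python-pip g++ git     # Ubuntu 16.04 ships Python 2.7.12 / gcc 5.4.0
RUN pip install tensorflow==1.4.0
# bazel: version not specified in this thread; install per bazel.build docs
```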

Abnormal termination (segmentation fault) when training

INFO:tensorflow:Determining the training schedule...
INFO:tensorflow:Training schedule defined!
INFO:tensorflow:Starting training...
2018-04-02 22:18:53.801752: I third_party/syntaxnet/dragnn/core/ops/dragnn_op_kernels.cc:77] Creating new ComputeSessionPool in container handle: shared
2018-04-02 22:18:53.811103: I sling/nlp/parser/trainer/sempar-component.cc:58] lr_lstm: loaded 0 words
2018-04-02 22:18:53.811145: I sling/nlp/parser/trainer/sempar-component.cc:59] lr_lstm: loaded 0 prefixes
2018-04-02 22:18:53.811151: I sling/nlp/parser/trainer/sempar-component.cc:60] lr_lstm: loaded 0 suffixes
2018-04-02 22:18:53.811236: I sling/nlp/parser/trainer/sempar-component.cc:58] rl_lstm: loaded 0 words
2018-04-02 22:18:53.811246: I sling/nlp/parser/trainer/sempar-component.cc:59] rl_lstm: loaded 0 prefixes
2018-04-02 22:18:53.811251: I sling/nlp/parser/trainer/sempar-component.cc:60] rl_lstm: loaded 0 suffixes
2018-04-02 22:18:53.811311: I sling/nlp/parser/trainer/sempar-component.cc:58] ff: loaded 0 words
2018-04-02 22:18:53.811320: I sling/nlp/parser/trainer/sempar-component.cc:59] ff: loaded 0 prefixes
2018-04-02 22:18:53.811324: I sling/nlp/parser/trainer/sempar-component.cc:60] ff: loaded 0 suffixes
./sling/nlp/parser/tools/train.sh: line 242: 6496 Segmentation fault      python sling/nlp/parser/tools/train.py --master_spec="${OUTPUT_FOLDER}/master_spec" --hyperparams="${HYPERPARAMS}" --output_folder=${OUTPUT_FOLDER} --flow=${OUTPUT_FOLDER}/sempar.flow --commons=${COMMONS} --train_corpus=${TRAIN_FILEPATTERN} --dev_corpus=${DEV_FILEPATTERN} --batch_size=${BATCH_SIZE} --report_every=${REPORT_EVERY} --train_steps=${TRAIN_STEPS} --logtostderr

sempar.so: undefined symbol

I am a sysadmin trying to set up SLING for one of our researchers.
I am running SLING on a sample and getting the error below from sempar.so.
I am running Python 2.7.12 and TensorFlow 1.4.0; the data (tables) gets generated in the output dir.
Any ideas as to what might be missing and how to resolve the issue?
Thanks.

./nlp/parser/tools/train.sh --commons=/tmp/sling/sling/local/conll2003/commons --train=/tmp/sling/sling/local/conll2003/eng.train.zip --dev=/tmp/sling/sling/local/conll2003/eng.testa.zip --word_embeddings=/tmp/sling/sling/local/embeddings/word2vec-32-embeddings.bin --train_steps=10000 --output=/tmp/sempar-conll
Writing command to /tmp/sempar-conll/command
INFO: Found 1 target...
Target //sling/nlp/parser/trainer:generate-master-spec up-to-date:
bazel-bin/sling/nlp/parser/trainer/generate-master-spec
INFO: Elapsed time: 0.102s, Critical Path: 0.00s
[2018-01-17 10:26:20.899950: I sling/nlp/parser/trainer/generate-master-spec.cc:145] 10000 documents processed.
[2018-01-17 10:26:21.146334: I sling/nlp/parser/trainer/generate-master-spec.cc:148] Processed 14041 documents.
[2018-01-17 10:26:21.149566: I sling/nlp/parser/trainer/generate-master-spec.cc:156] Wrote action table to /tmp/sempar-conll/table, /tmp/sempar-conll/table.summary, /tmp/sempar-conll/table.unknown_symbols
[2018-01-17 10:26:21.585311: I sling/nlp/parser/trainer/generate-master-spec.cc:286] 10000 documents processsed while building lexicons
[2018-01-17 10:26:21.762788: I sling/nlp/parser/trainer/generate-master-spec.cc:308] 14041 documents processsed while building lexicon
[2018-01-17 10:26:21.765947: I sling/nlp/parser/trainer/generate-master-spec.cc:411] Using pretrained word embeddings: /tmp/sling/sling/local/embeddings/word2vec-32-embeddings.bin
[2018-01-17 10:26:21.766598: I sling/nlp/parser/trainer/generate-master-spec.cc:420] Wrote master spec to /tmp/sempar-conll/master_spec
INFO: Found 1 target...
Target //sling/nlp/parser/tools:evaluate-frames up-to-date:
bazel-bin/sling/nlp/parser/tools/evaluate-frames
INFO: Elapsed time: 0.089s, Critical Path: 0.00s
INFO: Found 1 target...
Target //sling/nlp/parser/trainer:sempar.so up-to-date:
bazel-bin/sling/nlp/parser/trainer/sempar.so
INFO: Elapsed time: 0.106s, Critical Path: 0.00s
Traceback (most recent call last):
File "sling/nlp/parser/tools/train.py", line 28, in
from convert import convert_model
File "/tmp/sling/sling/nlp/parser/tools/convert.py", line 33, in
tf.load_op_library("/tmp/sling/bazel-bin/sling/nlp/parser/trainer/sempar.so")
File "/var/local/miniconda2/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/var/local/miniconda2/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: /tmp/sling/bazel-bin/sling/nlp/parser/trainer/sempar.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev

macOS: no such package 'third_party/gflags'

$ virtualenv .env
$ . .env/bin/activate
$ pip install https://ci.tensorflow.org/view/tf-nightly/job/tf-nightly-mac/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tf_nightly-1.head-py2-none-any.whl
$ pip install -U protobuf==3.3.0
(.env) [loretoparisi@:mbploreto sling]$ bazel build -c opt nlp/parser nlp/parser/tools:all
ERROR: /sling/WORKSPACE:8:1: no such package 'third_party/gflags': BUILD file not found on package path and referenced by '//external:gflags'.
ERROR: Analysis of target '//nlp/parser/tools:evaluate-frames' failed; build aborted: Loading failed.
INFO: Elapsed time: 0,108s

I have

(.env) [loretoparisi@:mbploreto third_party]$ tree -L 1
.
├── bz2lib
├── gflags
├── glog
├── jit
├── syntaxnet
├── tensorflow
└── zlib

Languages Support

Hello,

First of all, thanks for the amazing work.

Is German language support available?

Thanks in advance,
Duygu.

Documentation would be a lot clearer if a link to the frame documentation were included in the SLING readme

I highly recommend including a link to the frame documentation in the root SLING readme file.

It would also be great if some API documentation of Store, Parser, ParserAction, etc., based on doxygen (javadoc style), were included. It would make much clearer what is happening underneath without needing to read the C++ source or header files. Let me know if I can help with these two things? I would be glad to. Alternatively, a script to generate the doxygen output would be a good pointer.

The present readme is great at introducing the ML portions of SLING (training, parsing, the deep learning model, and a link to the paper describing more details), but it gives only a cursory high-level notion of a frame and no notion of a store at all. Even though the store is mentioned quite a few times, it is not clear what it is (I imagined it was a folder, but it turns out to be an in-memory database of sorts that supports the parser). This makes actually using SLING to train a new model less accessible. Including a link to the frames readme in the root readme and adding doxygen documentation would make these concepts clearer.

I hope that by improving the readme documentation, the framework can become accessible to more people.

How to compile and run Parsing code?

Hi,

I followed the Parsing section in the README

and tried to compile the code below with the gcc command
gcc -I. parsing.cc -o parsing

#include "sling/frame/store.h"
#include "sling/nlp/document/document-tokenizer.h"
#include "sling/nlp/parser/parser.h"
int main() {
	// Load parser model.
	sling::Store commons;
	sling::nlp::Parser parser;
	parser.Load(&commons, "/tmp/sempar.flow");
	commons.Freeze();

	// Create document tokenizer.
	sling::nlp::DocumentTokenizer tokenizer;
}

but I receive a lot of errors...

parsing.cc:(.text+0x26): undefined reference to `sling::Store::Store()'
parsing.cc:(.text+0x44): undefined reference to `std::allocator<char>::allocator()'
parsing.cc:(.text+0x61): undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)'
parsing.cc:(.text+0x81): undefined reference to `sling::nlp::Parser::Load(sling::Store*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
parsing.cc:(.text+0x90): undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()'
parsing.cc:(.text+0x9f): undefined reference to `std::allocator<char>::~allocator()'
parsing.cc:(.text+0xae): undefined reference to `sling::Store::Freeze()'
parsing.cc:(.text+0xbd): undefined reference to `sling::nlp::DocumentTokenizer::DocumentTokenizer()'
parsing.cc:(.text+0xef): undefined reference to `sling::Store::~Store()'
parsing.cc:(.text+0x114): undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()'
parsing.cc:(.text+0x128): undefined reference to `std::allocator<char>::~allocator()'
parsing.cc:(.text+0x150): undefined reference to `sling::Store::~Store()'
/tmp/ccR0GqxC.o: In function `sling::Name::Name(sling::Names&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
...
...

Is this the right way?
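For reference, gcc alone cannot resolve these symbols because the SLING libraries are not on the link line; the project builds with Bazel, so the usual route is a cc_binary target inside the SLING tree. A sketch of such a target, with dependency labels guessed from the include paths (check the actual BUILD files for the real target names before using it):

```
# BUILD sketch; the deps labels below are assumptions, not verified targets.
cc_binary(
    name = "parsing",
    srcs = ["parsing.cc"],
    deps = [
        "//sling/frame:store",
        "//sling/nlp/document:document-tokenizer",
        "//sling/nlp/parser:parser",
    ],
)
```

Then build it with something like `bazel build -c opt //path/to:parsing`.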

Inconsistent word between token and mention

Hi,
when I try to parse sentences, some mentions evoke frames whose labels differ from the word in the input sentence,
for example: "stun" -> "twist" or "stun" -> "kick".

{=#1
:/s/document
/s/document/text: "This speech seemed to stun the king"
/s/document/tokens: [{=#2
:/s/token
/s/token/index: 0
/s/token/text: "This"
/s/token/start: 0
/s/token/length: 4
/s/token/break: 0
}, {=#3
:/s/token
/s/token/index: 1
/s/token/text: "speech"
/s/token/start: 5
/s/token/length: 6
}, {=#4
:/s/token
/s/token/index: 2
/s/token/text: "seemed"
/s/token/start: 12
/s/token/length: 6
}, {=#5
:/s/token
/s/token/index: 3
/s/token/text: "to"
/s/token/start: 19
/s/token/length: 2
}, {=#6
:/s/token
/s/token/index: 4
/s/token/text: "stun"
/s/token/start: 22
/s/token/length: 4
}, {=#7
:/s/token
/s/token/index: 5
/s/token/text: "the"
/s/token/start: 27
/s/token/length: 3
}, {=#8
:/s/token
/s/token/index: 6
/s/token/text: "king"
/s/token/start: 31
/s/token/length: 4
}]
/s/document/mention: {=#9
:/s/phrase
/s/phrase/begin: 1
/s/phrase/evokes: {=#10
:/saft/event
}
/s/phrase/evokes: {=#11
:/pb/speak-01
}
}
/s/document/mention: {=#12
:/s/phrase
/s/phrase/begin: 2
/s/phrase/evokes: {=#13
:/pb/seem-01
/pb/arg1: #10
/pb/arg1: {=#14
:/pb/twist-01
/pb/arg0: #10
/pb/arg1: {=#15
:/saft/person
}
}
}
}
/s/document/mention: {=#16
:/s/phrase
/s/phrase/begin: 4
/s/phrase/evokes: #14
}
/s/document/mention: {=#17
:/s/phrase
/s/phrase/begin: 6
/s/phrase/evokes: #15
}
}

"stun" -> "kick"

{=#1
:/s/document
/s/document/text: "stun"
/s/document/tokens: [{=#2
:/s/token
/s/token/index: 0
/s/token/text: "stun"
/s/token/start: 0
/s/token/length: 4
/s/token/break: 0
}]
/s/document/mention: {=#3
:/s/phrase
/s/phrase/begin: 0
/s/phrase/evokes: {=#4
:/pb/kick-01
}
}
}

How does the parser connect "stun" to these other words while parsing?
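One clarifying observation on the output above: the word is not being substituted. The mention at begin 4 still covers the token "stun"; /pb/twist-01 is the (here mispredicted) PropBank frame that the mention evokes. A small Python sketch separating the two notions (plain dicts, not the SLING API):

```python
# A mention points at tokens (surface text) and evokes a frame (semantic
# label). The evoked type is a model prediction and need not match the word.
tokens = ["This", "speech", "seemed", "to", "stun", "the", "king"]
mentions = [
    {"begin": 1, "length": 1, "evokes": "/pb/speak-01"},
    {"begin": 2, "length": 1, "evokes": "/pb/seem-01"},
    {"begin": 4, "length": 1, "evokes": "/pb/twist-01"},  # model's (wrong) guess
    {"begin": 6, "length": 1, "evokes": "/saft/person"},
]
for m in mentions:
    surface = " ".join(tokens[m["begin"]:m["begin"] + m["length"]])
    print(surface, "->", m["evokes"])
```

So "twist" never appears in the document itself; it only appears as the id of the evoked frame.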

./sling/nlp/parser/tools/train.sh: line 242: 29662 Segmentation fault (core dumped)

Hi,
I am trying to run the conll2003 example that you mention in the description. I am getting the following message; please help me figure out what is going wrong. I have tried some of the solutions for segmentation faults found online, but I can't make it work.
I also changed train.sh according to your answer in #84.
Following error:

16:53 $ ./sling/nlp/parser/tools/train.sh --commons=local/conll2003/commons --train=local/conll2003/eng.train.zip --dev=local/conll2003/eng.testa.zip --word_embeddings=local/embeddings/word2vec-32-embeddings.bin --report_every=5000 --train_steps=10000 --output=/tmp/sempar-conll
Writing command to /tmp/sempar-conll/command
INFO: Analysed target //sling/nlp/parser/trainer:generate-master-spec (0 packages loaded).
INFO: Found 1 target...
Target //sling/nlp/parser/trainer:generate-master-spec up-to-date:
  bazel-bin/sling/nlp/parser/trainer/generate-master-spec
INFO: Elapsed time: 0.374s, Critical Path: 0.00s
INFO: Build completed successfully, 1 total action
[2017-12-13 16:53:38.929707: I sling/nlp/parser/trainer/generate-master-spec.cc:145] 10000 documents processed.
[2017-12-13 16:53:39.116543: I sling/nlp/parser/trainer/generate-master-spec.cc:148] Processed 14041 documents.
[2017-12-13 16:53:39.122819: I sling/nlp/parser/trainer/generate-master-spec.cc:156] Wrote action table to /tmp/sempar-conll/table, /tmp/sempar-conll/table.summary, /tmp/sempar-conll/table.unknown_symbols
[2017-12-13 16:53:39.432906: I sling/nlp/parser/trainer/generate-master-spec.cc:286] 10000 documents processsed while building lexicons
[2017-12-13 16:53:39.556438: I sling/nlp/parser/trainer/generate-master-spec.cc:308] 14041 documents processsed while building lexicon
[2017-12-13 16:53:39.559534: I sling/nlp/parser/trainer/generate-master-spec.cc:411] Using pretrained word embeddings: local/embeddings/word2vec-32-embeddings.bin
[2017-12-13 16:53:39.559956: I sling/nlp/parser/trainer/generate-master-spec.cc:420] Wrote master spec to /tmp/sempar-conll/master_spec
INFO: Analysed target //sling/nlp/parser/tools:evaluate-frames (0 packages loaded).
INFO: Found 1 target...
Target //sling/nlp/parser/tools:evaluate-frames up-to-date:
  bazel-bin/sling/nlp/parser/tools/evaluate-frames
INFO: Elapsed time: 0.153s, Critical Path: 0.00s
INFO: Build completed successfully, 1 total action
INFO: Analysed target //sling/nlp/parser/trainer:sempar.so (0 packages loaded).
INFO: Found 1 target...
Target //sling/nlp/parser/trainer:sempar.so up-to-date:
  bazel-bin/sling/nlp/parser/trainer/sempar.so
INFO: Elapsed time: 0.342s, Critical Path: 0.01s
INFO: Build completed successfully, 1 total action
./sling/nlp/parser/tools/train.sh: line 242: 29662 Segmentation fault      (core dumped) python sling/nlp/parser/tools/train.py --master_spec="${OUTPUT_FOLDER}/master_spec" --hyperparams="${HYPERPARAMS}" --output_folder=${OUTPUT_FOLDER} --flow=${OUTPUT_FOLDER}/sempar.flow --commons=${COMMONS} --train_corpus=${TRAIN_FILEPATTERN} --dev_corpus=${DEV_FILEPATTERN} --batch_size=${BATCH_SIZE} --report_every=${REPORT_EVERY} --train_steps=${TRAIN_STEPS} --logtostderr

How do I link sling parser library in my program

How can I include the parsing functions in my own program? Which libraries do I need to link? I can only see bazel-bin/sling/nlp/parser/libparser[.so|.a], but there are still a lot of undefined references:

./libparser.a(parser.o): In function `sling::HandleSpace::~HandleSpace()':
parser.cc:(.text._ZN5sling11HandleSpaceD2Ev[_ZN5sling11HandleSpaceD5Ev]+0x14): undefined reference to `sling::External::~External()'
./libparser.a(parser.o): In function `sling::HandleSpace::~HandleSpace()':
parser.cc:(.text._ZN5sling11HandleSpaceD0Ev[_ZN5sling11HandleSpaceD5Ev]+0x14): undefined reference to `sling::External::~External()'
./libparser.a(parser.o): In function `sling::Handles::~Handles()':
parser.cc:(.text._ZN5sling7HandlesD2Ev[_ZN5sling7HandlesD5Ev]+0x14): undefined reference to `sling::External::~External()'
./libparser.a(parser.o): In function `sling::Handles::~Handles()':
parser.cc:(.text._ZN5sling7HandlesD0Ev[_ZN5sling7HandlesD5Ev]+0x14): undefined reference to `sling::External::~External()'
./libparser.a(parser.o): In function `sling::nlp::Parser::EnableGPU()':
parser.cc:(.text._ZN5sling3nlp6Parser9EnableGPUEv+0xd): undefined reference to `sling::myelin::CUDA::Supported()'
parser.cc:(.text._ZN5sling3nlp6Parser9EnableGPUEv+0x43): undefined reference to `sling::myelin::CUDARuntime::Connect(int)'
./libparser.a(parser.o): In function `sling::nlp::ParserInstance::ParserInstance(sling::nlp::Parser const*, sling::nlp::Document*, int, int)':

Pretrained Model

Hello, this is more of a question than an issue.

I downloaded the pre-trained model and I have been testing it.
I was wondering if this pre-trained model is the one you evaluate in the paper,
or if it is a toy model example.

Thank you very much,
Kind Regards!

No kernel supports rl_lstm/MatMul_7 of type MatMul

I've extracted some of the steps for running train.sh and parse into a Dockerfile:

https://github.com/RobotsAndPencils/sling-docker/blob/master/Dockerfile

but have run into:

I1117 23:52:39.514359     1 parse.cc:131] Load parser from /tmp/sempar-conll/sempar.flow
E1117 23:52:39.529886     1 compute.cc:979] No kernel supports rl_lstm/MatMul_7 of type MatMul
F1117 23:52:39.529938     1 parser.cc:67] Check failed: network_.Compile(flow, library_) 

(I'm not clear on whether it's necessary to install the kernels, or perhaps specific GPU support is required. Apologies in advance, since I'm not yet familiar with several of the packages used.)

build error in mac os10.12.5

(py2.7) ericliudeMacBook-Pro:sling ericliu$ bazel build -c opt nlp/parser nlp/parser/tools:all
INFO: Found 4 targets...
ERROR: /Users/ericliu/dlnlp/sling/frame/BUILD:30:1: C++ compilation of rule '//frame:object' failed (Exit 1).
In file included from frame/object.cc:15:
./frame/object.h:19:10: fatal error: 'hash_map' file not found
#include <hash_map>
^
1 error generated.
INFO: Elapsed time: 1.624s, Critical Path: 1.28s
(py2.7) ericliudeMacBook-Pro:sling ericliu$

Segmentation Fault while training my own dataset

Hi,

I've been experimenting with SLING to train on my own dataset, and this error keeps coming up when running the training script train.sh. It seems similar to #115, but in my case I managed to train on the conll dataset successfully.

I'm using:

  • ubuntu 16.04.3
  • tensorflow 1.4.0
  • python 2.7.12
  • gcc 5.4.0

This is the error log from stdout:

INFO:tensorflow:Determining the training schedule...
INFO:tensorflow:Training schedule defined!
INFO:tensorflow:Starting training...
2018-03-01 07:16:16.136817: I third_party/syntaxnet/dragnn/core/ops/dragnn_op_kernels.cc:77] Creating new ComputeSessionPool in container handle: shared
2018-03-01 07:16:16.137167: I sling/nlp/parser/trainer/sempar-component.cc:58] lr_lstm: loaded 0 words
2018-03-01 07:16:16.137282: I sling/nlp/parser/trainer/sempar-component.cc:59] lr_lstm: loaded 0 prefixes
2018-03-01 07:16:16.137369: I sling/nlp/parser/trainer/sempar-component.cc:60] lr_lstm: loaded 0 suffixes
2018-03-01 07:16:16.137565: I sling/nlp/parser/trainer/sempar-component.cc:58] rl_lstm: loaded 0 words
2018-03-01 07:16:16.137658: I sling/nlp/parser/trainer/sempar-component.cc:59] rl_lstm: loaded 0 prefixes
2018-03-01 07:16:16.137741: I sling/nlp/parser/trainer/sempar-component.cc:60] rl_lstm: loaded 0 suffixes
2018-03-01 07:16:16.137913: I sling/nlp/parser/trainer/sempar-component.cc:58] ff: loaded 0 words
2018-03-01 07:16:16.138005: I sling/nlp/parser/trainer/sempar-component.cc:59] ff: loaded 0 prefixes
2018-03-01 07:16:16.138087: I sling/nlp/parser/trainer/sempar-component.cc:60] ff: loaded 0 suffixes
./sling/nlp/parser/tools/train.sh: line 243: 16056 Segmentation fault      (core dumped) python sling/nlp/parser/tools/train.py --master_spec="${OUTPUT_FOLDER}/master_spec" --hyperparams="${HYPERPARAMS}" --output_folder=${OUTPUT_FOLDER} --flow=${OUTPUT_FOLDER}/sempar.flow --commons=${COMMONS} --train_corpus=${TRAIN_FILEPATTERN} --dev_corpus=${DEV_FILEPATTERN} --batch_size=${BATCH_SIZE} --report_every=${REPORT_EVERY} --train_steps=${TRAIN_STEPS} --logtostderr

I've also attached the following inside files.zip, in the hope that you can replicate and possibly troubleshoot the issue.

files.zip

  • commons store : commons
  • training dataset: train.rec
  • test dataset: test.rec

Thanks for the help!

train.sh: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev

I've gone through the README step-by-step about 4 times now and keep getting the following error upon running ./nlp/parser/tools/train.sh:

Traceback (most recent call last):
  File "nlp/parser/tools/train.py", line 28, in <module>
    from convert import convert_model
  File "/home/brandon/Documents/Forge/resources/sling/nlp/parser/tools/convert.py", line 30, in <module>
    tf.load_op_library("bazel-bin/nlp/parser/trainer/sempar.so")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
    None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: bazel-bin/nlp/parser/trainer/sempar.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev

This seems related to pywrap. I've confirmed that the file /usr/local/lib/python2.7/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so exists and I installed everything exactly as specified by the README (same .whl tensorflow pip install and all). Everything seems to exist in the correct locations and with the specified versions, but I get the error nonetheless. The error is unambiguously due to the tf.load_op_library call.

Not sure how helpful this is, but this tensorflow issue seems relevant. Thanks for any help!

For reference, the entire command + output:

./nlp/parser/tools/train.sh 
Writing command to /home/brandon/Documents/Forge/resources/sempar_ontonotes/out/command
INFO: Found 1 target...
Target //nlp/parser/trainer:generate-master-spec up-to-date:
  bazel-bin/nlp/parser/trainer/generate-master-spec
INFO: Elapsed time: 0.376s, Critical Path: 0.15s
I1109 14:26:46.558984 21875 generate-master-spec.cc:145] 1 documents processed.
I1109 14:26:46.914839 21875 generate-master-spec.cc:145] 10001 documents processed.
I1109 14:26:47.060022 21875 generate-master-spec.cc:148] Processed 14041 documents.
I1109 14:26:47.061583 21875 generate-master-spec.cc:156] Wrote action table to /home/brandon/Documents/Forge/resources/sempar_ontonotes/out/table, /home/brandon/Documents/Forge/resources/sempar_ontonotes/out/table.summary, /home/brandon/Documents/Forge/resources/sempar_ontonotes/out/table.unknown_symbols
I1109 14:26:47.062510 21875 generate-master-spec.cc:286] 1 documents processsed while building lexicons
I1109 14:26:47.286675 21875 generate-master-spec.cc:286] 10001 documents processsed while building lexicons
I1109 14:26:47.379338 21875 generate-master-spec.cc:308] 14041 documents processsed while building lexicon
I1109 14:26:47.381319 21875 generate-master-spec.cc:413] No pretrained word embeddings specified
I1109 14:26:47.381635 21875 generate-master-spec.cc:420] Wrote master spec to /home/brandon/Documents/Forge/resources/sempar_ontonotes/out/master_spec
INFO: Found 1 target...
Target //nlp/parser/tools:evaluate-frames up-to-date:
  bazel-bin/nlp/parser/tools/evaluate-frames
INFO: Elapsed time: 0.163s, Critical Path: 0.00s
INFO: Found 1 target...
Target //nlp/parser/trainer:sempar.so up-to-date:
  bazel-bin/nlp/parser/trainer/sempar.so
INFO: Elapsed time: 0.241s, Critical Path: 0.00s
Traceback (most recent call last):
  File "nlp/parser/tools/train.py", line 28, in <module>
    from convert import convert_model
  File "/home/brandon/Documents/Forge/resources/sling/nlp/parser/tools/convert.py", line 30, in <module>
    tf.load_op_library("bazel-bin/nlp/parser/trainer/sempar.so")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
    None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: bazel-bin/nlp/parser/trainer/sempar.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev
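One hedged diagnostic (a guess at the cause, not a confirmed fix): demangling the missing symbol shows an [abi:cxx11] tag, which usually points at a libstdc++ dual-ABI mismatch between the prebuilt TensorFlow wheel and the locally compiled sempar.so:

```shell
# Demangle the missing symbol; the [abi:cxx11] tag marks the new libstdc++ ABI.
echo '_ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev' | c++filt
# -> tensorflow::internal::CheckOpMessageBuilder::NewString[abi:cxx11]()
```

If that is the cause, rebuilding so that both sides agree on -D_GLIBCXX_USE_CXX11_ABI (a common workaround for this class of error) may be worth trying.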

How to create/use slot ids from python API

Hi,

I was wondering whether it is possible to assign or use slot ids when running the pretrained 'sempar.flow' on a text example with the Python API. If I understand correctly, the '#' ids are temporary, which means I cannot use them to reliably reference the mentions connected with verbs.

Thanks in advance for your time

installation error

Compiling with 'bazel build -c opt sling/nlp/parser sling/nlp/parser/tools:all', I got the following error:

INFO: Analysed 4 targets (25 packages loaded).
INFO: Found 4 targets...
ERROR: /Users/andrea/Desktop/AItech/projects/querlo/test_libraries/sling/sling-master/sling/myelin/kernel/BUILD:73:1: C++ compilation of rule '//sling/myelin/kernel:tensorflow' failed (Exit 1)
In file included from sling/myelin/kernel/tensorflow.cc:15:
In file included from ./sling/myelin/kernel/tensorflow.h:18:
In file included from ./sling/myelin/compute.h:23:
./sling/myelin/flow.h:101:1: error: redefinition of 'Traits'
TYPE_TRAIT(int64, DT_INT64);
^
./sling/myelin/flow.h:86:39: note: expanded from macro 'TYPE_TRAIT'
template<> inline const TypeTraits &Traits<type>() { \
^
./sling/myelin/flow.h:100:1: note: previous definition is here
TYPE_TRAIT(int64_t, DT_INT64);
^
./sling/myelin/flow.h:86:39: note: expanded from macro 'TYPE_TRAIT'
template<> inline const TypeTraits &Traits<type>() { \
^
./sling/myelin/flow.h:101:1: error: redefinition of 'Traits'
TYPE_TRAIT(int64, DT_INT64);
^
./sling/myelin/flow.h:89:39: note: expanded from macro 'TYPE_TRAIT'
template<> inline const TypeTraits &Traits<type *>() { \
^
./sling/myelin/flow.h:100:1: note: previous definition is here
TYPE_TRAIT(int64_t, DT_INT64);
^
./sling/myelin/flow.h:89:39: note: expanded from macro 'TYPE_TRAIT'
template<> inline const TypeTraits &Traits<type *>() { \
^
2 errors generated.
INFO: Elapsed time: 25,780s, Critical Path: 5,81s
INFO: 2 processes, local.
FAILED: Build did NOT complete successfully
