guillaumegenthial / sequence_tagging

Named Entity Recognition (LSTM + CRF) - Tensorflow

Home Page: https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html

License: Apache License 2.0

Python 99.53% Makefile 0.47%

named-entity-recognition crf tensorflow bi-lstm characters-embeddings glove ner conditional-random-fields state-of-art

sequence_tagging's Introduction

Named Entity Recognition with Tensorflow

This repo implements a NER model using Tensorflow (LSTM + CRF + chars embeddings).

A better implementation is available here, using tf.data and tf.estimator, and achieves an F1 of 91.21

State-of-the-art performance (F1 score between 90 and 91).

Check the blog post

Task

Given a sentence, assign a tag to each word. A classic application is Named Entity Recognition (NER). Here is an example:

John   lives in New   York
B-PER  O     O  B-LOC I-LOC
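
To make the tagging scheme concrete, here is a minimal sketch of how B-/I-/O tags like the ones above can be grouped back into entities. The function name get_entities is hypothetical and is not part of this repo.

```python
def get_entities(words, tags):
    """Group IOB-tagged words into (entity_type, text) pairs."""
    entities = []
    current_type, current_words = None, []
    for word, tag in zip(words, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_new:
            # A new entity begins: flush the one that was open, if any.
            if current_words:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = tag[2:], [word]
        elif tag.startswith("I-"):
            # Continuation of the entity that is currently open.
            current_words.append(word)
        else:
            # "O" closes any open entity.
            if current_words:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
    if current_words:
        entities.append((current_type, " ".join(current_words)))
    return entities


words = "John lives in New York".split()
tags = ["B-PER", "O", "O", "B-LOC", "I-LOC"]
print(get_entities(words, tags))  # [('PER', 'John'), ('LOC', 'New York')]
```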

Model

Similar to Lample et al. and Ma and Hovy.

concatenate final states of a bi-lstm on character embeddings to get a character-based representation of each word
concatenate this representation to a standard word vector representation (GloVe here)
run a bi-lstm on each sentence to extract contextual representation of each word
decode with a linear chain CRF
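
The last step, decoding with a linear-chain CRF, can be sketched with Viterbi decoding in NumPy. This is an illustration under assumptions, not the repo's code: scores stands for the per-word tag scores produced by the bi-LSTM, transitions[i, j] for the learned score of moving from tag i to tag j, and there are no special start or end transitions.

```python
import numpy as np


def viterbi_decode(scores, transitions):
    """Return the highest-scoring tag sequence and its score.

    scores:      [nsteps, ntags] per-word tag scores (assumed input)
    transitions: [ntags, ntags], transitions[i, j] = score of tag i -> tag j
    """
    nsteps, ntags = scores.shape
    best = scores[0].copy()                      # best score ending in each tag
    backptr = np.zeros((nsteps, ntags), dtype=int)
    for t in range(1, nsteps):
        cand = best[:, None] + transitions       # cand[i, j]: come from i, go to j
        backptr[t] = cand.argmax(axis=0)         # best previous tag for each j
        best = cand.max(axis=0) + scores[t]
    # Follow the back-pointers from the best final tag.
    path = [int(best.argmax())]
    for t in range(nsteps - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1], float(best.max())


rng = np.random.default_rng(0)
path, score = viterbi_decode(rng.normal(size=(5, 3)), rng.normal(size=(3, 3)))
print(path, score)
```

The dynamic program costs O(nsteps × ntags²), compared with ntags^nsteps for scoring every possible tag sequence.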

Getting started

Download the GloVe vectors with

make glove

Alternatively, you can download them manually here and update the glove_filename entry in model/config.py. You can also choose not to load pretrained word vectors by setting the use_pretrained entry to False in model/config.py.

Build the training data, train and evaluate the model with

make run

Details

Here is the breakdown of the commands executed in make run:

[DO NOT MISS THIS STEP] Build vocab from the data and extract trimmed glove vectors according to the config in model/config.py.

python build_data.py
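
As a rough illustration of what "trimmed" vectors means here, the sketch below keeps only the GloVe rows for words that appear in the vocabulary and stores them as a compressed .npz file. The function names, the embeddings key and the vocab dictionary are assumptions made for this example; build_data.py may differ in its details.

```python
import numpy as np


def export_trimmed_vectors(vocab, glove_path, trimmed_path, dim):
    """Save an embedding matrix restricted to the words in `vocab`.

    vocab: dict mapping word -> row index (hypothetical structure)
    Words absent from the GloVe file keep an all-zero row.
    """
    embeddings = np.zeros((len(vocab), dim), dtype=np.float32)
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if word in vocab and len(values) == dim:
                embeddings[vocab[word]] = [float(v) for v in values]
    np.savez_compressed(trimmed_path, embeddings=embeddings)


def load_trimmed_vectors(trimmed_path):
    """Load the matrix written by export_trimmed_vectors."""
    with np.load(trimmed_path) as data:
        return data["embeddings"]
```

The full GloVe file holds 400,000 words, so loading only the rows the dataset needs keeps start-up time and memory use small.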

Train the model with

python train.py

Evaluate and interact with the model with

python evaluate.py

Data iterators and utils are in model/data_utils.py and the model with training/test procedures is in model/ner_model.py

Training time on an NVIDIA Tesla K80 is 110 seconds per epoch on the CoNLL train set, using character embeddings and the CRF.

Training Data

The training data must be in the following format (identical to the CoNLL2003 dataset).

A default test file is provided to help you get started.

John B-PER
lives O
in O
New B-LOC
York I-LOC
. O

This O
is O
another O
sentence O
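
A minimal reader for this format might look like the sketch below. It is not the repo's iterator in model/data_utils.py. The name read_conll is hypothetical, and taking the first and last whitespace-separated fields is an assumption that lets the same reader accept original CoNLL-2003 files, which carry extra POS and chunk columns between the word and the NER tag.

```python
def read_conll(lines):
    """Yield (words, tags) for each sentence in an iterable of lines.

    Sentences are separated by blank lines; each other line is
    "word ... tag", where only the first and last fields are used.
    """
    words, tags = [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("-DOCSTART-"):
            if words:
                yield words, tags
                words, tags = [], []
            continue
        fields = line.split()
        if len(fields) < 2:
            raise ValueError("expected 'word tag', got: %r" % line)
        words.append(fields[0])
        tags.append(fields[-1])
    if words:
        yield words, tags


sample = """John B-PER
lives O
in O
New B-LOC
York I-LOC
. O

This O
is O
another O
sentence O
""".splitlines()

for words, tags in read_conll(sample):
    print(list(zip(words, tags)))
```

Raising on a line with a single field surfaces a missing tag at load time rather than as an unpacking error deep inside training.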

Once you have produced your data files, change the parameters in model/config.py, for example:

# dataset
dev_filename = "data/coNLL/eng/eng.testa.iob"
test_filename = "data/coNLL/eng/eng.testb.iob"
train_filename = "data/coNLL/eng/eng.train.iob"

License

This project is licensed under the terms of the Apache 2.0 license (as are TensorFlow and its derivatives). If used for research, a citation would be appreciated.


sequence_tagging's Issues

About dataset and pre-trained word embedding

Would you mind explaining what "trimmed_filename" is used for, and how I can create it (or is it created from the original file, glove_filename)?
trimmed_filename = "data/glove.6B.{}d.trimmed.npz".format(dim)
Could you give me a link to the dataset? I downloaded the dataset from CoNLL 2003 and ran build_data.py, then got an error:
File "C:/Users/dragon/sequence_tagging-master/build_data.py", line 49, in <module>
    build_data(config)
File "C:/Users/dragon/sequence_tagging-master/build_data.py", line 25, in build_data
    vocab_words, vocab_tags = get_vocabs([train, dev, test])
File "C:\Users\dragon\sequence_tagging-master\data_utils.py", line 87, in get_vocabs
    for words, tags in dataset:
File "C:\Users\dragon\sequence_tagging-master\data_utils.py", line 55, in __iter__
    word, tag = line.split(' ')
ValueError: too many values to unpack (expected 2)
Thank you in advance!

Providing sample training data

No sample / test data is provided.
It would be useful to have it as a sample / test suite.

Regards,
Matteo

ValueError: not enough values to unpack (expected 2, got 0)

Does anyone have an idea why I am getting this error during training?

  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Epoch 1 out of 15

  1/703 [..............................] - ETA: 263s - train loss: 21.2996

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-4f283f836ad9> in <module>()
     39 
     40     # load, train, evaluate and interact with model
---> 41     main(config)

<ipython-input-2-4f283f836ad9> in main(config)
     29 
     30     # train, evaluate and interact
---> 31     model.train(train, dev, vocab_tags)
     32     model.evaluate(test, vocab_tags)
     33     model.interactive_shell(vocab_tags, processing_word)

~\ML\model.py in train(self, train, dev, tags)
    329                 self.logger.info("Epoch {:} out of {:}".format(epoch + 1, self.config.nepochs))
    330 
--> 331                 acc, f1 = self.run_epoch(sess, train, dev, tags, epoch)
    332 
    333                 # decay learning rate

~\ML\model.py in run_epoch(self, sess, train, dev, tags, epoch)
    259         prog = Progbar(target=nbatches)
    260         for i, (words, labels) in enumerate(minibatches(train, self.config.batch_size)):
--> 261             fd, _ = self.get_feed_dict(words, labels, self.config.lr, self.config.dropout)
    262 
    263             _, train_loss, summary = sess.run([self.train_op, self.loss, self.merged], feed_dict=fd)

~\ML\model.py in get_feed_dict(self, words, labels, lr, dropout)
     45     def get_feed_dict (self,words,labels=None, lr= None, dropout= None):
     46         if self.config.chars:
---> 47             char_ids, word_ids = zip(*words)
     48             word_ids, sequence_lengths = pad_sequences(word_ids, 0)
     49             char_ids, word_lengths = pad_sequences(char_ids, pad_tok=0, nlevels=2)
ValueError: not enough values to unpack (expected 2, got 0)

Thanks
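
One way to read the message above: zip(*words) yielded nothing, which in plain Python happens when the batch is empty or when every per-sentence zip iterator has already been consumed. The sketch below reproduces that and shows a guard that fails earlier with a clearer message. unpack_batch is a hypothetical helper, and whether this is the actual cause in the report is an assumption.

```python
def unpack_batch(words):
    """Split a batch into (char_ids, word_ids), validating it first.

    words: list of sentences, each an iterable yielding exactly two items,
           the character ids and the word ids (assumed layout).
    """
    # Materialise each sentence so that a consumed zip iterator shows up
    # as an empty tuple instead of silently producing nothing.
    words = [tuple(sentence) for sentence in words]
    if not words or any(len(sentence) != 2 for sentence in words):
        raise ValueError("empty batch or malformed sentence: %r" % (words,))
    char_ids, word_ids = zip(*words)
    return char_ids, word_ids


# A sentence of two words: ([char ids], word id) pairs, transposed with zip.
sentence = zip(*[([1, 2], 10), ([3], 11)])
print(unpack_batch([sentence]))

# The bare unpacking fails exactly as in the traceback when the batch is empty.
try:
    char_ids, word_ids = zip(*[])
except ValueError as err:
    print(err)
```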

Indices Error in CRF

Hi!
I don't know if you can help me, but I'll ask.
I created my own pipeline to generate data in the format your network needs, without saving it to a file. Right now I'm not using any word embeddings, and the network is training and the loss is falling, until at some point during training an error comes up:

InvalidArgumentError (see above for traceback): indices[2,5] = 1716 is not in [0, 1681)
[[Node: crf/Gather_1 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](crf/Reshape_3, crf/add_2)]]

This happens at tf.contrib.crf.crf_log_likelihood.

Do you have any idea what this could be?
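
A possible reading of the numbers in that message, offered as an assumption rather than a diagnosis: 1681 = 41 × 41, which would match a flattened 41 × 41 transition matrix, and if transition scores are gathered at index previous_tag * ntags + next_tag, then 1716 decodes to a previous tag id of 41, one past the valid range 0..40. The helpers below are hypothetical and only show how to check labels for such out-of-range ids before feeding them to the CRF.

```python
def locate_bad_transition(flat_index, ntags):
    """Decode a flattened transition index into (previous_tag, next_tag).

    Assumes the layout flat_index = previous_tag * ntags + next_tag.
    """
    return divmod(flat_index, ntags)


def check_labels(label_batch, ntags):
    """Return (sentence, position, tag) for every tag id outside [0, ntags)."""
    return [
        (i, j, tag)
        for i, sentence in enumerate(label_batch)
        for j, tag in enumerate(sentence)
        if not 0 <= tag < ntags
    ]


print(locate_bad_transition(1716, 41))           # (41, 35)
print(check_labels([[0, 40], [41, 3]], ntags=41))  # [(1, 0, 41)]
```

An off-by-one between the tag vocabulary size and the largest label id, or a padding value that is not a valid tag, would both produce this pattern.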

Initial value for word_embeddings

What would be a good value for initializer here:

sequence_tagging/model/ner_model.py

Lines 106 to 109 in 0048d60

 _word_embeddings = tf.get_variable(
         name="_word_embeddings",
         dtype=tf.float32,
         shape=[self.config.nwords, self.config.dim_word])

I have seen a tf.random.uniform(...) being used: http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow , but not much more than that.

Tags confidence

Is it possible to get the tag confidence or "likelihood"?
For example, for the sentence "I love Paris":

word	tag	confidence
I	O	99%
love	O	75%
Paris	B-PLACE	89%
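
One common notion of per-word confidence in a linear-chain CRF is the marginal probability of each tag, computed with the forward-backward algorithm. The NumPy sketch below illustrates that idea. It is not a feature of this repo, and it assumes access to the per-word tag scores and the transition matrix, with no start or end transitions.

```python
import numpy as np


def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = x.max(axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.exp(x - m).sum(axis=axis))


def crf_marginals(scores, transitions):
    """Return P(tag at step t = k | sentence) as an [nsteps, ntags] array.

    scores:      [nsteps, ntags] per-word tag scores (assumed input)
    transitions: [ntags, ntags], transitions[i, j] = score of tag i -> tag j
    """
    nsteps, ntags = scores.shape
    alpha = np.zeros((nsteps, ntags))  # log-score of all prefixes ending in a tag
    beta = np.zeros((nsteps, ntags))   # log-score of all suffixes after a tag
    alpha[0] = scores[0]
    for t in range(1, nsteps):
        alpha[t] = scores[t] + logsumexp(alpha[t - 1][:, None] + transitions, axis=0)
    for t in range(nsteps - 2, -1, -1):
        beta[t] = logsumexp(transitions + (scores[t + 1] + beta[t + 1])[None, :], axis=1)
    log_z = logsumexp(alpha[-1], axis=0)  # log partition function
    return np.exp(alpha + beta - log_z)


rng = np.random.default_rng(1)
marginals = crf_marginals(rng.normal(size=(3, 4)), rng.normal(size=(4, 4)))
print(marginals.round(3))  # each row sums to 1
```

The confidence of the predicted tag for a word is then the entry of its row at the predicted tag's index.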

Resource Exhausted when reproducing CoNLL2003

Thanks for this wonderful work. When I try to reproduce the result on the CoNLL2003 dataset on a TensorFlow GPU instance, I get this error with char embeddings + CRF layer:

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[]
[[Node: train_step/beta1_power/Assign = Assign[T=DT_FLOAT, _class=["loc:@chars/_char_embeddings"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](train_step/beta1_power, train_step/beta1_power/initial_value)]]

I've set batch size = 8, Tensorflow version '1.2.1'. What's wrong with that?

How could we control Random State?

Hi, Guillaume,

I am very confused about why the training result is quite random since it seems that there is no shuffling of the data in the code.

I tried to set tf.set_random_seed(42) and np.random.seed(42) at the beginning of model.py, but it does not work.

Thanks very much,
Bill

Unimplemented: TensorArray has size zero

When I produced my data files as txt files following the format [Word] [Label] and stored them in the right location, I got an error indicating that

Unimplemented: TensorArray has size zero, but element shape is not fully defined. Currently only static shapes are supported when packing zero-size TensorArrays.

Does this problem happen to anyone?

Truncate a long sentence in backpropagation

Hi, I have some long sentences of several hundred words, so I think backprop would be slow and inefficient if I used each whole sentence as one training sample. If I truncate a sentence into fixed-length parts of, say, 20 words, I can pass the final state of one part to the next part of the same sentence as its initial state in the forward pass. But here is the problem: how can I calculate the probability of the whole sentence in the CRF layer?

requirements.txt

Nice work. Can you post which packages you are using and what versions? Especially TensorFlow.
Thanks,
Jay

How much time it takes to train on CPU using default?

Hi,

Please let me know how much time it takes to train on CPU with default parameters.

Thanks,
Mahesh

Transfer Learning

I was wondering which layer is more appropriate to transfer learning, or even if it is possible in this model.

Any thoughts about that?

Which order should I follow?

Hi @guillaumegenthial! Really nice work, thanks for sharing.
I am trying to run your code by myself and understand it. My question is: build_data.py uses the config file, but in the config package you say that build_data should be run first. I am a little confused about which order I should follow.
My next question is how you obtain the data in the .iob format.
P.S.: Sorry, but I am new to Python.
Thanks in advance

Purpose and logic behind `nsteps`

sequence_tagging/model/ner_model.py

Line 179 in 0e4e6a7

nsteps = tf.shape(output)[1]

I don't think I completely understand the purpose and logic behind extracting dimension 1 (the second dimension) of the output. Help would be appreciated.
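
For context, a frequent reason to keep the size of dimension 1 is to flatten a 3-D tensor for a matrix multiplication and then restore the time axis afterwards. The NumPy sketch below shows that pattern under the assumption that output has shape [batch, nsteps, 2 * hidden] and W has shape [2 * hidden, ntags]. The function name is hypothetical and the surrounding repo code is not reproduced here.

```python
import numpy as np


def project_to_tag_scores(output, W, b):
    """Map bi-LSTM outputs to per-word tag scores.

    output: [batch, nsteps, 2 * hidden] (assumed layout)
    W:      [2 * hidden, ntags]
    b:      [ntags]
    """
    nsteps = output.shape[1]                     # number of words per padded sentence
    flat = output.reshape(-1, output.shape[-1])  # [batch * nsteps, 2 * hidden]
    pred = flat @ W + b                          # [batch * nsteps, ntags]
    return pred.reshape(-1, nsteps, W.shape[1])  # [batch, nsteps, ntags]


rng = np.random.default_rng(0)
output = rng.normal(size=(2, 5, 6))
W, b = rng.normal(size=(6, 4)), np.zeros(4)
print(project_to_tag_scores(output, W, b).shape)  # (2, 5, 4)
```

Because sentences in different batches are padded to different lengths, nsteps is only known at run time, which is why it is read from the tensor's dynamic shape rather than hard-coded.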

Logits dimensions

Hey,
I was looking at your model and I noticed that you have the logits coming out of your LSTM set to the number of classes (W.shape = [2*lstm, ntags]). Is there any reason to do so when you are using CRF to induce the tags?

Portuguese NER

Hello, first I'd like to thank you for this implementation!
I'm trying to train a NER model for Portuguese sentences, but the best F1 score I have obtained so far is 73. This is strange to me, since when I train the network on English (with a smaller dataset than the Portuguese one) with the same hyperparameters, I almost immediately reach an F1 score of 90.
Do you have any tips or suggestions about what I might be doing wrong? I find it strange that a bigger dataset yields poorer performance.

UnimplementedError: TensorArray has size zero, but element shape <unknown> is not fully defined. Currently only static shapes are supported when packing zero-size TensorArrays

Hi Guilla,

I tried to run your code on TF 1.2.0 and 1.4.0 but got the following error:

tensorflow.python.framework.errors_impl.UnimplementedError: TensorArray has size zero, but element shape is not fully defined. Currently only static shapes are supported when packing zero-size TensorArrays.
[[Node: train_step/gradients/chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGatherV3 = TensorArrayGatherV3[dtype=DT_FLOAT, element_shape=, _device="/job:localhost/replica:0/task:0/gpu:0"](train_step/gradients/chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGrad/TensorArrayGradV3, chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/range, train_step/gradients/chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGrad/gradient_flow)]]
[[Node: bi-lstm/bidirectional_rnn/fw/fw/stack/_147 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_3699_bi-lstm/bidirectional_rnn/fw/fw/stack", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"]]

Caused by op 'train_step/gradients/chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGatherV3', defined at:
File "train.py", line 26, in
main()
File "train.py", line 12, in main
model.build()
File "C:\Users\miqian\Software\sequence_tagging\model\ner_model.py", line 227, in build
self.config.clip)
File "C:\Users\miqian\Software\sequence_tagging\model\base_model.py", line 58, in add_train_op
self.train_op = optimizer.minimize(loss)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\training\optimizer.py", line 315, in minimize
grad_loss=grad_loss)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\training\optimizer.py", line 386, in compute_gradients
colocate_gradients_with_ops=colocate_gradients_with_ops)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 540, in gradients
grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 346, in _MaybeCompile
return grad_fn() # Exit early
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 540, in
grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\tensor_array_grad.py", line 186, in _TensorArrayScatterGrad
grad = g.gather(indices)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\tensor_array_ops.py", line 360, in gather
element_shape=element_shape)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\gen_data_flow_ops.py", line 1814, in _tensor_array_gather_v3
element_shape=element_shape, name=name)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\framework\ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\framework\ops.py", line 1269, in init
self._traceback = _extract_stack()

...which was originally created as op 'chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3', defined at:
File "train.py", line 26, in
main()
[elided 0 identical lines from previous traceback]
File "train.py", line 12, in main
model.build()
File "C:\Users\miqian\Software\sequence_tagging\model\ner_model.py", line 220, in build
self.add_word_embeddings_op()
File "C:\Users\miqian\Software\sequence_tagging\model\ner_model.py", line 143, in add_word_embeddings_op
sequence_length=word_lengths, dtype=tf.float32)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\rnn.py", line 401, in bidirectional_dynamic_rnn
time_major=time_major, scope=bw_scope)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\rnn.py", line 574, in dynamic_rnn
dtype=dtype)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\rnn.py", line 688, in dynamic_rnn_loop
for ta, input in zip(input_ta, flat_input))
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\rnn.py", line 688, in
for ta, input_ in zip(input_ta, flat_input))
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\util\tf_should_use.py", line 170, in wrapped
return _add_should_use_warning(fn(*args, **kwargs))
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\tensor_array_ops.py", line 413, in unstack
indices=math_ops.range(0, num_elements), value=value, name=name)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\util\tf_should_use.py", line 170, in wrapped
return _add_should_use_warning(fn(*args, **kwargs))
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\tensor_array_ops.py", line 441, in scatter
name=name)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\ops\gen_data_flow_ops.py", line 2062, in _tensor_array_scatter_v3
name=name)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "C:\local\Anaconda3-4.2.0-Windows-x86_64\envs\tensorflow_gpu-1.2.0\lib\site-packages\tensorflow\python\framework\ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)

UnimplementedError (see above for traceback): TensorArray has size zero, but element shape is not fully defined. Currently only static shapes are supported when packing zero-size TensorArrays.
[[Node: train_step/gradients/chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGatherV3 = TensorArrayGatherV3[dtype=DT_FLOAT, element_shape=, _device="/job:localhost/replica:0/task:0/gpu:0"](train_step/gradients/chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGrad/TensorArrayGradV3, chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/range, train_step/gradients/chars/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGrad/gradient_flow)]]
[[Node: bi-lstm/bidirectional_rnn/fw/fw/stack/_147 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_3699_bi-lstm/bidirectional_rnn/fw/fw/stack", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"]]

Config for CoNLL

Which config did you use to get the "state of the art" results on the CoNLL dataset? I mean the number of epochs, dropout and so on.

Thanks in advance

matrix W

In the decoding stage, can you explain how the matrix W is obtained in theory?

Fine-tuning

Hi Guilla, I'm trying to fine-tune a model that I trained previously. After I load the weights I get an error (shape mismatch). I saw in the code that the shapes are [None, None, ...].
Should I reshape? Will I lose information by doing that?
I just need to retrain the last layer; everything else will work as a feature extractor.

I appreciate any help on that.

KeyError: '$UNK$' on python main.py

I got the following error. Any suggestions?

Regards Matteo

Epoch 1 out of 15
Traceback (most recent call last):
File "main.py", line 46, in
main(config)
File "main.py", line 36, in main
model.train(train, dev, vocab_tags)
File "/home/matteo/src/sequence_tagging/model.py", line 354, in train
acc, f1 = self.run_epoch(sess, train, dev, tags, epoch)
File "/home/matteo/src/sequence_tagging/model.py", line 280, in run_epoch
nbatches = (len(train) + self.config.batch_size - 1) // self.config.batch_size
File "/home/matteo/src/sequence_tagging/data_utils.py", line 87, in len
for _ in self:
File "/home/matteo/src/sequence_tagging/data_utils.py", line 76, in iter
tag = self.processing_tag(tag)
File "/home/matteo/src/sequence_tagging/data_utils.py", line 247, in f
word = vocab_words[UNK]
KeyError: '$UNK$'

python3 build_data.py
Building vocab...

done. 22 tokens
Building vocab...
done. 400000 tokens
Writing vocab...
done. 22 tokens
Writing vocab...
done. 7 tokens
Writing vocab...
done. 23 tokens

matteo@debian:~/src/sequence_tagging$ cat data/words.txt
bicocca
$UNK$
eu
hangar
boycott
sui
british
giovanni
.
sulle
ess
german
italian
su
matteo
es
$NUM$
shop
it
to
luca

Add metrics to tensorboard

Hi @guillaumegenthial
I have been trying to re-architect some of this code to add the metrics to TensorBoard, but I am falling short because it needs "fresh" minibatched data to compute the metrics. Do you think it is possible to add this operation to the computation graph, or is it most likely not possible?

sequence_tagging/model/ner_model.py

Line 303 in 0048d60

def run_evaluate(self, test):

It's getting killed but no error

I am trying to run main.py but it gets killed after some time. I tried reducing the GloVe file dimensions, the hidden size, etc., but the result is the same.

Please suggest.

Thanks
Mahesh

The result seems strange in my experiments

Thanks for your code and instructions!

I'm using the code (with no modification) to run some experiments on the CoNLL2003 dataset (English). The F1 scores on testa and testb are about 91% and 87%, which is not consistent with the reported 91% on the test set.

I have tried to optimize the hyperparameters, but the F1 score only reaches 88.8% at most. I'm wondering if it could be due to the environment, such as Python (3.6.4), the TensorFlow version (tensorflow-gpu==1.3.0) or CUDA (8.0 with cuDNN 5.1).

Could you provide your environment for comparison or give some insight into this result?

Thanks

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor

I ran "python build_data.py" on my dataset and it was OK, but when I run "python main.py" to train my model I get an error.
Here is the traceback:

C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\ops\gradients_impl.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2017-07-25 13:01:13.770555: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 13:01:13.780837: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-25 13:01:13.792198: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
Reloading the latest trained model...
INFO:tensorflow:Restoring parameters from /results/crf/model.weights/
Restoring parameters from /results/crf/model.weights/
2017-07-25 13:01:16.256308: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.256371: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.256429: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.256485: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.256525: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.256590: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.340877: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.284399: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.298523: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.312538: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.326592: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.270016: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.355930: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.370020: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.383945: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.397983: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.411831: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.425676: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.439582: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.453446: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.467343: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.481262: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.495121: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.509338: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.523517: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.537511: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.552983: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.566865: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.580697: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.594563: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.608358: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.622190: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.637740: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.651574: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.665625: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.679549: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.693420: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.707397: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

2017-07-25 13:01:16.722452: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\framework\op_kernel.cc:1158] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

Traceback (most recent call last):
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\client\session.py", line 1139, in _do_call
return fn(*args)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\client\session.py", line 1121, in _run_fn
status, run_metadata)
File "C:\Users\New\Anaconda2\envs\python35\lib\contextlib.py", line 66, in __exit__
next(self.gen)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

     [[Node: save/RestoreV2_38 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_38/tensor_names, save/RestoreV2_38/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 38, in <module>
model.train(train, dev, vocab_tags)
File "D:\Mashhadirajab\DeepNN-NER\sequence_tagging-master\sequence_tagging-master\model.py", line 340, in train
saver.restore(sess, self.config.model_output)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\training\saver.py", line 1548, in restore
{self.saver_def.filename_tensor_name: save_path})
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\client\session.py", line 789, in run
run_metadata_ptr)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\client\session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\client\session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

     [[Node: save/RestoreV2_38 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_38/tensor_names, save/RestoreV2_38/shape_and_slices)]]

Caused by op 'save/RestoreV2_38', defined at:
File "main.py", line 38, in <module>
model.train(train, dev, vocab_tags)
File "D:\Mashhadirajab\DeepNN-NER\sequence_tagging-master\sequence_tagging-master\model.py", line 333, in train
saver = tf.train.Saver()
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\training\saver.py", line 1139, in __init__
self.build()
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\training\saver.py", line 1170, in build
restore_sequentially=self._restore_sequentially)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\training\saver.py", line 691, in build
restore_sequentially, reshape)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\training\saver.py", line 407, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\training\saver.py", line 247, in restore_op
[spec.tensor.dtype])[0])
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\ops\gen_io_ops.py", line 640, in restore_v2
dtypes=dtypes, name=name)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\framework\ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\Users\New\Anaconda2\envs\python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1269, in __init__
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to get matching files on /results/crf/model.weights/: Not found: FindFirstFile failed for: /results/crf/model.weights : The system cannot find the path specified.

     [[Node: save/RestoreV2_38 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_38/tensor_names, save/RestoreV2_38/shape_and_slices)]]
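The log above shows the restore step looking for `/results/crf/model.weights/`, a path that starts with a slash, and failing because nothing exists there. A minimal sketch of one way to guard against this, assuming the output directory is meant to be relative to the project folder. The helper names `resolve_model_dir` and `has_saved_weights` are hypothetical and are not part of the repository or of TensorFlow:

```python
import os


def resolve_model_dir(project_root, relative_dir):
    """Join an output directory onto the project root.

    A leading slash or backslash is stripped so that
    "/results/crf/model.weights/" is treated as relative to the project,
    not to the drive or filesystem root.
    """
    cleaned = relative_dir.lstrip("/\\")
    return os.path.normpath(os.path.join(project_root, cleaned))


def has_saved_weights(model_dir):
    """Return True only if the directory exists and has at least one entry."""
    return os.path.isdir(model_dir) and len(os.listdir(model_dir)) > 0


if __name__ == "__main__":
    model_dir = resolve_model_dir(os.getcwd(), "/results/crf/model.weights/")
    if has_saved_weights(model_dir):
        print("found saved weights in", model_dir)
    else:
        print("nothing to restore in", model_dir)
```

Checking before calling `saver.restore` turns the long traceback into a short, readable message; it does not explain why the weights were never written in the first place.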

Model behaves differently on a different machine

Hi!

I used the program and ran all the necessary steps on a machine with access to a beefy GPU. With test results around acc: 97% and f1 score: 91%, I found it to be great. However, to implement further changes for my own needs on a computer without the required GPU, I figured I'd just transfer the model.weights folder along with chars.txt, tags.txt and words.txt from the GPU computer to this one. The first thing I noticed when I ran evaluate.py was an error saying that the lhs shape did not match the rhs shape, which I 'fixed' by running build_data.py (which might or might not be part of the issue).

When running evaluate.py I'm getting the same accuracy and f1 scores as I would running the test.txt as both train and test set.

What's bothering me is that the save / reload works perfectly on the GPU computer but not at all on my computer.

GPU comp runs tensorflow-gpu and python version 3.6.1 while my computer runs 'regular' tensorflow with python version 3.6.5. Could the difference in versions be the source of the problem?

Regards, Lukas

LSTM: number of hidden layers not known

You did not specify the number of LSTM layers. Instead there is hidden_size_char = 100 and hidden_size_lstm = 300. How do we know the number of layers?

How do we train this tagger on another dataset?

And can we use a word2vec-pre-trained word vectors?

Concatenate state WARNING Tensorflow 1.3.0

After upgrading to Tensorflow 1.3.0 I get the warning:

WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7f05daf59f90>: Using a concatenated state is slower and will soon be deprecated.  Use state_is_tuple=True.
<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7f05daf59f90>: Using a concatenated state is slower and will soon be deprecated.  Use state_is_tuple=True.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7f05c4657b50>: Using a concatenated state is slower and will soon be deprecated.  Use state_is_tuple=True.
<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7f05c4657b50>: Using a concatenated state is slower and will soon be deprecated.  Use state_is_tuple=True.
/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients_impl.py:95: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.

Everything works ok, but maybe it will not work in the future.

`sequence_length` in `pad_sequences` for `nlevels=2`

I noticed that sequence_length is not collected properly in the pad_sequences function:
https://github.com/guillaumegenthial/sequence_tagging/blob/master/model/data_utils.py#L335

Shouldn't it be:

_, sequence_length = _pad_sequences(sequence_length, 0,  max_length_sentence)

I am slightly surprised that this hasn't caused any issues so far.
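Which of the two return values is wanted depends on what the caller expects to receive. The following is a minimal pure-Python sketch of two-level padding, written for illustration only: it is not the repository's code, and the names `pad_to_length` and `pad_chars` are hypothetical. It returns both candidates so the difference is visible: the per-word lengths (padded with 0 up to the longest sentence) and the per-sentence word counts.

```python
def pad_to_length(sequences, pad_tok, max_length):
    """Pad or truncate each sequence; return (padded, original_lengths)."""
    padded, lengths = [], []
    for seq in sequences:
        seq = list(seq)
        padded.append(seq[:max_length] + [pad_tok] * max(max_length - len(seq), 0))
        lengths.append(min(len(seq), max_length))
    return padded, lengths


def pad_chars(sentences, pad_tok=0):
    """Pad a batch of sentences, each a list of words, each a list of char ids."""
    max_word = max(len(word) for sent in sentences for word in sent)
    max_sent = max(len(sent) for sent in sentences)

    padded_words, word_lengths = [], []
    for sent in sentences:
        padded_sent, lengths = pad_to_length(sent, pad_tok, max_word)
        padded_words.append(padded_sent)
        word_lengths.append(lengths)

    # Pad at the sentence level with words made only of pad tokens.
    padded, _ = pad_to_length(padded_words, [pad_tok] * max_word, max_sent)
    # First value: per-word lengths, padded with 0 (batch x max_sent).
    # Second value: number of words in each sentence (batch).
    word_lengths, sentence_lengths = pad_to_length(word_lengths, 0, max_sent)
    return padded, word_lengths, sentence_lengths


if __name__ == "__main__":
    batch = [[[1, 2, 3], [4]], [[5, 6]]]
    padded, word_lengths, sentence_lengths = pad_chars(batch)
    print(word_lengths)
    print(sentence_lengths)
```

A character-level RNN that treats every word as one sequence would need the first value; a sentence-level RNN would need the second.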

Op for prediction in CRF case

Does it make sense to create an Op for the prediction part of the CRF model using something like tf.py_func? Would be nice to monitor accuracy during training. Or would that make training insanely slow?
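For reference, the prediction step of a linear-chain CRF is Viterbi decoding over the unary scores and the transition matrix. Below is a minimal pure-Python sketch of that step for a single sentence, assuming scores are given as nested lists; it is an illustration, not the repository's implementation, and it says nothing about how fast a `tf.py_func` wrapper around it would be. TensorFlow 1.x also provides a NumPy-based `tf.contrib.crf.viterbi_decode`.

```python
def viterbi_decode(scores, transitions):
    """Return (best_tag_sequence, best_score) for one sentence.

    scores[t][j]: unary score of tag j at position t.
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(scores[0])
    best = list(scores[0])  # best score of any path ending in each tag
    backpointers = []

    for t in range(1, len(scores)):
        new_best, pointers = [], []
        for j in range(n_tags):
            candidates = [best[i] + transitions[i][j] for i in range(n_tags)]
            i_star = max(range(n_tags), key=lambda i: candidates[i])
            new_best.append(candidates[i_star] + scores[t][j])
            pointers.append(i_star)
        best = new_best
        backpointers.append(pointers)

    # Follow the back-pointers from the best final tag to the first position.
    last = max(range(n_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]
```

With a transition matrix that penalises tag changes, the decoded path can differ from the per-position argmax, which is the reason accuracy cannot simply be read off the logits in the CRF case.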

GPU memory usage: resourceExhaustedError

Hello Guillaumegenthial,

Thank you for sharing your code and post. I have tried your program, training the model on the 'test.txt' file that is present by default, and I see that my GPU memory is completely used. I also tried GloVe embeddings of lower dimension (50) and observed the same.

My setup:
Python : 3.5
Cuda : 9.0
Tensorflow : 1.8
GPU : Tesla K-80 ( it has 11.5 GB memory)
OS : redhat

I tried with cuda-8.0 and Tensorflow : 1.2.0 as well and I observe the same.

Also, when I tried to run on my custom input, which is around 100MB in size (with 2 entity labels, apart from the non-entity label), I got the following error:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[108092,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [[Node: chars/bidirectional_rnn/fw/fw/while/lstm_cell/split = Split[T=DT_FLOAT, num_split=4, _device="/job:localhost/replica:0/task:0/device:GPU:0"](train_step/gradients/Add_3/y, chars/bidirectional_rnn/fw/fw/while/lstm_cell/BiasAdd)]]

Has anyone faced this error before?
Also, do you have any insight into why it takes around 10GB of GPU memory while training on a very small dataset?

regards,
goutham

padding 0s to the given data

0 is chosen to pad to the end of training sentences as well as training labels in the code.

word_ids, sequence_lengths = pad_sequences(words, 0)
labels, _ = pad_sequences(labels, 0)

However, I suspect that "0" is a token id for a specific word or label rather than a general "end of speech" token.

EDIT:
I now realize that either of the following

# crf
log_likelihood, self.transition_params = tf.contrib.crf.crf_log_likelihood(
    self.logits, self.labels, self.sequence_lengths)
self.loss = tf.reduce_mean(-log_likelihood)

# softmax
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=self.logits, labels=self.labels)
mask = tf.sequence_mask(self.sequence_lengths)
losses = tf.boolean_mask(losses, mask)
self.loss = tf.reduce_mean(losses)

automatically discards the trailing padded positions, so it does not matter which token is padded to the end of the sequence.
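The effect of the mask in the softmax case can be checked with a small pure-Python sketch. It is an illustration under simplifying assumptions (nested lists instead of tensors, a hypothetical helper name `masked_mean_loss`), not the repository's code: positions beyond each sentence's length never enter the average, so the label stored there cannot change the loss.

```python
import math


def masked_mean_loss(logits, labels, sequence_lengths):
    """Mean softmax cross-entropy over real tokens only.

    logits[b][t] is a list of per-tag scores, labels[b][t] a tag id.
    Positions t >= sequence_lengths[b] are padding and are skipped.
    """
    total, count = 0.0, 0
    for sent_logits, sent_labels, length in zip(logits, labels, sequence_lengths):
        for t in range(length):
            row = sent_logits[t]
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[sent_labels[t]]
            count += 1
    return total / count


if __name__ == "__main__":
    logits = [[[2.0, 0.0], [0.0, 2.0]], [[1.0, 1.0], [5.0, -5.0]]]
    lengths = [2, 1]  # the second sentence has one real token
    padded_with_0 = [[0, 1], [1, 0]]
    padded_with_1 = [[0, 1], [1, 1]]
    print(masked_mean_loss(logits, padded_with_0, lengths))
    print(masked_mean_loss(logits, padded_with_1, lengths))
```

Both calls print the same value, because only the label at the padded position differs between them.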

Where is the data set?

The test data sets are missing. Could you provide a small sample of the data, together with the vocabulary, so that the code can run normally?

ValueError: Variable chars/_char_embeddings already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at ner_model

line 126, in add_word_embeddings_op
shape=[self.config.nchars, self.config.dim_char])

I've tried the following, unsuccessfully:
with tf.variable_scope("chars") as scope:
    scope.reuse_variables()

F1 score is acc 0nan - f1 0.00 on test data

Why should the test data be in .iob format? It should be sentences or tokens instead of tagged data.

TypeError: data type not understood

Does anyone have an idea why I am getting this error when running build_data?

Building vocab...
-done.33970tokens
Building vocab...
-done.400000 tokens
Writing vocab..
-done.7398 tokens
Writing vocab...
-done.28227 tokens

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-46-c239750804c9> in <module>()
     40 
     41 if __name__ == "__main__":
---> 42     main()

<ipython-input-46-c239750804c9> in main()
     31     vocab = load_vocab(config.filename_words)
     32     export_trimmed_glove_vectors(vocab, config.filename_glove,
---> 33                                 config.filename_trimmed, config.dim_word)
     34 
     35     # Build and save char vocab

~\ML\data_utils.py in export_trimmed_glove_vectors(vocab, filename_glove, filename_trimmed, dim)
    157 def export_trimmed_glove_vectors(vocab, filename_glove, filename_trimmed, dim):
    158 
--> 159     embeddings = np.zeros([len(vocab),dim])
    160     with open(filename_glove,"r",encoding="utf-8") as f:
    161         for line in f:

TypeError: data type not understood

acc, F1 problem

acc 28.23 - f1 9.78

new best score!
Epoch 2 out of 15

With the same data, a PyTorch implementation reaches an F1 of 90+.

B tags: about 20,000
I tags: about 60,000
O tags: about 1,000,000

AttributeError: 'Config' object has no attribute 'logger'

Hi everyone!
I am getting a new error. Maybe someone has had the same error and can help me.

AttributeError                            Traceback (most recent call last)
<ipython-input-2-c8550ce6081b> in <module>()
     11 
     12 if __name__ =="__main__":
---> 13     main()

<ipython-input-2-c8550ce6081b> in main()
      2 
      3     config = Config()
----> 4     model = NERModel(config)
      5     model.build()
      6 

~\ML\ner_model.py in __init__(self, config)
     23 class NERModel(BaseModel):
     24     def __init__(self,config):
---> 25         super().__init__(config)
     26         self.idx_to_tag ={idx:tag for tag,idx in 
     27                           self.config.vocab_tags.items()}

~\ML\base_model.py in __init__(self, config)
     22         """
     23         self.config = config
---> 24         logger = config.logger
     25         self.sess   = None
     26         self.saver  = None

AttributeError: 'Config' object has no attribute 'logger'

thanks

It was just a silly mistake on my side. I am closing the issue.

99.01 % accuracy

Epoch 15 out of 15
281/281 [==============================] - 350s - train loss: 0.1390
acc 99.91 - f1 99.32

new best score!
I am getting an f1 of 99.32 with the same configuration, which cannot be right.

Learning rate issue?

When I use lr = 0.001 in my model, there is no problem; but with lr = 0.01 or 0.1, the train loss decreases at first and then increases. What is wrong?
2018-03-23 15:29:33,355:INFO: train loss:1.10886
2018-03-23 15:31:41,870:INFO: train loss:1.53702
2018-03-23 15:33:53,753:INFO: train loss:0.866067
2018-03-23 15:36:03,349:INFO: train loss:1.32708
2018-03-23 15:38:15,538:INFO: train loss:1.20029
2018-03-23 15:40:22,720:INFO: train loss:nan
2018-03-23 15:42:27,061:INFO: train loss:nan
2018-03-23 15:44:38,513:INFO: train loss:nan
2018-03-23 15:46:45,673:INFO: train loss:nan
2018-03-23 15:48:54,684:INFO: train loss:nan
133 2018-03-23 16:04:34,913:INFO: acc 0.74 - f1 0.02
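A loss that grows and then turns into `nan` is the typical signature of a step size that is too large. The toy sketch below shows the mechanism on a one-dimensional quadratic with plain gradient descent. It is only an illustration: the function, the curvature value and the helper name `run_gradient_descent` are assumptions, and it is not a diagnosis of this particular model or its optimizer.

```python
def run_gradient_descent(lr, curvature=30.0, x0=1.0, steps=20):
    """Minimise f(x) = 0.5 * curvature * x**2; return the loss after each step."""
    x, losses = x0, []
    for _ in range(steps):
        x -= lr * curvature * x  # the gradient of f at x is curvature * x
        losses.append(0.5 * curvature * x * x)
    return losses


if __name__ == "__main__":
    for lr in (0.001, 0.1):
        losses = run_gradient_descent(lr)
        print(lr, losses[0], losses[-1])
```

Each step multiplies `x` by `1 - lr * curvature`, so the iterates shrink only while `lr` is below `2 / curvature`; above that threshold every step overshoots the minimum by more than the previous one.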

where is the tagged test file

Hey,

When you execute python evaluate.py, is a tagged test file generated in the same format as the training dataset? That is, I would like to get not only the accuracy but also the test file with the predicted tags.
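A minimal sketch of how such a file could be produced, assuming access to some function that maps a list of words to a list of tags. The names `write_tagged_file` and `predict_fn` are hypothetical and are not part of evaluate.py; the output uses one "word tag" pair per line with a blank line between sentences.

```python
def write_tagged_file(sentences, predict_fn, path):
    """Write one "word tag" pair per line, with a blank line after each sentence."""
    with open(path, "w", encoding="utf-8") as f:
        for words in sentences:
            tags = predict_fn(words)
            for word, tag in zip(words, tags):
                f.write("{} {}\n".format(word, tag))
            f.write("\n")


if __name__ == "__main__":
    # Stand-in predictor used only to make the sketch runnable.
    def tag_everything_as_o(words):
        return ["O"] * len(words)

    write_tagged_file([["John", "lives", "in", "New", "York"]],
                      tag_everything_as_o, "predictions.txt")
```

In practice `predict_fn` would be replaced by whatever prediction call the trained model exposes.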

how to run this code on GPU?

Thanks for sharing this code.
I have read your code and your blog post, but the code runs on the CPU.
How can I run it on a GPU?

No generalization while inferring

Ok, I've got another issue, which I am not sure how to solve.
So far everything works with my own pipeline and dataset, but I ran into a strange problem, and I don't know exactly what could be wrong.
Basically, when I send something to the model to tag, it returns the tags without a problem. I'm running one instance of the original model (still on my dataset) and one model with my pipeline.

The strange thing that happens is when I try to predict a sequence it has not learned.
For example: Show me a red dress.

Neither the color nor the dress is in the dataset, but when I run this example on the original model, it returns (more or less) the correct entities, which is good.
If I run this example on my pipeline, it returns only O tags.
This is really strange behaviour. Do you have any idea what could be causing it?

Training Speed Problem

Hi, Guillaume

Thanks for the great code! Learned a lot from it.
I have a problem with training speed: my training is much slower than yours (100s/epoch).
Mine takes about 7000s/epoch (I am using the conll2003 data as well).

My GPUs are two Titan X and my cuda is 8.0 with tensorflow 1.2 under python 2.7.12.
The training output is like this:

yuchen@statnlp1:~/uda4nerism/tagger_tf$ python2 main.py
/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients_impl.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2017-07-13 18:20:31.128744: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-13 18:20:31.128783: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-13 18:20:31.128805: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-13 18:20:31.128810: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-13 18:20:31.128814: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-07-13 18:20:31.714116: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:02:00.0
Total memory: 11.92GiB
Free memory: 6.80GiB
2017-07-13 18:20:31.714159: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-07-13 18:20:31.714168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y
2017-07-13 18:20:31.714179: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:02:00.0)
Epoch 1 out of 15
2017-07-13 18:20:42.810724: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 4698 get requests, put_count=3185 evicted_count=1000 eviction_rate=0.313972 and unsatisfied allocation rate=0.556194
2017-07-13 18:20:42.810783: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:259] Raising pool_size_limit_ from 100 to 110
  5/703 [..............................] - ETA: 6703s - train loss: 20.18322017-07-13 18:21:28.389388: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 4353 get requests, put_count=3033 evicted_count=1000 eviction_rate=0.329707 and unsatisfied allocation rate=0.53779
2017-07-13 18:21:28.389451: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:259] Raising pool_size_limit_ from 233 to 256
  9/703 [..............................] - ETA: 7439s - train loss: 21.49322017-07-13 18:22:21.533471: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 372 get requests, put_count=1412 evicted_count=1000 eviction_rate=0.708215 and unsatisfied allocation rate=0
 16/703 [..............................] - ETA: 6967s - train loss: 16.23072017-07-13 18:23:21.660380: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 2680 get requests, put_count=3245 evicted_count=1000 eviction_rate=0.308166 and unsatisfied allocation rate=0.194776
2017-07-13 18:23:21.660418: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:259] Raising pool_size_limit_ from 958 to 1053
 46/703 [>.............................] - ETA: 5589s - train loss: 12.33602017-07-13 18:27:11.857989: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 17151 get requests, put_count=17296 evicted_count=1000 eviction_rate=0.0578168 and unsatisfied allocation rate=0.0617457
2017-07-13 18:27:11.858085: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:259] Raising pool_size_limit_ from 2253 to 2478
160/703 [=====>........................] - ETA: 5039s - train loss: 7.1027

Is it a problem to use the same lstm cell for bidirectional_dynamic_rnn?

A bidirectional RNN uses a different cell to convey information in each direction. Would it be more appropriate to define separate LSTM cells for that purpose?

Understanding the `reshape`

sequence_tagging/model/ner_model.py

Lines 132 to 133 in 0e4e6a7

 char_embeddings = tf.reshape(char_embeddings,
                              shape=[s[0]*s[1], s[-2], self.config.dim_char])

@guillaumegenthial Can you please explain the reshape dimensions?
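One way to read the target shape, assuming `s` holds the shape of a 4-D tensor laid out as [batch, sentence length, word length, dim_char]: `s[0]*s[1]` merges the batch and sentence axes, and `s[-2]` keeps the word length, so every word becomes its own sequence of character vectors. The pure-Python sketch below mimics this on nested lists; the helper names are hypothetical and the layout is an assumption about the code, not a statement from its author.

```python
def merge_batch_and_sentence(char_embeddings):
    """Flatten [batch, sentence, word, dim] nested lists to [batch*sentence, word, dim]."""
    return [word for sentence in char_embeddings for word in sentence]


def shape_of(nested):
    """Return the shape of a regular (non-ragged) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


if __name__ == "__main__":
    batch, sent_len, word_len, dim_char = 2, 3, 4, 5
    tensor = [[[[0.0] * dim_char for _ in range(word_len)]
               for _ in range(sent_len)]
              for _ in range(batch)]
    print(shape_of(tensor))
    print(shape_of(merge_batch_and_sentence(tensor)))
```

After the merge, a character-level RNN can process all words of all sentences as one flat batch; its per-word outputs can later be reshaped back to [batch, sentence length, ...].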

UnicodeDecodeError

Hi Guillaume,

I'm simply running the code as is. Is it something to do with the python version ?

class IncrementalDecoder(codecs.IncrementalDecoder):
25 def decode(self, input, final=False):
---> 26 return codecs.ascii_decode(input, self.errors)[0]
27
28 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 123: ordinal not in range(128)
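The traceback shows the `ascii` codec being used to decode a file that contains the byte 0xe2, which is the first byte of many three-byte UTF-8 sequences (curly quotes, for example). A minimal sketch of reading a file with an explicit encoding so that the result does not depend on the platform default; the helper name `read_lines_utf8` is hypothetical, and it assumes the data file really is UTF-8:

```python
import io


def read_lines_utf8(path):
    """Read a text file as UTF-8 regardless of the platform's default encoding."""
    with io.open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]
```

`io.open` accepts the `encoding` argument on both Python 2 and Python 3, so the same call works whichever interpreter runs the scripts.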

Some advice on model performance

So, I trained a new model that can identify a new "TECH" entity along with all the other entities found in the CoNLL dataset. The new dataset's format and its distribution across train/dev/test are the same as CoNLL's, since I am combining the CoNLL data with my new data so that the entity counts are balanced. After training the model I am getting the following results, which make sense!

Testing model over test set
acc 98.89 - f1 93.19

About my new dataset: in most of the tagged sentences where a word is tagged as "TECH", the remaining words are mostly tagged "O" (I used spaCy to create this dataset). The train, dev and test data all follow the same pattern. Example:

IBM B-TECH
will O
perhaps O
always O
conjure O
up O
positive O
associations O
such O
is O
the O
history O
and O
once O
global O
hegemony O
. O

Things get spooky when I try to predict new sentences with this new model:

sen = 'Chicago, on Lake Michigan in Illinois, is among the largest cities in the U.S.'
print nr.predict_sentence(sen)

[('Chicago,', 'O'), ('on', 'O'), ('Lake', 'B-LOC'), ('Michigan', 'I-LOC'), ('in', 'O'),
 ('Illinois,', 'B-LOC'), ('is', 'O'), ('among', 'O'), ('the', 'O'), ('largest', 'O'), ('cities', 'O'),
 ('in', 'O'), ('the', 'O'), ('U.S.', 'B-LOC')]

sen2 = 'Chicago, on Lake Michigan in Illinois, is among the largest cities in the U.S. which has many Apple stores'
print nr.predict_sentence(sen2)

[('Chicago,', 'O'), ('on', 'O'), ('Lake', 'O'), ('Michigan', 'O'), ('in', 'O'),
 ('Illinois,', 'O'), ('is', 'O'), ('among', 'O'), ('the', 'O'), ('largest', 'O'), ('cities', 'O'),
 ('in', 'O'), ('the', 'O'), ('U.S.', 'O'), ('which', 'O'), ('has', 'O'), ('many', 'O'), ('Apple', 'B-TECH'), ('stores', 'O')]

As we can see, the model predicts CoNLL entities accurately as long as there is no "TECH" entity in the sentence; if any word related to "TECH" is found in the sentence, none of the remaining entities are recognized. I think this behaviour is mainly caused by the dataset.

Any suggestions or thoughts on this behaviour, or on how to tune the performance, are welcome!

Thanks!
