
dynet-biaffine-dependency-parser's People

Contributors

jcyk


dynet-biaffine-dependency-parser's Issues

The program runs too long

When I ran your code 5 days ago, it printed the information below. Nothing more since then: no models were created, but the program is still running on my computer. Do you know why? (Dataset: train set - 4,500 sentences, development set - 1,100 sentences)

[dynet] random seed: 969247908
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
Loaded config file sucessfully.
pretrained_embeddings_file ../data/emb/vi.txt
data_dir ../data/treebank
train_file ../data/treebank/train.conllu
dev_file ../data/treebank/dev.conllu
test_file ../data/treebank/test.conllu
min_occur_count 2
save_dir ../ckpt/default
config_file ../ckpt/default/config.cfg
save_model_path ../ckpt/default/model
save_vocab_path ../ckpt/default/vocab
load_dir ../ckpt/default
load_model_path ../ckpt/default/model
load_vocab_path ../ckpt/default/vocab
lstm_layers 3
word_dims 100
tag_dims 100
dropout_emb 0.33
lstm_hiddens 400
dropout_lstm_input 0.33
dropout_lstm_hidden 0.33
mlp_arc_size 500
mlp_rel_size 100
dropout_mlp 0.33
learning_rate 2e-3
decay .75
decay_steps 5000
beta_1 .9
beta_2 .9
epsilon 1e-12
num_buckets_train 40
num_buckets_valid 10
num_buckets_test 10
train_iters 50000
train_batch_size 5000
test_batch_size 5000
validate_every 100
save_after 5000
#words in training set: 3544
Vocab info: #words 10936, #tags 28 #rels 33
(400, 600)
Orthogonal pretrainer loss: 5.20e-27
(400, 600)
Orthogonal pretrainer loss: 7.02e-27
(400, 1200)
Orthogonal pretrainer loss: 2.79e-30
(400, 1200)
Orthogonal pretrainer loss: 2.77e-30
(400, 1200)
Orthogonal pretrainer loss: 2.82e-30
(400, 1200)
Orthogonal pretrainer loss: 2.93e-30
(600, 800)
Orthogonal pretrainer loss: 3.90e-23

training data

hello, where is your training data from, conll-2012?

The parsed result is not a tree even when ensure_tree = True

I read the code for the decoding process of the parser, mainly arc_argmax in lib/utils.py and tarjan.py, but I think it only guarantees that:

  1. only one element is the child of <ROOT>,
  2. the result is acyclic,
  3. every node except <ROOT> has a single head.

But there is no guarantee that the result is connected. Here is a piece of code for testing:

import numpy as np
import dynet as dy

from lib import arc_argmax

# random scores for a (padded) sentence of length 10, softmax over rows to get head probabilities
m = np.random.randn(10, 10)
mt = dy.inputTensor(m)
mtp = dy.softmax(mt, d=1)
probs = mtp.npvalue()

mask = np.array([1, 1, 1, 1, 1, 1, 1, 1, 0, 0])  # only 7 words are valid; counting <ROOT>, 8 positions
sent_len = np.sum(mask)  # 8

heads = arc_argmax(probs, sent_len, mask)
dependents = range(1, sent_len)
for d, h in zip(dependents, heads[1: sent_len]):
	print('{0} --> {1}'.format(d, h))

Though I use a randomly initialized m as the logits, running it multiple times will produce results like this:

1 --> 3
2 --> 5
3 --> 2
4 --> 1
5 --> 0
6 --> 5
7 --> 3

The graph is not connected. What's the problem, did I do something wrong? Thank you!

Sorry, I made a mistake about the offset (<ROOT> is included).
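
For anyone checking the same thing, here is a minimal, illustrative sketch (not the repository's own code) that verifies every token can reach <ROOT> by following its head pointers, i.e. that the predicted heads form a single connected tree:

def is_connected(heads, sent_len):
    # walk up the head chain from every dependent; a tree must reach <ROOT> (index 0)
    for dep in range(1, sent_len):
        node, seen = dep, set()
        while node != 0:
            if node in seen:        # revisiting a node means we are stuck in a cycle
                return False
            seen.add(node)
            node = heads[node]
    return True

print(is_connected(heads, sent_len))  # True only if the output is one tree rooted at <ROOT>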

Where is module ConfigParser?

Thanks for this nice reimplementation of the Biaffine parser.
One question: where is the module file 'ConfigParser.py'? It is required at line 1 of config.py:
from ConfigParser import SafeConfigParser
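
Note (not from the repository): ConfigParser is the Python 2 name of the module; in Python 3 it was renamed to configparser, and SafeConfigParser's behavior was folded into ConfigParser. A small compatibility sketch, assuming the rest of the code runs under either version:

try:
    from ConfigParser import SafeConfigParser                   # Python 2 module name
except ImportError:
    from configparser import ConfigParser as SafeConfigParser   # Python 3 rename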

Question about the factor of compensation in dropout

scale = 3. / (2.*word_mask + tag_mask + 1e-12)

I guess this operation is a simplification of the original source code, right?
But I think it is not strictly equivalent to what is described in the paper:

We drop 33% of words and 33% of tags during training: when one is dropped the other is scaled by a factor of two to compensate, and when both are dropped together, the model simply gets an input of zeros.

I just modified it and it seems that the following one is better:

scale = 2. / (word_mask + tag_mask + 1e-12) 

Am I right?
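
A quick way to compare the two factors (a minimal sketch; word_mask and tag_mask are 1 when the embedding is kept and 0 when it is dropped):

for word_mask, tag_mask in [(1, 1), (0, 1), (1, 0), (0, 0)]:
    scale_repo  = 3. / (2. * word_mask + tag_mask + 1e-12)  # factor used in the repo
    scale_paper = 2. / (word_mask + tag_mask + 1e-12)       # factor proposed above
    print(word_mask, tag_mask, round(scale_repo, 2), round(scale_paper, 2))

# (1, 1): 1.0  vs 1.0   -- both kept, no compensation needed
# (0, 1): 3.0  vs 2.0   -- word dropped; the paper scales the remaining tag by 2
# (1, 0): 1.5  vs 2.0   -- tag dropped; the paper scales the remaining word by 2
# (0, 0): huge vs huge  -- both dropped; the input is all zeros anyway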

training time

Great job! I am curious about the time needed for training; it looks a little slow.

Orthogonal pretrainer loss: 2.88e-30
(400, 1200)
Orthogonal pretrainer loss: 2.82e-30
(400, 1200)
Orthogonal pretrainer loss: 2.78e-30
(600, 800)
Orthogonal pretrainer loss: 3.78e-23
2018-05-25 19:04:29 Start training epoch #0
Step #79: Acc: arc 0.59, rel 0.81, overall 0.51, loss 0.980

best_UAS problem

In train.py, I found that best_UAS is always 0 because best_UAS is never updated, even when UAS > best_UAS.
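
For reference, a hypothetical sketch of what the missing update would look like (variable and method names are assumed, not copied from the repository):

if UAS > best_UAS:
    best_UAS = UAS                   # remember the new best score
    # presumably also save the best checkpoint here, e.g. something like:
    # parser_model.save(save_model_path)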
