jcyk / dynet-biaffine-dependency-parser
Dynet-based Biaffine Parser
When I ran your code 5 days ago, it printed the information below. Nothing more since then; no models were created, but the program is still running on my computer. Do you know why? (Dataset: train set 4500 sentences, development set 1100 sentences.)
[dynet] random seed: 969247908
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
Loaded config file successfully.
pretrained_embeddings_file ../data/emb/vi.txt
data_dir ../data/treebank
train_file ../data/treebank/train.conllu
dev_file ../data/treebank/dev.conllu
test_file ../data/treebank/test.conllu
min_occur_count 2
save_dir ../ckpt/default
config_file ../ckpt/default/config.cfg
save_model_path ../ckpt/default/model
save_vocab_path ../ckpt/default/vocab
load_dir ../ckpt/default
load_model_path ../ckpt/default/model
load_vocab_path ../ckpt/default/vocab
lstm_layers 3
word_dims 100
tag_dims 100
dropout_emb 0.33
lstm_hiddens 400
dropout_lstm_input 0.33
dropout_lstm_hidden 0.33
mlp_arc_size 500
mlp_rel_size 100
dropout_mlp 0.33
learning_rate 2e-3
decay .75
decay_steps 5000
beta_1 .9
beta_2 .9
epsilon 1e-12
num_buckets_train 40
num_buckets_valid 10
num_buckets_test 10
train_iters 50000
train_batch_size 5000
test_batch_size 5000
validate_every 100
save_after 5000
#words in training set: 3544
Vocab info: #words 10936, #tags 28 #rels 33
(400, 600)
Orthogonal pretrainer loss: 5.20e-27
(400, 600)
Orthogonal pretrainer loss: 7.02e-27
(400, 1200)
Orthogonal pretrainer loss: 2.79e-30
(400, 1200)
Orthogonal pretrainer loss: 2.77e-30
(400, 1200)
Orthogonal pretrainer loss: 2.82e-30
(400, 1200)
Orthogonal pretrainer loss: 2.93e-30
(600, 800)
Orthogonal pretrainer loss: 3.90e-23
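(For context, the "Orthogonal pretrainer loss" lines above report how far each initialized weight matrix is from having orthonormal rows. A minimal NumPy sketch of one plausible way to measure this, assuming the loss is the squared Frobenius deviation of W Wᵀ from the identity; the function name is mine, not the repo's:)

```python
import numpy as np

def orthonormal_loss(W):
    # For a rows x cols matrix (rows <= cols), measure how far W W^T is
    # from the identity -- the quantity the pretrainer drives toward zero.
    rows = W.shape[0]
    return np.sum((W @ W.T - np.eye(rows)) ** 2)

# A matrix with exactly orthonormal rows gives a loss of (numerically) zero:
q, _ = np.linalg.qr(np.random.randn(600, 400))  # q has orthonormal columns
print(orthonormal_loss(q.T))  # tiny, e.g. ~1e-27, like the log above
```

The tiny values in the log (1e-23 to 1e-30) are consistent with matrices that are orthonormal up to floating-point error.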
Hi @jcyk ,
Thanks for your nice implementation; it is much cleaner than the TF code and even produces higher scores.
On PTB-SD, better scores were reported:
This repo: LAS 95.01, UAS 96.05
TF repo: LAS 94.81, UAS 95.89
I guess the reason is that you halved the batch size.
Is my understanding correct?
Thanks.
Hello, where is your training data from? CoNLL-2012?
I read the code for the decoding process of the parser, mainly arc_argmax in lib/utils.py and tarjan.py, but I think it only guarantees that:
However, there is no guarantee that the result is connected. I have a piece of code for testing:
import numpy as np
import dynet as dy
from lib import arc_argmax

m = np.random.randn(10, 10)
mt = dy.inputTensor(m)
mtp = dy.softmax(mt, d=1)
probs = mtp.npvalue()
mask = np.array([1, 1, 1, 1, 1, 1, 1, 1, 0, 0])  # 7 valid words; 8 positions including <ROOT>
sent_len = int(np.sum(mask))  # 8
heads = arc_argmax(probs, sent_len, mask)
dependents = range(1, sent_len)
for d, h in zip(dependents, heads[1:sent_len]):
    print('{0} --> {1}'.format(d, h))
Though I use a randomly initialized m as the logits, trying multiple times will produce results like this:
1 --> 3
2 --> 5
3 --> 2
4 --> 1
5 --> 0
6 --> 5
7 --> 3
The graph is not connected. What's the problem? Did I do something wrong? Thank you!
Sorry, I made a mistake about the offset ( is included).
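(For reference, whether a predicted heads array forms a single tree reaching <ROOT> can be checked by walking each dependent up its head pointers. A small sketch; the function name and heads-array convention, heads[i] = head of token i with 0 as <ROOT>, are my assumptions:)

```python
def is_single_rooted_tree(heads):
    # heads[0] is a placeholder for <ROOT>; every other token must reach
    # position 0 by following head pointers without entering a cycle.
    n = len(heads)
    for start in range(1, n):
        seen, node = set(), start
        while node != 0:
            if node in seen:
                return False  # cycle: this token never reaches <ROOT>
            seen.add(node)
            node = heads[node]
    return True

print(is_single_rooted_tree([0, 2, 0, 2]))              # True: 1->2->0, 3->2->0
print(is_single_rooted_tree([0, 3, 5, 2, 1, 0, 5, 3]))  # the output above: True
print(is_single_rooted_tree([0, 2, 1]))                 # cycle 1<->2: False
```

Note that the arcs printed above (1-->3, 3-->2, 2-->5, 5-->0, ...) do in fact all reach <ROOT>, consistent with the offset mistake acknowledged here.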
Thanks for this nice reimplementation of the Biaffine parser.
One question: where is the module 'ConfigParser'? It is required at line 1 of config.py:
from ConfigParser import SafeConfigParser
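(ConfigParser is the Python 2 name of the module; in Python 3 it was renamed to configparser, and SafeConfigParser was folded into ConfigParser. One way to make that import work under both, sketched below; the config section and key are made up for illustration:)

```python
# Compatibility shim: Python 2 exposes ConfigParser.SafeConfigParser,
# Python 3 renames the module to configparser.
try:
    from ConfigParser import SafeConfigParser          # Python 2
except ImportError:
    from configparser import ConfigParser as SafeConfigParser  # Python 3

parser = SafeConfigParser()
parser.read_string(u"[Run]\nlearning_rate = 2e-3\n")  # hypothetical config
print(parser.get('Run', 'learning_rate'))  # -> 2e-3
```

Alternatively, running the code under Python 2 (which DyNet-era code often targets) avoids the issue entirely.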
Could you offer the LAS results on PTB and CTB? Thanks.
We drop 33% of words and 33% of tags during training: when one is dropped the other is scaled by a factor of two to compensate, and when both are dropped together, the model simply gets an input of zeros.
I just modified it, and it seems that the following is better:
scale = 2. / (word_mask + tag_mask + 1e-12)
Am I right?
Good job! I am curious about the time needed for training; it looks a little slow.
Orthogonal pretrainer loss: 2.88e-30
(400, 1200)
Orthogonal pretrainer loss: 2.82e-30
(400, 1200)
Orthogonal pretrainer loss: 2.78e-30
(600, 800)
Orthogonal pretrainer loss: 3.78e-23
2018-05-25 19:04:29 Start training epoch #0
Step #79: Acc: arc 0.59, rel 0.81, overall 0.51, loss 0.980
In train.py, I found that best_UAS is always 0 because best_UAS is not updated even when UAS > best_UAS.
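(A minimal sketch of the bookkeeping this issue describes; the variable names follow the issue, but the loop and scores are hypothetical. The point is that best_UAS must be reassigned inside the comparison, otherwise it stays at 0:)

```python
# Hypothetical validation loop: without the reassignment inside the
# if-branch, best_UAS would remain 0.0 forever.
best_UAS = 0.0
for UAS in [0.90, 0.88, 0.93]:  # dev scores from successive validations
    if UAS > best_UAS:
        best_UAS = UAS  # the update that must not be missing
        # (this is also where the model checkpoint would be saved)
print(best_UAS)  # -> 0.93
```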