
Comments (17)


XuezheMax commented on August 17, 2024

Hi Manni,

Please try again to see if it works now.
Thanks.


ManniSingh commented on August 17, 2024

Thanks Max, I will wait for it.
I have a TensorFlow version, but for some reason it doesn't reach F1 91+ with the same parameters.
The Lasagne version is also very slow.


ManniSingh commented on August 17, 2024

Thanks, it works now.
However, I'm not getting anywhere near F1 90+ on CoNLL-2003. I tried both of your implementations.

This one gives:
dev acc: 97.91%, precision: 90.59%, recall: 86.55%, F1: 88.52%
best dev acc: 98.01%, precision: 89.95%, recall: 87.31%, F1: 88.61% (epoch: 14)
best test acc: 96.40%, precision: 82.97%, recall: 80.67%, F1: 81.80% (epoch: 14)

Without any early stop:
Epoch 100 (LSTM(std), learning rate=0.0025, decay rate=0.0500 (1)):
loss: 0.0423, time: 133.20s
dev acc: 97.98%, precision: 89.70%, recall: 87.45%, F1: 88.56%
best dev acc: 98.03%, precision: 90.04%, recall: 87.89%, F1: 88.95% (epoch: 69)
best test acc: 96.27%, precision: 82.60%, recall: 80.74%, F1: 81.66% (epoch: 69)
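(Editor's aside: the learning rates printed in these logs are consistent with a simple inverse-time decay schedule; the formula below is inferred from the logged values, not taken from the repo's code.)

```python
def decayed_lr(lr0, decay, epoch):
    """Inverse-time learning-rate decay, inferred from the logs:
    e.g. 0.015 / (1 + 0.05 * 99) ~= 0.0025 at epoch 100."""
    return lr0 / (1.0 + decay * (epoch - 1))

print(round(decayed_lr(0.015, 0.05, 100), 4))  # → 0.0025
print(round(decayed_lr(0.01, 0.05, 3), 4))     # → 0.0091 (matches Max's epoch-3 log)
```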

Lasagne code gives:
best dev acc: 97.99%, precision: 89.74%, recall: 87.73%, F1: 88.72%
best test acc: 96.21%, precision: 81.13%, recall: 79.78%, F1: 80.45%

Under the settings mentioned in your paper, the following are the logs from the code:

loading embedding: glove from /data/manni/ner/glove/glove.6B.100d.txt.gz
2017-11-18 13:02:00,893 - NERCRF - INFO - Creating Alphabets
2017-11-18 13:02:00,911 - Create Alphabets - INFO - Word Alphabet Size (Singleton): 11985 (0)
2017-11-18 13:02:00,913 - Create Alphabets - INFO - Character Alphabet Size: 86
2017-11-18 13:02:00,915 - Create Alphabets - INFO - POS Alphabet Size: 47
2017-11-18 13:02:00,916 - Create Alphabets - INFO - Chunk Alphabet Size: 19
2017-11-18 13:02:00,918 - Create Alphabets - INFO - NER Alphabet Size: 9
2017-11-18 13:02:00,919 - NERCRF - INFO - Word Alphabet Size: 11985
2017-11-18 13:02:00,921 - NERCRF - INFO - Character Alphabet Size: 86
2017-11-18 13:02:00,923 - NERCRF - INFO - POS Alphabet Size: 47
2017-11-18 13:02:00,925 - NERCRF - INFO - Chunk Alphabet Size: 19
2017-11-18 13:02:00,926 - NERCRF - INFO - NER Alphabet Size: 9
2017-11-18 13:02:00,928 - NERCRF - INFO - Reading Data
Reading data from /data/manni/ner/conll2003/eng.train
reading data: 10000
Total number of data: 14041
Reading data from /data/manni/ner/conll2003/eng.testa
Total number of data: 3250
Reading data from /data/manni/ner/conll2003/eng.testb
Total number of data: 3453
oov: 11984
2017-11-18 13:02:12,675 - NERCRF - INFO - constructing network...
2017-11-18 13:02:13,287 - NERCRF - INFO - Network: LSTM, num_layer=1, hidden=200, filter=30, tag_space=128, crf=bigram
2017-11-18 13:02:13,289 - NERCRF - INFO - training: l2: 0.000000, (#training data: 14041, batch: 10, dropout: 0.50)

settings:
tag_space = 128 # WHAT IS THIS?
dropout = 'std'
logger = get_logger("NERCRF")
mode = 'LSTM'
train_path = "/data/manni/ner/conll2003/eng.train"
dev_path = "/data/manni/ner/conll2003/eng.testa"
test_path = "/data/manni/ner/conll2003/eng.testb"
num_epochs = 100
batch_size = 10
hidden_size = 200
num_filters = 30
learning_rate = 0.015
momentum = 0.9
decay_rate = 0.05
gamma = 0.0
schedule = 1
p = 0.5
bigram = True
embedding = 'glove'
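(Editor's aside, on the `tag_space` question above: in the Ma & Hovy 2016 architecture this appears to be the size of a dense layer that projects the BiLSTM output down before the CRF scores are computed. The PyTorch sketch below is illustrative only — the class and parameter names are assumptions, not NeuroNLP2's actual code.)

```python
import torch
import torch.nn as nn

class TagProjection(nn.Module):
    """Illustrative sketch of the role of tag_space: a dense layer that
    maps the BiLSTM output (2 * hidden_size, both directions concatenated)
    down to a smaller tag_space before producing per-tag scores for the CRF."""

    def __init__(self, hidden_size=200, tag_space=128, num_tags=18):
        super().__init__()
        self.dense = nn.Linear(2 * hidden_size, tag_space)  # the tag_space projection
        self.activation = nn.ELU()
        self.to_tags = nn.Linear(tag_space, num_tags)       # scores fed to the CRF

    def forward(self, lstm_out):            # lstm_out: (batch, seq_len, 2*hidden)
        h = self.activation(self.dense(lstm_out))
        return self.to_tags(h)              # (batch, seq_len, num_tags)
```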


XuezheMax commented on August 17, 2024

Hi Manni,
Here are my logs for the first few epochs.
loading embedding: glove from data/glove/glove.6B/glove.6B.100d.gz
2017-11-18 17:47:27,021 - NERCRF - INFO - Creating Alphabets
2017-11-18 17:47:27,041 - Create Alphabets - INFO - Word Alphabet Size (Singleton): 23598 (8122)
2017-11-18 17:47:27,041 - Create Alphabets - INFO - Character Alphabet Size: 86
2017-11-18 17:47:27,041 - Create Alphabets - INFO - POS Alphabet Size: 47
2017-11-18 17:47:27,041 - Create Alphabets - INFO - Chunk Alphabet Size: 19
2017-11-18 17:47:27,041 - Create Alphabets - INFO - NER Alphabet Size: 18
2017-11-18 17:47:27,041 - NERCRF - INFO - Word Alphabet Size: 23598
2017-11-18 17:47:27,041 - NERCRF - INFO - Character Alphabet Size: 86
2017-11-18 17:47:27,041 - NERCRF - INFO - POS Alphabet Size: 47
2017-11-18 17:47:27,041 - NERCRF - INFO - Chunk Alphabet Size: 19
2017-11-18 17:47:27,041 - NERCRF - INFO - NER Alphabet Size: 18
2017-11-18 17:47:27,042 - NERCRF - INFO - Reading Data
Reading data from data/conll2003/english/eng.train.bioes.conll
reading data: 10000
Total number of data: 14987
Reading data from data/conll2003/english/eng.dev.bioes.conll
Total number of data: 3466
Reading data from data/conll2003/english/eng.test.bioes.conll
Total number of data: 3684
oov: 339
2017-11-18 17:47:31,883 - NERCRF - INFO - constructing network...
2017-11-18 17:47:32,315 - NERCRF - INFO - Network: LSTM, num_layer=1, hidden=256, filter=30, tag_space=128, crf=bigram
2017-11-18 17:47:32,315 - NERCRF - INFO - training: l2: 0.000000, (#training data: 14987, batch: 16, dropout: 0.50, unk replace: 0.00)
Epoch 1 (LSTM(std), learning rate=0.0100, decay rate=0.0500 (schedule=1)):
train: 937 loss: 2.3267, time: 50.82s
dev acc: 97.04%, precision: 88.88%, recall: 85.93%, F1: 87.38%
best dev acc: 97.04%, precision: 88.88%, recall: 85.93%, F1: 87.38% (epoch: 1)
best test acc: 96.16%, precision: 84.89%, recall: 82.95%, F1: 83.91% (epoch: 1)
Epoch 2 (LSTM(std), learning rate=0.0095, decay rate=0.0500 (schedule=1)):
train: 937 loss: 0.8332, time: 50.86s
dev acc: 97.66%, precision: 90.54%, recall: 89.26%, F1: 89.90%
best dev acc: 97.66%, precision: 90.54%, recall: 89.26%, F1: 89.90% (epoch: 2)
best test acc: 96.83%, precision: 86.99%, recall: 86.51%, F1: 86.75% (epoch: 2)
Epoch 3 (LSTM(std), learning rate=0.0091, decay rate=0.0500 (schedule=1)):
train: 937 loss: 0.6846, time: 33.49s
dev acc: 98.10%, precision: 92.61%, recall: 90.86%, F1: 91.73%
best dev acc: 98.10%, precision: 92.61%, recall: 90.86%, F1: 91.73% (epoch: 3)
best test acc: 97.35%, precision: 88.87%, recall: 87.98%, F1: 88.42% (epoch: 3)
Epoch 4 (LSTM(std), learning rate=0.0087, decay rate=0.0500 (schedule=1)):
train: 937 loss: 0.5764, time: 39.42s
dev acc: 98.29%, precision: 92.49%, recall: 92.07%, F1: 92.28%
best dev acc: 98.29%, precision: 92.49%, recall: 92.07%, F1: 92.28% (epoch: 4)
best test acc: 97.46%, precision: 88.24%, recall: 88.60%, F1: 88.42% (epoch: 4)

It seems that there are some issues in your data. Please make sure that you follow the data format described in #2
and use the BIOES tagging scheme.
When you re-run the code, please first remove the vocabulary dir in data/alphabets/ner_crf/ so that the new vocabulary can be rebuilt.
Thanks.
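(Editor's aside: converting BIO2 tags to the BIOES scheme Max mentions can be sketched as below. This is a minimal sketch assuming the input is already in BIO2; the function name is illustrative, not part of NeuroNLP2.)

```python
def bio2_to_bioes(tags):
    """Convert one sentence's BIO2 tags to BIOES.

    A B-X with no I-X continuation becomes S-X (singleton);
    the last I-X of a span becomes E-X (end).
    """
    out = []
    for i, tag in enumerate(tags):
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        cont = nxt == "I-" + tag[2:]  # does the next token continue this entity?
        if tag.startswith("B-"):
            out.append(tag if cont else "S-" + tag[2:])
        elif tag.startswith("I-"):
            out.append(tag if cont else "E-" + tag[2:])
        else:
            out.append(tag)  # "O" stays unchanged
    return out

print(bio2_to_bioes(["B-PER", "I-PER", "O", "B-LOC"]))
# → ['B-PER', 'E-PER', 'O', 'S-LOC']
```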


XuezheMax commented on August 17, 2024

Moreover, PyTorch has some implicit parameter initialization which makes training in the first few epochs unstable. When you see a very large loss at the beginning of training (like 50+), just kill the program and re-run it :)


ManniSingh commented on August 17, 2024

Hi Max,

  • Cleaned everything
  • The initial loss now starts below 15
  • I converted the data to BIOES
  • I did not remove the "-DOCSTART-" sentences; I noticed your log also says "Total number of data: 14987".

But still, I only got about a 1% improvement:

Epoch 100 (LSTM(std), learning rate=0.0025, decay rate=0.0500 (1)):
1499/1499 [===========================>] - ETA: 8s - train loss: 0.0395, time: 126.78s
dev acc: 97.68%, precision: 89.58%, recall: 87.28%, F1: 88.42%
best dev acc: 97.87%, precision: 90.29%, recall: 88.44%, F1: 89.36% (epoch: 21)
best test acc: 96.17%, precision: 83.23%, recall: 81.80%, F1: 82.51% (epoch: 21)



ManniSingh commented on August 17, 2024

Hi Max,
I did clone the repo; that is why it is working.

Following is the log:

loading embedding: glove from /data/manni/ner/glove/glove.6B.100d.txt.gz
2017-11-19 09:08:22,562 - NERCRF - INFO - Creating Alphabets
2017-11-19 09:08:22,565 - Create Alphabets - INFO - Creating Alphabets: /data/manni/alphabets/ner_crf/
2017-11-19 09:08:23,173 - Create Alphabets - INFO - Total Vocabulary Size: 23625
2017-11-19 09:08:23,175 - Create Alphabets - INFO - TOtal Singleton Size: 11641
2017-11-19 09:08:23,180 - Create Alphabets - INFO - Total Vocabulary Size (w.o rare words): 11984
2017-11-19 09:08:23,351 - Create Alphabets - INFO - Word Alphabet Size (Singleton): 11985 (0)
2017-11-19 09:08:23,352 - Create Alphabets - INFO - Character Alphabet Size: 86
2017-11-19 09:08:23,353 - Create Alphabets - INFO - POS Alphabet Size: 47
2017-11-19 09:08:23,354 - Create Alphabets - INFO - Chunk Alphabet Size: 19
2017-11-19 09:08:23,355 - Create Alphabets - INFO - NER Alphabet Size: 18
2017-11-19 09:08:23,357 - NERCRF - INFO - Word Alphabet Size: 11985
2017-11-19 09:08:23,358 - NERCRF - INFO - Character Alphabet Size: 86
2017-11-19 09:08:23,359 - NERCRF - INFO - POS Alphabet Size: 47
2017-11-19 09:08:23,360 - NERCRF - INFO - Chunk Alphabet Size: 19
2017-11-19 09:08:23,361 - NERCRF - INFO - NER Alphabet Size: 18
2017-11-19 09:08:23,362 - NERCRF - INFO - Reading Data
Reading data from /data/manni/ner/conll2003/eng.bioes.train
reading data: 10000
Total number of data: 14987
Reading data from /data/manni/ner/conll2003/eng.bioes.testa
Total number of data: 3466
Reading data from /data/manni/ner/conll2003/eng.bioes.testb
Total number of data: 3684
oov: 11984
2017-11-19 09:08:36,198 - NERCRF - INFO - constructing network...
2017-11-19 09:08:36,776 - NERCRF - INFO - Network: LSTM, num_layer=1, hidden=200, filter=30, tag_space=128, crf=bigram
2017-11-19 09:08:36,777 - NERCRF - INFO - training: l2: 0.000000, (#training data: 14987, batch: 10, dropout: 0.50)

After Epoch 100:

Epoch 100 (LSTM(std), learning rate=0.0025, decay rate=0.0500 (1)):
dev acc: 97.68%, precision: 89.58%, recall: 87.28%, F1: 88.42%
best dev acc: 97.87%, precision: 90.29%, recall: 88.44%, F1: 89.36% (epoch: 21)
best test acc: 96.17%, precision: 83.23%, recall: 81.80%, F1: 82.51% (epoch: 21)



ManniSingh commented on August 17, 2024
  • I lowercased words in "conll03_data.py"
  • I also did the same in "reader.py"
  • Now the OOV count is 476 and I'm getting F1 91+.
    But why is the OOV count so low (I have "normalize_digits=False")? Are you doing masking somewhere?

The original overlap between GloVe and CoNLL-2003, as I calculated it, is:

Glove Vocab length:400000
Vocab length:21009
Vocab length:9002
Vocab length:8548
Total Vocab: 26869
Total OOV: 3922
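(Editor's aside: the drop in OOV after lowercasing follows from GloVe 6B being uncased. A toy sketch of the OOV computation; the function name and toy vocabularies are illustrative, not from the repo.)

```python
def oov_count(corpus_vocab, glove_vocab, lowercase=True):
    """Count corpus words missing from the embedding vocabulary.

    GloVe 6B is uncased, so lowercasing before lookup
    dramatically reduces the apparent OOV count.
    """
    missing = 0
    for word in corpus_vocab:
        key = word.lower() if lowercase else word
        if key not in glove_vocab:
            missing += 1
    return missing

glove = {"obama", "london", "the", "u.n."}           # toy uncased vocab
corpus = ["Obama", "LONDON", "The", "U.N.", "Xyzzy"]
print(oov_count(corpus, glove, lowercase=False))  # → 5
print(oov_count(corpus, glove, lowercase=True))   # → 1
```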

Now the result is:

Epoch 41 (LSTM(std), learning rate=0.0050, decay rate=0.0500 (1)):
1499/1499 [===========================>] - ETA: 8s - train: 61459 loss: 0.1576, time: 126.63s
dev acc: 98.81%, precision: 94.80%, recall: 94.58%, F1: 94.69%
best dev acc: 98.81%, precision: 94.80%, recall: 94.58%, F1: 94.69% (epoch: 41)
best test acc: 97.93%, precision: 91.26%, recall: 90.92%, F1: 91.09% (epoch: 41)



ManniSingh commented on August 17, 2024

I noticed that in the code. But, to my knowledge, the 6B GloVe is uncased. Also, there are many non-word tokens (digits, punctuation, etc.) in the CoNLL-2003 English dataset; are you masking those?



ManniSingh commented on August 17, 2024

So does that mean you treat them (the singletons) as "unk"?
BTW, that seems like a good idea for generalisation!
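(Editor's aside: singleton replacement, as commonly implemented, swaps training-time singletons for the unknown token with some probability. A hedged sketch; the helper name, token string, and probability are illustrative, not NeuroNLP2's exact API.)

```python
import random

def unk_replace(words, singletons, unk="<unk>", p=0.5, rng=random):
    """Randomly replace training-time singleton words with the unknown token.

    This forces the model to learn a useful <unk> embedding, which
    improves generalisation to unseen words at test time.
    """
    return [unk if w in singletons and rng.random() < p else w
            for w in words]
```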



ManniSingh commented on August 17, 2024

Great! Thanks, I will try that.

