hsiaoyetgun / esim

TensorFlow implementation of the ESIM model (Enhanced LSTM for natural language inference)

esim's Introduction

Notice

There are known problems in this version of the code: the mask on the attention weights [Model.py, lines 160-170] and the mask for the mean and max [Model.py, lines 220-225]. Please don't use this code directly!
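For readers wondering what the two masks are meant to do, here is a minimal NumPy sketch. It is not the code in Model.py; the function names and the 1/0 mask convention (1 for a real token, 0 for padding) are assumptions made for illustration.

```python
import numpy as np

def masked_softmax(scores, mask, axis=-1):
    """Softmax over `axis` that gives zero weight to padded positions.

    scores: float array; mask: broadcastable array of 1 (token) / 0 (padding).
    """
    mask = np.asarray(mask).astype(bool)
    # Push padded scores to a large negative value so exp() sends them to ~0.
    masked = np.where(mask, scores, -1e9)
    masked = masked - masked.max(axis=axis, keepdims=True)  # numerical stability
    weights = np.exp(masked) * mask
    denom = weights.sum(axis=axis, keepdims=True)
    return weights / np.maximum(denom, 1e-13)

def masked_mean_max(v, mask):
    """Mean and max over time that ignore padding.

    v: (batch, seq_len, dim); mask: (batch, seq_len) of 1/0.
    Assumes every sequence has at least one real token.
    """
    m = np.asarray(mask)[:, :, None].astype(v.dtype)
    lengths = np.maximum(m.sum(axis=1), 1.0)            # (batch, 1)
    mean = (v * m).sum(axis=1) / lengths                 # padded rows add 0
    maximum = np.where(m > 0, v, -np.inf).max(axis=1)    # padded rows never win
    return mean, maximum
```

Without the masks, padded positions receive attention weight and lower the mean, which is the kind of problem the notice describes.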

I'm too busy to maintain this repo at the moment, and I will update the code during my winter vacation (starting January 26).

ESIM

A TensorFlow implementation of Qian Chen et al.'s Enhanced LSTM for Natural Language Inference from ACL 2017.

Dataset

The dataset used for this task is Stanford Natural Language Inference (SNLI). Words are initialized with pretrained GloVe embeddings trained on Common Crawl (840B tokens).
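As a rough illustration of how a GloVe text file can be turned into an embedding matrix, here is a sketch. The helper names, the random initialization for out-of-vocabulary words, and the `vocab` dictionary are assumptions and are not taken from this repo's Utils.py.

```python
import numpy as np

def parse_glove_line(line, dim=300):
    """Split one line of a GloVe text file into (token, vector)."""
    parts = line.rstrip("\n").split(" ")
    # Take the last `dim` fields as the vector. Some tokens in the 840B
    # file are reported to contain spaces, so the token is what precedes them.
    token = " ".join(parts[:-dim])
    vector = np.asarray(parts[-dim:], dtype=np.float32)
    return token, vector

def build_embedding_matrix(lines, vocab, dim=300, seed=0):
    """vocab: dict token -> row index. Words missing from GloVe keep a random init."""
    rng = np.random.default_rng(seed)
    matrix = rng.normal(0.0, 0.1, size=(len(vocab), dim)).astype(np.float32)
    for line in lines:
        token, vector = parse_glove_line(line, dim)
        if token in vocab:
            matrix[vocab[token]] = vector
    return matrix
```

In practice `lines` would be the opened GloVe file, read line by line so the whole file never has to sit in memory.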

Requirements

Python>=3
NumPy
TensorFlow>=1.8

Usage

Download the dataset from Stanford Natural Language Inference, then move snli_1.0_train.jsonl, snli_1.0_dev.jsonl, and snli_1.0_test.jsonl into ./SNLI/raw data.

# move dataset to the right place
mkdir -p ./SNLI/raw\ data
mv snli_1.0_*.jsonl ./SNLI/raw\ data

Preprocess the data to convert the source files into an easy-to-use format:

python3 Utils.py
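To give an idea of what this step has to handle, here is a small sketch of reading SNLI .jsonl records. It relies on the `sentence1`, `sentence2`, and `gold_label` fields of the SNLI files. The function name and the label-to-id mapping are assumptions, not what Utils.py necessarily does.

```python
import json

# Hypothetical label ids; the real mapping used by the repo may differ.
LABELS = {"entailment": 0, "neutral": 1, "contradiction": 2}

def read_snli(lines):
    """Yield (premise, hypothesis, label_id) from SNLI .jsonl lines."""
    for line in lines:
        record = json.loads(line)
        label = record["gold_label"]
        if label not in LABELS:
            # "-" marks pairs where annotators reached no consensus; skip them.
            continue
        yield record["sentence1"], record["sentence2"], LABELS[label]
```

For example, `list(read_snli(open("snli_1.0_dev.jsonl", encoding="utf-8")))` would load the dev split once the file is in place.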

The default hyperparameters are stored in the config file ./config/config.yaml.

Train the model:

python3 Train.py

Test the model:

python3 Test.py

esim's Issues

Cannot feed value of shape

Thank you for sharing. When I run this code, I get the following error:
ValueError: Cannot feed value of shape (32, 100) for Tensor 'premise_actual_length:0', which has shape '(?,)'

I know it's a simple problem and the shapes just don't match, but how do I solve it?
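One possible cause, stated as an assumption: the value being fed is a 0/1 padding mask of shape (batch_size, seq_length), while the `premise_actual_length` placeholder expects one length per example. If that is the case, summing the mask along the sequence axis gives an array of the expected shape. The function name below is hypothetical.

```python
import numpy as np

def lengths_from_mask(mask):
    """(batch, seq_len) mask of 1 (token) / 0 (padding) -> (batch,) lengths."""
    return np.asarray(mask).sum(axis=1).astype(np.int32)

mask = np.array([[1, 1, 1, 0],
                 [1, 0, 0, 0]])
lengths = lengths_from_mask(mask)   # shape (2,), one length per example
```

Whether this is the right fix depends on what Utils.py actually returns as the mask, so check the fed array before applying it.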

what is the accuracy

I just want to know the accuracy of your ESIM model. Have you reached 88%?

AttributeError: 'OneHotEncoder' object has no attribute '_fit_transform'

ub16c9@ub16c9-gpu:/media/ub16c9/fcd84300-9270-4bbd-896a-5e04e79203b7/ub16_prj/ESIM-tf$ python3.6 Train.py
Using TensorFlow backend.
CMD : python3 Train.py --num_epochs 300 --batch_size 32 --dropout_keep_prob 0.5 --clip_value 10 --learning_rate 0.0004 --l2 0.0 --seq_length 100 --optimizer adam --early_stop_step 5000000 --threshold 0 --embedding_size 300 --embedding_normalize 1 --hidden_size 300 --attention_size 300 --eval_batch 1000 --vocab_path data/vocab.txt --embedding_path data/embeddings.pkl --trainset_path data/train.txt --devset_path data/dev.txt --testset_path data/test.txt --save_path ./model/checkpoint --best_path ./model/bestval --log_path ./config/log/log --config_path ./config/config.yaml
Training with following options :
------------- HYPER PARAMETERS -------------
attention_size: 300
batch_size: 32
best_path: ./model/bestval
clip_value: 10.0
config_path: ./config/config.yaml
devset_path: data/dev.txt
dropout_keep_prob: 0.5
early_stop_step: 5000000
embedding_normalize: 1
embedding_path: data/embeddings.pkl
embedding_size: 300
eval_batch: 1000
hidden_size: 300
l2: 0.0
learning_rate: 0.0004
log_path: config/log/log.2019_06_21_08_26_24
n_classes: 3
n_vocab: 47955
num_epochs: 300
optimizer: adam
save_path: ./model/checkpoint
seq_length: 100
testset_path: data/test.txt
threshold: 0
trainset_path: data/train.txt
vocab_dict_size: 47955
vocab_path: data/vocab.txt
embeded_left : (?, 100, 300)
embeded_right : (?, 100, 300)
a_bar : (?, 100, 600)
b_bar : (?, 100, 600)
att_wei : (?, 100, 100)
att_soft_a : (?, 100, 100)
att_soft_b : (?, 100, 100)
a_hat : (?, 100, 600)
b_hat : (?, 100, 600)
a_diff : (?, 100, 600)
a_mul : (?, 100, 600)
m_a : (?, 100, 2400)
m_b : (?, 100, 2400)
v_a : (?, 100, 600)
v_b : (?, 100, 600)
v_a_avg : (?, 600)
v_a_max : (?, 600)
v : (?, 2400)
Loading training and validation data ...
Traceback (most recent call last):
File "Train.py", line 164, in <module>
train()
File "Train.py", line 49, in train
premise_train, premise_mask_train, hypothesis_train, hypothesis_mask_train, y_train = sentence2Index(arg.trainset_path, vocab_dict)
File "/media/ub16c9/fcd84300-9270-4bbd-896a-5e04e79203b7/ub16_prj/ESIM-tf/Utils.py", line 189, in sentence2Index
labelList = enc._fit_transform(labelList)
AttributeError: 'OneHotEncoder' object has no attribute '_fit_transform'
ub16c9@ub16c9-gpu:/media/ub16c9/fcd84300-9270-4bbd-896a-5e04e79203b7/ub16_prj/ESIM-tf$
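The traceback shows that `_fit_transform` is missing from the installed scikit-learn `OneHotEncoder`. It is a private method (leading underscore), so it is not guaranteed to exist across versions. The public `fit_transform` method is one thing to try. Another is to avoid the encoder altogether, as in this dependency-free sketch, which assumes the labels are already integer ids in the range 0 to n_classes - 1 (the function name is hypothetical).

```python
import numpy as np

def one_hot(label_ids, n_classes):
    """Integer class ids -> (n_samples, n_classes) one-hot float matrix."""
    label_ids = np.asarray(label_ids, dtype=np.int64).reshape(-1)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(n_classes, dtype=np.float32)[label_ids]

encoded = one_hot([0, 2, 1], n_classes=3)
```

With `n_classes: 3` as in the log above, each label becomes a row of length 3.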

found bug

Thanks for sharing, but I found something puzzling in Utils.py [line 193]. Is it a bug?

which has shape '(?,)'

Traceback (most recent call last):
File "Train.py", line 165, in <module>
train()
File "Train.py", line 94, in train
_, batch_loss, batch_acc = sess.run([model.train, model.loss, model.acc], feed_dict=feed_dict)
File "/home/prf/anaconda3/envs/prfenv/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/prf/anaconda3/envs/prfenv/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (32, 100) for Tensor 'premise_actual_length:0', which has shape '(?,)'

Attention problem in ESIM

https://github.com/HsiaoYetGun/ESIM/blob/master/Model.py#L169

attentionSoft_b = tf.nn.softmax(tf.transpose(attentionWeights))

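After attentionWeights is transposed here, the resulting tensor has shape ( seq_length, seq_length, batch_size ).
Softmax is then applied to that result, and tf.nn.softmax operates on the last dimension by default.
Doesn't that mean the softmax is taken over the batch? Please advise.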
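The concern can be checked with a small NumPy experiment. `np.transpose` without an axes argument reverses all dimensions, as `tf.transpose` does without `perm`. The sizes below are arbitrary, and the `softmax` helper is written here for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
att = rng.normal(size=(2, 3, 4))              # (batch, len_a, len_b)

weights = softmax(np.transpose(att))          # transpose reverses all axes
print(weights.shape)                          # (4, 3, 2): batch is the last axis

# Change only the second example and recompute.
changed = att.copy()
changed[1] += 5.0
weights_changed = softmax(np.transpose(changed))

# The first example's weights move even though its own scores did not.
leaks_across_batch = not np.allclose(weights[..., 0], weights_changed[..., 0])
```

Here `leaks_across_batch` is true, which supports the reading that the softmax normalizes across examples in the batch.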

LSTM input problem

The input the LSTM needs should be [seq_len, batch_size, emb_size]; the code needs a transpose.
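Whether a transpose is needed depends on the RNN API in use. `tf.nn.dynamic_rnn`, for instance, defaults to `time_major=False` and so accepts [batch_size, seq_len, emb_size] directly. If a time-major layout is required, the conversion is a swap of the first two axes, sketched here in NumPy with a hypothetical helper name.

```python
import numpy as np

def to_time_major(x):
    """(batch_size, seq_len, emb_size) -> (seq_len, batch_size, emb_size)."""
    return np.transpose(x, (1, 0, 2))

x = np.arange(24).reshape(2, 3, 4)   # batch=2, seq_len=3, emb=4
y = to_time_major(x)                 # y[t, b] is x[b, t]
```

The explicit axes matter: a bare transpose would also move the embedding axis.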

inquiry on attention part

Hi there,

Thanks for sharing the code. For the attention part in Model.py, your code is:

attentionSoft_b = tf.nn.softmax(tf.transpose(attentionWeights))
attentionSoft_b = tf.transpose(attentionSoft_b)

while I feel like it should be:
attentionSoft_b = tf.nn.softmax(attentionWeights, axis=1)

or you should specify "perm" in the transpose call.

Please correct me if I'm wrong, thanks!
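The two alternatives proposed here give the same result, which a NumPy sketch can confirm. It assumes attentionWeights has shape (batch, len_a, len_b); the `softmax` helper and the sizes are illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
attention_weights = rng.normal(size=(2, 3, 4))   # (batch, len_a, len_b)

# Option 1: normalize directly over axis 1.
direct = softmax(attention_weights, axis=1)

# Option 2: swap only the two sequence axes, softmax over the last axis,
# then swap back. The batch axis stays in front throughout.
swapped = np.transpose(attention_weights, (0, 2, 1))
via_perm = np.transpose(softmax(swapped, axis=-1), (0, 2, 1))

same = np.allclose(direct, via_perm)
```

Both keep each example's normalization independent of the rest of the batch, so the choice between them is mostly one of readability.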

tcmalloc

Hello!
What is the problem?

source code

Have you implemented Tree-LSTM with TensorFlow?