
nli's Introduction

Enhanced LSTM for Natural Language Inference

Source code for "Enhanced LSTM for Natural Language Inference", runnable on GPU and CPU, based on Theano. If you use this code in any published research, please cite the following paper.

"Enhanced LSTM for Natural Language Inference" Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang, Diana Inkpen. ACL (2017)

@InProceedings{Chen-Qian:2017:ACL,
  author    = {Chen, Qian and Zhu, Xiaodan and Ling, Zhenhua and Wei, Si and Jiang, Hui and Inkpen, Diana},
  title     = {Enhanced LSTM for Natural Language Inference},
  booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017)},
  month     = {July},
  year      = {2017},
  address   = {Vancouver},
  publisher = {ACL}
}

Qian Chen's homepage: http://home.ustc.edu.cn/~cq1231/

The code is modified from GitHub - nyu-dl/dl4mt-tutorial.

The code for the tree-LSTM version has been released. The tree-LSTM part is modified from GitHub - dallascard/TreeLSTM, but supports minibatches.

Dependencies

To run the code, you will need:

  • Python 2.7
  • Theano 0.8.2

Running the Script

  1. Download and preprocess the data:
cd data
bash fetch_and_preprocess.sh
  2. Train and test the ESIM model:
cd scripts/ESIM/
bash train.sh
  3. Train and test the TreeLSTM-IM model:
cd scripts/TreeLSTM-IM/
bash train.sh

The results are written to the log.txt file.

nli's People

Contributors

jabalazs, lukecq1231, uduse

nli's Issues

run on google colab

Hello!
I have Python 2 and Theano 0.8.2, and I want to run the project on Google Colab.
I encounter the following error:

Theano does not recognise this flag: CUDA_DIR
warnings.warn('Theano does not recognise this flag: {0}'.format(key))

I set device=cuda0, but then I see the following error:
ERROR (theano.sandbox.gpuarray): pygpu was configured but could not be imported

Now I run the code below:

!wget -c https://repo.continuum.io/archive/Anaconda2-5.1.0-Linux-x86_64.sh
!chmod +x Anaconda2-5.1.0-Linux-x86_64.sh
!bash ./Anaconda2-5.1.0-Linux-x86_64.sh -b -f -p /usr/local
!conda install theano pygpu

but I get the following error:
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
....

Please check that the project runs on Google Colab. I want to run the KIM and ESIM projects on Google Colab, but both give the same errors.
Please help me.

lstm_layer mask

Hello, I'm following your work and trying to reimplement ESIM in TensorFlow.
I noticed that in your lstm_layer() you mask c and h. I'm wondering how much the mask improves the model compared with no masking (just a basic LSTM).
And how much does ortho_weight help?
Thank you so much.
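For anyone reimplementing this, here is a minimal numpy sketch of what these two tricks usually look like in dl4mt-style code (the repository this code is modified from); the function names are illustrative, not necessarily this repo's exact API. Orthogonal initialization takes the left singular vectors of a random Gaussian matrix, and the mask carries the previous hidden state through padded timesteps.

```python
import numpy as np

def ortho_weight(ndim):
    # Orthogonal initialization: SVD of a random Gaussian matrix,
    # as used in dl4mt-style codebases.
    W = np.random.randn(ndim, ndim)
    u, _, _ = np.linalg.svd(W)
    return u.astype('float32')

def masked_step(m_t, h_t, h_prev):
    # Where the mask is 0 (a padded position), keep the previous
    # hidden state instead of the freshly computed one.
    return m_t[:, None] * h_t + (1. - m_t)[:, None] * h_prev
```

Without the mask, padded timesteps would keep updating the state, so variable-length sequences in a minibatch would end with corrupted final states.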

GPU

If I want to use the GPU, I run into this problem:
[error screenshot]

Google Colab uses a Tesla K80; what should I do to get the code running on the GPU?

ESIM using keras

Hi,
Since I don't have access to a GPU, I can't execute your code, but there is other code on GitHub that implements your model with the Keras library. Can you confirm whether the following code is correct?

"""
Implementation of ESIM(Enhanced LSTM for Natural Language Inference)
https://arxiv.org/abs/1609.06038
"""
import numpy as np
from keras.layers import *
from keras.activations import softmax
from keras.models import Model

def StaticEmbedding(embedding_matrix):
in_dim, out_dim = embedding_matrix.shape
return Embedding(in_dim, out_dim, weights=[embedding_matrix], trainable=False)

def subtract(input_1, input_2):
minus_input_2 = Lambda(lambda x: -x)(input_2)
return add([input_1, minus_input_2])

def aggregate(input_1, input_2, num_dense=300, dropout_rate=0.5):
feat1 = concatenate([GlobalAvgPool1D()(input_1), GlobalMaxPool1D()(input_1)])
feat2 = concatenate([GlobalAvgPool1D()(input_2), GlobalMaxPool1D()(input_2)])
x = concatenate([feat1, feat2])
x = BatchNormalization()(x)
x = Dense(num_dense, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(dropout_rate)(x)
x = Dense(num_dense, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(dropout_rate)(x)
return x

def align(input_1, input_2):
attention = Dot(axes=-1)([input_1, input_2])
w_att_1 = Lambda(lambda x: softmax(x, axis=1))(attention)
w_att_2 = Permute((2,1))(Lambda(lambda x: softmax(x, axis=2))(attention))
in1_aligned = Dot(axes=1)([w_att_1, input_1])
in2_aligned = Dot(axes=1)([w_att_2, input_2])
return in1_aligned, in2_aligned

def build_model(embedding_matrix, num_class=1, max_length=30, lstm_dim=300):
q1 = Input(shape=(max_length,))
q2 = Input(shape=(max_length,))

# Embedding
embedding = StaticEmbedding(embedding_matrix)
q1_embed = BatchNormalization(axis=2)(embedding(q1))
q2_embed = BatchNormalization(axis=2)(embedding(q2))

# Encoding
encode = Bidirectional(LSTM(lstm_dim, return_sequences=True))
q1_encoded = encode(q1_embed)
q2_encoded = encode(q2_embed)

# Alignment
q1_aligned, q2_aligned = align(q1_encoded, q2_encoded)

# Compare
q1_combined = concatenate([q1_encoded, q2_aligned, subtract(q1_encoded, q2_aligned), multiply([q1_encoded, q2_aligned])])
q2_combined = concatenate([q2_encoded, q1_aligned, subtract(q2_encoded, q1_aligned), multiply([q2_encoded, q1_aligned])]) 
compare = Bidirectional(LSTM(lstm_dim, return_sequences=True))
q1_compare = compare(q1_combined)
q2_compare = compare(q2_combined)

# Aggregate
x = aggregate(q1_compare, q2_compare)
x = Dense(num_class, activation='sigmoid')(x)

return Model(inputs=[q1, q2], outputs=x)

GitHub gist: https://gist.github.com/namakemono/b74547e82ef9307da9c29057c650cdf1
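For reference, the soft alignment computed by align corresponds to the attention equations in the ESIM paper: e_ij = a_i · b_j, with each sequence attending over the other under a softmax of the score matrix. A minimal numpy sketch of those equations, with made-up sequence lengths and dimensions:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# a: premise encodings (len_a, d), b: hypothesis encodings (len_b, d)
a = np.random.randn(5, 4)
b = np.random.randn(7, 4)

e = a @ b.T                       # attention scores, e[i, j] = a_i . b_j
a_tilde = softmax(e, axis=1) @ b  # each a_i as a weighted sum over b
b_tilde = softmax(e, axis=0).T @ a  # each b_j as a weighted sum over a
```

This follows the paper's formulation directly; the Keras gist expresses the same idea through Dot, Lambda, and Permute layers on batched tensors.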

NaN detected

Hi, I am getting a NaN detected error. Just running your scripts with minor adaptations, i.e.:

  • Theano 0.10-dev (bleeding edge)
  • Python 3 (basically just changing some print and xrange usages in the code)

Training runs fine until Epoch 5 Update 91000 ...

error

Hi, what is the cause of the following error?

[error screenshot]

Please respond soon.
Thanks

Training time

Hello, I am interested in this model and want to know the training time of one epoch on the SNLI dataset. Also, how many epochs does it need to reach convergence?

tree lstm

Hi,
I am checking the implementation, and I couldn't find the parts related to the tree-LSTM. Are you planning to release that part too?
Thanks

save model

Hello.
I reduced the amount of data so I could run on the CPU, but in the end the model did not run and I encountered the following error. What is the reason?
[error screenshot]

Question

Hello,
I want to run this code, but first I want to reduce the dataset to a small number of samples, for example 1000.
What changes should I make in the code? Which files should I change?
Thanks for your help
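One simple way to subsample (this helper is not part of the repository; the approach assumes the preprocessed data files are one example per line) is to truncate the files produced by fetch_and_preprocess.sh before training:

```python
from itertools import islice

def truncate_file(src, dst, n):
    # Copy only the first n lines of a one-example-per-line data file.
    # src/dst are hypothetical paths, not the repo's actual file names.
    with open(src) as fin, open(dst, 'w') as fout:
        for line in islice(fin, n):
            fout.write(line)
```

If the premise, hypothesis, and label files are stored separately, all of them would need to be truncated to the same n so the examples stay aligned.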
