lanwuwei / spm_toolkit

302.0 9.0 70.0 1.25 MB

Neural network toolkit for sentence pair modeling.

Python 100.00%

pytorch paraphrase-identification natural-language-inference semantic-textual-similarity question-answering

spm_toolkit's Issues

Some questions about DecAtt

Thanks for sharing your great work!
I am trying to use DecAtt with the Quora dataset, but the model does not train the way yours does.
The prediction is always 0, so I printed the attention matrix:

    raw_attentions = torch.matmul(repr1, repr2)
    print(raw_attentions)

At the beginning, it is filled with large values like this:

    tensor([[[ 141.6384, 3.2162, 3.8512, ..., 3.0097, 146.9306, 152.0328],
             [ 3.8926, 0.2856, 0.3139, ..., 0.2734, 5.9284, 6.0491],
             [ 3.8910, 0.2930, 0.3183, ..., 0.2677, 7.0301, 6.9951],
             ...,

After many batches, it surprisingly becomes all zeros:

    tensor([[[ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
             [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
             [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
             ...,
             [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
             [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
             [ 0.,  0.,  0.,  ...,  0.,  0.,  0.]],

            [[ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
             [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
             [ 0.,  0.,  0.,  ...,  0.,  0.,  0.],
             ...,

Clarifications regarding ESIM Model code

I have a few questions related to the ESIM model for NLI:

1. The ESIM paper says all vectors are updated during training, including word vectors. Do the pretrained GloVe word embeddings also get updated in your ESIM model code?
2. The file at ESIM/main_batch_snli.py uses only the ESIM model, while the file at ESIM/Tree_IM/main_snli.py uses both the ESIM and Tree LSTM models, right?
3. The num_units in the ESIM LSTM is always equal to the pretrained embedding dimension. Is that a requirement? I got dimension errors when I tried to change num_units in the LSTM.
4. There is a default parameter max_sentence_length = 30, but it isn't used anywhere in the model_batch.py file. Does it have any significance? I thought max_len is the parameter that controls sequence length and can be modified in the main_snli.py file.

It would be great if you could clarify these points.
Thanks!

About the STS task

First, I could not find the function readSTSdata() in your code.
Second, when the model is initialized, the number of classes depends on the task, such as STS (6) and TrecQA (2), but in the forward function it ends up as 2 classes by default. Is that really correct?
Finally, how do you solve the similarity problem with the classification code? Some of the code is not very clear.

cat(): argument 'tensors' (position 1) must be tuple of Tensors, not generator

SPM_toolkit/DecAtt/main_quora.py

Line 56 in 3e2cb35

left_sents = torch.cat((dict[word].view(1, -1) for word in lsent))

cat(): argument 'tensors' (position 1) must be tuple of Tensors, not generator

Can I fix this bug by using this line instead?

torch.cat(tuple([dict[word].view(1, -1) for word in lsent]))

Why can't I use the PWIM model after training and saving it?

model=torch.load('model.pkl')
model.eval()
en='This argument is irrational and lacks objectivity.'
it='Tale argomentazione manca di razionalità e di obiettività.'
eng=ps.handle_sen(en,'english')
iti=ps.handle_sen(it,'italian')
a,b=model(eng,iti,1)

Some questions on PWIM

Running main.py throws "cannot import name 'load_word_vectors'".
Is this function missing?

Tokenize Method for Quora dataset

Hi, I am currently reimplementing the SSE model and I am confused about how you preprocessed quora_duplicate_questions.tsv:

1. How did you generate /pytorch/DeepPairWiseWord/data/quora/a.tok and b.tok? What tokenization method did you use?
2. How did you split quora_duplicate_questions.tsv into train/test/dev sets? Did you use the same split as "Zhiguo Wang, Wael Hamza, and Radu Florian. 2017. Bilateral multi-perspective matching for natural language sentences. In Proceedings of IJCAI." and "Neural Paraphrase Identification of Questions with Noisy Pretraining"?

I would appreciate it if you could answer my questions. Thank you.
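
For readers with the same question, here is a minimal sketch of one way to tokenize the question pairs and make a random train/dev/test split. It is not the repository author's preprocessing, which is not described here: the regex tokenizer, the 80/10/10 ratios, the seed, and the helper names `tokenize`, `read_pairs`, and `split_pairs` are all assumptions for illustration. The column names `question1`, `question2`, and `is_duplicate` are those of the public quora_duplicate_questions.tsv file.

```python
import csv
import random
import re

# Hypothetical tokenizer: a word (optionally with an apostrophe part such as
# "What's") or a single punctuation character.
TOKEN_RE = re.compile(r"\w+(?:'\w+)?|[^\w\s]")

def tokenize(text):
    return TOKEN_RE.findall(text)

def read_pairs(tsv_file):
    """Yield (tokens_a, tokens_b, label) from a Quora-style TSV file object."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    for row in reader:
        yield (tokenize(row["question1"]),
               tokenize(row["question2"]),
               int(row["is_duplicate"]))

def split_pairs(pairs, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle with a fixed seed and cut into train/dev/test lists."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_dev = int(len(pairs) * dev_frac)
    n_test = int(len(pairs) * test_frac)
    dev = pairs[:n_dev]
    test = pairs[n_dev:n_dev + n_test]
    train = pairs[n_dev + n_test:]
    return train, dev, test
```

To compare against published numbers, the split used in the cited papers would have to be used instead of a fresh random split like this one.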

Test accuracy of the ESIM model on the SNLI test set

I used your trained ESIM model to evaluate on the SNLI test set and got only 57.82% accuracy, which is far below the accuracy reported in the paper. What could be the reason?

Test results on SNLI from trained model

I found a trained ESIM SNLI model in the Google drive folder where preprocessed data is available.
I loaded that model and tested it on the preprocessed test data but got a test accuracy of only 58%. Can someone please tell me whether that model is fully trained, or whether I need to train a model from scratch and then test it?

A question about PWIM-main_sts

    Traceback (most recent call last):
      File "main_sts.py", line 893, in <module>
        main(args)
      File "main_sts.py", line 466, in main
        combine_mode, lm_mode)#, corpus)
    TypeError: __init__() takes exactly 24 arguments (23 given)

How can I solve this problem? It looks like the model call is missing a deepCNN parameter. How should this value be set? Thanks.
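
A mismatch like this usually means the constructor takes a parameter that the call site does not pass. As a sketch of how to locate it, the snippet below lists a constructor's required parameters with `inspect.signature` (Python 3; under Python 2.7, `inspect.getargspec` plays a similar role). The `Model` class and its parameter names are hypothetical stand-ins, not the actual PWIM signature.

```python
import inspect

class Model:
    # Hypothetical stand-in for a model whose constructor gained an argument.
    def __init__(self, num_classes, combine_mode, lm_mode, deep_cnn):
        self.deep_cnn = deep_cnn

def required_params(cls):
    """Return names of constructor parameters that have no default value."""
    sig = inspect.signature(cls.__init__)
    return [name for name, p in sig.parameters.items()
            if name != "self"
            and p.default is inspect.Parameter.empty
            and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)]

passed = ["num_classes", "combine_mode", "lm_mode"]  # what the call site supplies
missing = [n for n in required_params(Model) if n not in passed]
print(missing)
```

Which value the missing argument should take depends on the model definition and cannot be inferred from the traceback alone.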

'utf-8' codec can't decode byte 0x96 in position 12: invalid start byte
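
The byte 0x96 is not a valid UTF-8 start byte, but in the Windows-1252 code page it encodes an en dash, so this error often means the file was saved in that encoding. Which file triggers the error is not stated here, so the following is only a sketch: the helper name `decode_with_fallback` and the choice of Windows-1252 as the fallback are assumptions.

```python
def decode_with_fallback(raw, fallback="cp1252"):
    """Decode bytes as UTF-8, falling back to another encoding on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)

line = b"pages 10\x9620"   # 0x96 sits between the two numbers
text = decode_with_fallback(line)
print(repr(text))
```

When reading a whole file, passing `encoding="cp1252"` to `open` in Python 3 (or to `io.open` in Python 2.7) has the same effect, provided the file really is in that encoding.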

load_word_vectors problems

I have torchtext 0.1.1 and Python 2.7,
but the function "load_word_vectors" does not download the correct zip file (the 2.18 GB one).
When it starts downloading, it shows
"glove.840B.300d: 8.19kB [00:00, 9.51kB/s]"
and it always ends up with a bad zip file.

Am I missing anything?

ESIM model questions

I see that training is set to 500 epochs. Do I really need to train for that many epochs? It would take a long time.

I would also like to know whether your experimental results are close to those of the original paper.
Thanks.

SSE model: loading pretrained word vectors

Hi Wuwei,

Thanks for sharing the code; I have learned a lot from it! I have a small question:

I found that for other models you use from torchtext.vocab import load_word_vectors to load pretrained word vectors while for SSE model, you directly use torch.load(EMB_FILE) to load the pickled embeddings. However, I could not find the embedding files in the data folder you provided. Could you please upload this file?

Thanks!

Regards,
Shuailong

Inquiry on results in the original paper

Hi Wuwei,

Thanks for sharing the code. I have a question about the results in Table 4 of the original paper. Are those results (for example, the accuracy of 0.834 for PWIM on Quora) measured on the training set or the test set? I noticed that the results in Table 4 correspond closely to the training curves in Figure 3, so I just want to make sure. Thank you!

Preprocessing?

How did you generate a.toks and b.toks, and what kind of preprocessing did you apply to the questions?

About DecAtt

When I use this model for the WikiQA task, I find the batch list difficult to understand.

Why do we need to re-sort by length? Also, the interval of batch_list is not 32.

What is the quora_vocab_cased.pkl that ESIM/main_batch_quora.py needs?

I found that quora_vocab_cased.pkl is loaded on line 155 of ESIM/main_batch_quora.py, at the beginning of that program.

When I tried to run the ESIM code, I got stuck on that file (quora_vocab_cased.pkl).
I found this file in the Drive folder you shared in DecAtt/README.

However, I currently have my own sentence matching task. How could I generate a file like quora_vocab_cased.pkl?
I have not found any code that generates that file.
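
The layout of quora_vocab_cased.pkl is not documented in this thread, so the following is only a sketch of how a vocabulary pickle could be built for a new dataset. It assumes the file holds a plain token-to-index dict; the special tokens, the frequency ordering, and the helper name `build_vocab` are all hypothetical, and the real file should be inspected with `pickle.load` before relying on this format.

```python
import pickle
from collections import Counter

def build_vocab(sentence_pairs, min_count=1, specials=("<pad>", "<unk>")):
    """Map each token to an integer id; special tokens get the lowest ids."""
    counts = Counter(tok for a, b in sentence_pairs for tok in a + b)
    vocab = {tok: i for i, tok in enumerate(specials)}
    # Most frequent tokens first; ties broken alphabetically for determinism.
    for tok, c in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if c >= min_count and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

pairs = [(["How", "are", "you"], ["How", "do", "you", "do"])]
vocab = build_vocab(pairs)
blob = pickle.dumps(vocab)   # bytes that could be written to a .pkl file
```

Tokens are kept case-sensitive here, on the guess that "cased" in the file name refers to that.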

Can you provide links to download the datasets?

I do not know where to download some of the datasets.

Why is there no STS task for PWIM?

A question about STS Task

Excuse me, I have a question about the PWIM model. The more training rounds I run, the better the results on the STS task. Is that expected?

ImportError: No module named mnli

I tried to run the data_loader.py file in SSE, but I found there is no mnli module. Can you tell me where to import the right file from?

Error message in main_snli.py

Hi,
I ran main_snli.py but got an error message in
def create_batch(data,from_index, to_index):

I changed the following line
left_sents = torch.cat((dict[word].view(1, -1) for word in lsent))
to
left_sents = torch.cat([dict[word].view(1, -1) for word in lsent])

Very small issue about grammar.

Both print "XXX" and print("XXX") are used in the program. I think it would be better to consistently use either Python 2.7 syntax or Python 3.5 syntax.

How can I use the sentence pair model to get the similarity between two sentences?

I have 1,000,000 sentence pairs, each of which expresses the same meaning. I used this data to train the sentence model and saved the trained model as a pkl file. But when I use the trained model to evaluate new sentence pairs, almost all of them get a score of 1.0.
What should I do? Can you give me some advice?
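
If every training pair is a positive example, a classifier can minimise its loss by always predicting "similar", which would be consistent with the scores of 1.0 described above. A common remedy is to add negative pairs, for example by re-pairing each left sentence with the right sentence of a different pair. The sketch below assumes that randomly re-paired sentences are not paraphrases, which is only approximately true; `add_random_negatives` is a hypothetical helper, not part of the toolkit.

```python
import random

def add_random_negatives(pairs, seed=0):
    """Return labelled data: each original pair as a positive (1) plus one
    negative (0) built by pairing its left sentence with the right sentence
    of a different, randomly chosen pair."""
    if len(pairs) < 2:
        raise ValueError("need at least two pairs to build negatives")
    rng = random.Random(seed)
    labelled = []
    for i, (left, right) in enumerate(pairs):
        labelled.append((left, right, 1))
        j = rng.randrange(len(pairs) - 1)
        if j >= i:          # skip index i so the negative comes from another pair
            j += 1
        labelled.append((left, pairs[j][1], 0))
    return labelled
```

Random negatives tend to be easy to tell apart, so harder negatives (sentences that share words but differ in meaning) may be needed for a useful similarity score.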

ImportError: No module named config

I got this error while trying to run main_quora.py in SSE. Where can I find this file or how do I create it? Thanks!

Why is there a RuntimeError in PWIM?

RuntimeError: invalid argument 2: size '[-1 x 300]' is invalid for input with 270338015 elements at /pytorch/aten/src/TH/THStorage.c:37
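
The message says a tensor with 270338015 elements is being reshaped into rows of 300 values, which only works if the element count is a multiple of 300. A quick check in plain Python shows that it is not. What produced the extra values (for example a malformed line in the embedding file, or vectors of a different dimension) cannot be determined from the message alone.

```python
n_elements = 270338015   # element count from the error message
dim = 300                # row width requested by the reshape

rows, leftover = divmod(n_elements, dim)
print(rows, leftover)    # 901126 215
```

The 215 leftover values mean the data does not consist solely of complete 300-dimensional vectors.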

PWIM Model

I fed the same sentence pair into the PWIM model several times, but the result is different each time. Why is this?

Train SNLI model with multiple GPUs

I am trying to train the ESIM SNLI model on my own dataset, where some premises are very long (around 5500). I increased the maximum length, but then my GPU cannot handle the data with a batch size of 64.
I have a multi-GPU environment and set the CUDA_VISIBLE_DEVICES variable to multiple GPUs, but the code still uses one GPU.
I also wrapped the model in the DataParallel pipeline shown here (https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html) as follows:

    model = ESIM(dim_word, 3, n_words, dim_word, pretrained_emb)
    model = nn.DataParallel(model)
    if torch.cuda.is_available():
        model = model.cuda()
        criterion = criterion.cuda()

I get this error when doing so:
ValueError: Expected input batch_size (256) to match target batch_size (64)

I do not understand why the target batch size isn't changed when multiple GPUs are used.
Can anyone tell me how to use multiple GPUs to train the ESIM SNLI model? Or if there is any other way to handle large sequence length in the model?
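
`nn.DataParallel` splits tensor inputs along dimension 0 by default and concatenates the replicas' outputs along the same dimension. One situation that would produce exactly this mismatch, offered as a hypothesis rather than a diagnosis of the ESIM code, is an input whose dimension 0 is not the batch (for example a time-major `(seq_len, batch)` layout): each of 4 replicas then still sees all 64 examples, and the gathered output has 4 × 64 = 256 rows. The plain-Python simulation below uses nested lists instead of tensors; `scatter_dim0` and `fake_replica` are hypothetical helpers, and the figure of 4 GPUs is inferred from 256 / 64.

```python
def scatter_dim0(batch, n_chunks):
    """Split a nested list into n_chunks contiguous pieces along its first axis."""
    size = (len(batch) + n_chunks - 1) // n_chunks
    return [batch[i:i + size] for i in range(0, len(batch), size)]

def fake_replica(chunk):
    """Stand-in for a model replica: one prediction per column (example) of a
    time-major chunk shaped (seq_len_part, batch)."""
    return [0 for _ in chunk[0]]

seq_len, batch_size, n_gpus = 8, 64, 4
time_major = [[0] * batch_size for _ in range(seq_len)]   # shape (8, 64)

chunks = scatter_dim0(time_major, n_gpus)                 # 4 chunks of shape (2, 64)
gathered = [p for c in chunks for p in fake_replica(c)]   # replica outputs concatenated
print(len(gathered))                                      # 256, but there are only 64 targets
```

If this is the cause, making the batch the first dimension of every tensor passed to `forward` would let the split fall on examples rather than time steps.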

Folder structure?

Hi! I'm trying to run ESIM and DecAtt on the Quora dataset. What folder structure and files are you using? If the files are not downloadable, please explain how and where they are generated. Thanks!

Why is there a layer named "linear_layer_intra"?

Hi lanwu, thanks for your reproduction of DecAtt! I'm confused about the linear_layer_intra in _transformation_input. Is it defined anywhere else? I'd appreciate it if you could help me with this. Thanks a lot!

Do you use pre-trained word embedding when training infer-sent model on Quora Dataset?

Hi, I am trying to reproduce your result from the paper "Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering", where you got 86.6% test accuracy on the Quora dataset with the InferSent model. I wonder whether you used pretrained word embeddings or not, because the Facebook group used pretrained word embeddings in their paper "Supervised Learning of Universal Sentence Representations from Natural Language Inference Data". Thanks.

Code about ESIM

Is this problem related to torchtext?
ImportError: cannot import name 'load_word_vectors'
