lanwuwei / spm_toolkit
Neural network toolkit for sentence pair modeling.
Thanks for sharing your great work!
I am trying to use DecAtt with the Quora dataset, but the model does not train the way it does for you: the prediction is always 0, so I printed the attention matrix.
raw_attentions = torch.matmul(repr1, repr2)
print(raw_attentions)
At the beginning it is filled with large numbers like this:
tensor([[[ 141.6384,   3.2162,   3.8512,  ...,   3.0097, 146.9306, 152.0328],
         [   3.8926,   0.2856,   0.3139,  ...,   0.2734,   5.9284,   6.0491],
         [   3.8910,   0.2930,   0.3183,  ...,   0.2677,   7.0301,   6.9951],
         ...,
but after many batches it astonishingly becomes all zeros!
`tensor([[[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
...,
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.]],
[[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
...,`
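One plausible factor (an assumption here, not a confirmed diagnosis of this model): raw attention scores of magnitude ~150 saturate the softmax applied to them, leaving near-zero gradients, which can precede a collapse of the weights. A small sketch of the saturation, with illustrative numbers on the scale of the tensor above:

```python
import torch

# logits on the scale reported above: one dominant score saturates softmax
logits = torch.tensor([150.0, 3.0, 3.0])
probs = torch.softmax(logits, dim=0)
print(probs)  # near one-hot, so gradients through the attention are tiny
```

Normalizing or projecting the representations before the matmul keeps the logits in a range where softmax still has useful gradients.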
Hi
I have a few questions related to the ESIM model for NLI:
It would be great if you could clear up these doubts.
Thanks!
First, I have not found the function 'readSTSdata()' in your code.
Secondly, when the model is initialized there is a task-specific number of classes, such as STS (6) or TrecQA (2), but in the forward function it defaults to 2 classes. Is that really right?
Finally, how do you handle the similarity task with the classification code? Some of the code is not very clear.
SPM_toolkit/DecAtt/main_quora.py
Line 56 in 3e2cb35
cat(): argument 'tensors' (position 1) must be tuple of Tensors, not generator
Can I fix this bug with the following line?
torch.cat(tuple([dict[word].view(1, -1) for word in lsent]))
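For context, `torch.cat` requires a sequence (list or tuple) of tensors rather than a bare generator, so wrapping the comprehension as above is indeed the fix; a minimal self-contained sketch (the `lsent`/dictionary names are stand-ins mirroring the snippet, not the toolkit's actual variables):

```python
import torch

# stand-ins for the sentence tokens and the word -> embedding lookup
lsent = ["hello", "world"]
word_vecs = {"hello": torch.zeros(300), "world": torch.ones(300)}

# torch.cat accepts a list (or tuple) of tensors, not a generator:
left_sents = torch.cat([word_vecs[word].view(1, -1) for word in lsent])
print(left_sents.shape)  # torch.Size([2, 300])
```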
model=torch.load('model.pkl')
model.eval()
en='This argument is irrational and lacks objectivity.'
it='Tale argomentazione manca di razionalità e di obiettività.'
eng=ps.handle_sen(en,'english')
iti=ps.handle_sen(it,'italian')
a,b=model(eng,iti,1)
Running main.py throws "cannot import name 'load_word_vectors'".
Does this function not exist?
Hi, I am currently reimplementing the SSE model and I am confused about how you pre-process quora_duplicate_questions.tsv:
I found a trained ESIM SNLI model in the Google drive folder where preprocessed data is available.
I loaded that model and tested it on the preprocessed test data but got a test accuracy of only 58%. Can someone please tell me whether that model is fully trained, or whether I need to train a model from scratch and then test it?
Traceback (most recent call last):
File "main_sts.py", line 893, in
main(args)
File "main_sts.py", line 466, in main
combine_mode, lm_mode)#, corpus)
TypeError: __init__() takes exactly 24 arguments (23 given)
How can this problem be solved? It looks like the model is missing a deepCNN argument; how should I set its value? Thanks.
I have torchtext 0.1.1 and Python 2.7,
but the function "load_word_vectors" does not download the correct zip file (the 2.18 GB one).
When it starts downloading it shows
"glove.840B.300d: 8.19kB [00:00, 9.51kB/s]"
and it always ends up with a bad zip file.
Am I missing anything?
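An 8 kB download for a 2.18 GB archive usually means the server returned an error page instead of the zip itself. A quick stdlib check (the path below is just an example) to detect this before unpacking:

```python
import zipfile

def is_valid_zip(path):
    """Return True only if `path` is a readable, uncorrupted zip archive."""
    try:
        with zipfile.ZipFile(path) as zf:
            return zf.testzip() is None  # None means no corrupt members
    except (zipfile.BadZipFile, OSError):  # spelled BadZipfile on Python 2.7
        return False
```

If `is_valid_zip('.vector_cache/glove.840B.300d.zip')` returns False, delete the file and download the archive manually from the official GloVe page, then place it where torchtext expects it.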
Hi Wuwei,
Thanks for sharing the code; I have learned a lot from it! I have a small question:
I found that for the other models you use from torchtext.vocab import load_word_vectors
to load pretrained word vectors, while for the SSE model you directly use torch.load(EMB_FILE)
to load the pickled embeddings. However, I could not find the embedding file in the data folder you provided. Could you please upload this file?
Thanks!
Regards,
Shuailong
Hi Wuwei,
Thanks for sharing the code. I have a question about the results in Table 4 of the original paper. Are those results (say, PWIM's accuracy of 0.834 on Quora) on the training set or the test set? I noticed that the Table 4 results closely match the training curves in Figure 3, so I just want to make sure. Thank you!
How did you generate a.toks and b.toks, and what kind of preprocessing did you apply to the questions?
I found that quora_vocab_cased.pkl is loaded at line 155 of ESIM/main_batch_quora.py, at the very beginning of the program.
When I tried to run the ESIM code, I got stuck on that file (quora_vocab_cased.pkl).
I found it in the drive folder you share in DecAtt/README.
But I currently have my own sentence matching task; how can I generate a file like quora_vocab_cased.pkl?
I have not found any code for generating that file.
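Absent the original script, here is a hypothetical sketch of building such a vocabulary pickle for your own data. It assumes the file maps cased tokens to integer ids with reserved pad/unk entries; the toolkit's actual format may differ, so inspect the provided quora_vocab_cased.pkl first:

```python
import pickle

def build_vocab(sentences):
    """Map each cased token to a unique id, reserving 0/1 for pad/unk."""
    vocab = {'<pad>': 0, '<unk>': 1}
    for sent in sentences:
        for tok in sent.split():
            vocab.setdefault(tok, len(vocab))
    return vocab

# toy corpus standing in for your tokenized sentence pairs
pairs = ["How do I learn PyTorch ?", "What is sentence pair modeling ?"]
vocab = build_vocab(pairs)
with open('my_vocab_cased.pkl', 'wb') as f:
    pickle.dump(vocab, f)
```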
For some of the datasets, I do not know where to download them.
Excuse me, I have a question about the PWIM model: the more training epochs, the better the results on the STS task. Is that expected?
I tried to run the data_loader.py file in SSE, but there is no MNLI file; can you tell me where to get the right file?
Hi,
I ran main_snli.py but got an error in
def create_batch(data, from_index, to_index):
I changed the following line
left_sents = torch.cat((dict[word].view(1, -1) for word in lsent))
to
left_sents = torch.cat([dict[word].view(1, -1) for word in lsent])
I now have 1,000,000 sentence pairs, all of which express the same meaning. When I use this data to train the sentence model, I save the model state as a pkl. But when I use the trained model to evaluate new sentence pairs, almost all of them get a score of 1.0.
What should I do? Can you give me some advice?
I got this error while trying to run main_quora.py in SSE. Where can I find this file, or how do I create it? Thanks!
RuntimeError: invalid argument 2: size '[-1 x 300]' is invalid for input with 270338015 elements at /pytorch/aten/src/TH/THStorage.c:37
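One hint in the message itself: a view of size '[-1 x 300]' can only succeed when the element count is a multiple of 300, and 270338015 is not, which suggests the embedding file being loaded is truncated or not the expected 300-dimensional matrix. The arithmetic:

```python
# 270338015 = 300 * 901126 + 215, so a (-1, 300) reshape is impossible;
# the loaded embedding file likely is not the intended 300-dim matrix
n_elements, dim = 270338015, 300
print(n_elements % dim)  # 215 -> not divisible, so view(-1, 300) must fail
```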
I fed the same sentence pair into the PWIM model, but the result is different each time. Why is this?
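A common cause (an assumption here, not a confirmed diagnosis of PWIM): the model is still in training mode, so stochastic layers such as dropout fire on every forward pass; calling model.eval() before inference makes the outputs deterministic. A minimal illustration:

```python
import torch

torch.manual_seed(0)
drop = torch.nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()             # training mode: dropout is active, outputs vary
a, b = drop(x), drop(x)

drop.eval()              # eval mode: dropout is a no-op, outputs repeat
c, d = drop(x), drop(x)
assert torch.equal(c, d) and torch.equal(c, x)
```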
Hi
I am trying to train the ESIM SNLI model on my own dataset, where some premise lengths are very long (around 5500 tokens). I increased the maximum length, but then my GPU cannot handle the data with a batch size of 64.
I have a multi-GPU environment and set the CUDA_VISIBLE_DEVICES variable to multiple GPUs, but the code still uses one GPU.
I also wrapped the model in DataParallel pipeline shown here (https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html) as follows:
model = ESIM(dim_word, 3, n_words, dim_word, pretrained_emb)
model = nn.DataParallel(model)
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()
I get this error when doing so:
ValueError: Expected input batch_size (256) to match target batch_size (64)
I do not understand why the target batch size does not change when multiple GPUs are used.
Can anyone tell me how to use multiple GPUs to train the ESIM SNLI model, or whether there is another way to handle long sequences in the model?
Hi! I'm trying to run ESIM and DecAtt on the Quora dataset. What folder structure and files are you using? If the files are not downloadable, please explain how and where they are generated. Thanks!
Hi lanwuwei, thanks for your reproduction of DecAtt! I'm confused about linear_layer_intra in _transformation_input. Is it defined anywhere else? I'd appreciate it if you could help; thanks a lot!
Hi, I am trying to reproduce your result in the paper "Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering", where you got 86.6% test accuracy on the Quora dataset with the InferSent model. I wonder whether you used pretrained word embeddings, because Facebook used pretrained word embeddings in their paper "Supervised Learning of Universal Sentence Representations from Natural Language Inference Data". Thanks.
Is this problem related to the torchtext version?
ImportError: cannot import name 'load_word_vectors'
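Likely yes: load_word_vectors existed only in early torchtext (0.1.x, which this repo was written against); later releases removed it in favor of vector classes such as torchtext.vocab.GloVe. A compatibility shim along these lines may help (the load_glove helper is illustrative, not part of the toolkit):

```python
try:
    # old API, torchtext <= 0.1.x (what this repo was written against)
    from torchtext.vocab import load_word_vectors

    def load_glove(root='.', dim=300):
        return load_word_vectors(root, 'glove.840B', dim)
except ImportError:
    # newer torchtext removed load_word_vectors; GloVe is the replacement
    def load_glove(root='.', dim=300):
        from torchtext.vocab import GloVe
        return GloVe(name='840B', dim=dim, cache=root)
```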