galsang / BIMPM-pytorch
Re-implementation of BIMPM (Bilateral Multi-Perspective Matching for Natural Language Sentences, Zhiguo Wang et al.) in PyTorch.
I see this code:
self.TEXT.build_vocab(self.train, self.dev, self.test, vectors=GloVe(...))
As far as I know, shouldn't we construct the vocabulary only on the train set?
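For reference, here is a minimal sketch of what train-only vocabulary construction looks like, in plain Python rather than torchtext (the `build_vocab` / `numericalize` helpers below are illustrative, not the repo's API): any dev/test word not seen during training falls back to `<unk>`.

```python
# Sketch: build the vocabulary from the train split ONLY, so that
# dev/test-only words map to <unk> instead of getting their own ids.

def build_vocab(train_sentences, specials=("<unk>", "<pad>")):
    """Map each token seen in the train split to an integer id."""
    vocab = {tok: i for i, tok in enumerate(specials)}
    for sent in train_sentences:
        for tok in sent.split():
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab

def numericalize(vocab, sentence):
    """Unknown tokens (e.g. dev/test-only words) fall back to <unk>."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in sentence.split()]

vocab = build_vocab(["a man walks", "a dog runs"])
ids = numericalize(vocab, "a cat runs")  # "cat" was never seen in train
```

With torchtext the equivalent change would presumably be passing only `self.train` to `self.TEXT.build_vocab(...)`.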
Greetings,
I encountered this error when trying to run your code without changing anything.
Do you have any ideas on how to fix it?
RuntimeError: While copying the parameter named context_LSTM.weight_ih_l0, whose dimensions in the model are torch.Size([400, 350]) and whose dimensions in the checkpoint are torch.Size([400, 300]).
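The shape mismatch (350 vs. 300 on the LSTM input) suggests the checkpoint was saved with a different input size than the current model configuration (for example, with character embeddings toggled differently). If matching the training flags is not an option, one hedged workaround is to load only the tensors whose shapes agree; the `nn.Linear` layers below are stand-ins, not BIMPM's real architecture:

```python
import torch
import torch.nn as nn

# Sketch: copy only checkpoint tensors whose shapes match the current
# model, skipping mismatched ones such as context_LSTM.weight_ih_l0.

def load_matching(model, state_dict):
    """Load shape-compatible tensors; return the names that were skipped."""
    own = model.state_dict()
    kept = {k: v for k, v in state_dict.items()
            if k in own and v.shape == own[k].shape}
    own.update(kept)
    model.load_state_dict(own)
    return sorted(set(state_dict) - set(kept))

old = nn.Linear(300, 400)   # stands in for the saved model (input 300)
new = nn.Linear(350, 400)   # stands in for the current model (input 350)
skipped = load_matching(new, old.state_dict())  # weight skipped, bias kept
```

The skipped parameters keep their fresh initialization, so they would need fine-tuning; retraining with consistent flags is the cleaner fix.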
Hi, thanks for converting the model to PyTorch. But I always get a memory error when using this tool. I checked one of the existing issues and set --max-sent-len to 32, but I still get the same error. What else can I try to solve it?
$ python train.py --data-type Quora --max-sent-len 32
loading Quora data...
100%|████████████████████████████████████████████████████████████████████████████| 2196017/2196017 [02:21<00:00, 15546.98it/s]
Traceback (most recent call last):
File "train.py", line 148, in <module>
main()
File "train.py", line 127, in main
data = Quora(args)
File "/home/vba/dev/external/BIMPM-pytorch/model/utils.py", line 70, in __init__
self.TEXT.build_vocab(self.train, self.dev, self.test, vectors=GloVe(name='840B', dim=300))
File "/home/vba/anaconda3/lib/python3.6/site-packages/torchtext/vocab.py", line 347, in __init__
super(GloVe, self).__init__(name, url=url, **kwargs)
File "/home/vba/anaconda3/lib/python3.6/site-packages/torchtext/vocab.py", line 236, in __init__
self.cache(name, cache, url=url)
File "/home/vba/anaconda3/lib/python3.6/site-packages/torchtext/vocab.py", line 327, in cache
self.vectors = torch.Tensor(vectors).view(-1, dim)
MemoryError
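The `MemoryError` happens while materializing all ~2.2M GloVe rows as one tensor, so `--max-sent-len` cannot help. One memory-friendly workaround is to stream the GloVe text file and keep only the vectors for words actually in your vocabulary; the sketch below is pure Python over a tiny in-memory stand-in for the file (newer torchtext versions may also accept a `max_vectors` argument, if available in your install):

```python
import io

# Sketch: stream a GloVe-format text file and keep only the rows whose
# word appears in `vocab`, instead of loading all vectors into memory.

def load_needed_vectors(fileobj, vocab, dim):
    """Read a GloVe-style stream, keeping only vectors for `vocab` words."""
    vectors = {}
    for line in fileobj:
        parts = line.rstrip().split(" ")
        word = parts[0]
        if word in vocab and len(parts) == dim + 1:
            vectors[word] = [float(x) for x in parts[1:]]
    return vectors

# Tiny stand-in for the (unzipped) glove.840B.300d.txt file:
fake_glove = io.StringIO("the 0.1 0.2\ncat 0.3 0.4\ndog 0.5 0.6\n")
vecs = load_needed_vectors(fake_glove, vocab={"cat", "the"}, dim=2)
```

Using a smaller pre-trained set (e.g. `glove.6B.100d` instead of `840B.300d`) is another way to shrink the footprint, at some cost in accuracy.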
RuntimeError: Cannot initialize CUDA without ATen_cuda library. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -Wl,--no-as-needed in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols. You can check if this has occurred by using ldd on your binary to see if there is a dependency on *_cuda.so library
As far as I can tell from googling, this sort of issue doesn't occur when working with a GPU, but I am currently using a CPU-only PyTorch build. Any lead on solving this issue would be helpful.
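On a CPU-only install, the usual fix is to avoid touching any CUDA path: select the device defensively and map checkpoints to it at load time. A minimal sketch (the `nn.Linear` is a stand-in for BIMPM, and the in-memory buffer stands in for a checkpoint file):

```python
import io
import torch
import torch.nn as nn

# Sketch: pick the device defensively and map any saved checkpoint onto
# it, so a CPU-only build never tries to initialize CUDA.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)        # stand-in for BIMPM
buf = io.BytesIO()                        # stand-in for a checkpoint file
torch.save(model.state_dict(), buf)
buf.seek(0)

# map_location forces GPU-saved tensors onto the chosen device at load time
state = torch.load(buf, map_location=device)
model.load_state_dict(state)
```

If the script has a `--gpu`-style flag, passing the CPU-equivalent value (and auditing any unconditional `.cuda()` calls) should have the same effect.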
The code currently has some memory issues and requires a large amount of memory to run.
It will be optimized as soon as possible.
I have trained a model and am able to get far better results than with my previous method. But now I face performance issues: for some queries, the model takes more than 6 seconds to make a prediction. Any tips for reducing the prediction time would be helpful! :)
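A few generic PyTorch inference speedups usually help here: put the model in `eval()` mode, wrap prediction in `torch.no_grad()` to skip autograd bookkeeping, and batch queries so the recurrent encoder runs once per batch rather than once per sentence. A sketch with an `nn.LSTM` standing in for the model:

```python
import torch
import torch.nn as nn

# Sketch of the usual inference speedups: eval() disables dropout,
# torch.no_grad() skips gradient bookkeeping, and batching lets the
# LSTM process all queries in one forward pass.
model = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)  # stand-in
model.eval()

queries = torch.randn(32, 10, 8)      # 32 queries, length 10, dim 8
with torch.no_grad():
    out, _ = model(queries)           # one forward pass for all 32 queries
```

Beyond this, truncating very long inputs and running on a GPU (if available) are the other common levers.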
Traceback (most recent call last):
File "train.py", line 148, in <module>
main()
File "train.py", line 127, in main
data = Quora(args)
File "/home/ner/BIMPM-pytorch/model/utils.py", line 70, in __init__
self.TEXT.build_vocab(self.train, self.dev, self.test, vectors=GloVe(name='840B', dim=300))
File "/opt/conda/envs/bimpm/lib/python3.6/site-packages/torchtext/vocab.py", line 324, in __init__
super(GloVe, self).__init__(name, url=url, **kwargs)
File "/opt/conda/envs/bimpm/lib/python3.6/site-packages/torchtext/vocab.py", line 222, in __init__
self.cache(name, cache, url=url)
File "/opt/conda/envs/bimpm/lib/python3.6/site-packages/torchtext/vocab.py", line 242, in cache
urlretrieve(url, dest, reporthook=reporthook(t))
File "/opt/conda/envs/bimpm/lib/python3.6/urllib/request.py", line 289, in urlretrieve
% (read, size), result)
urllib.error.ContentTooShortError: <urlopen error retrieval incomplete: got only 1078111416 out of 2176768927 bytes>
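The `ContentTooShortError` means the GloVe download was interrupted, and the partial file left in `.vector_cache/` can make subsequent runs fail too. A hedged workaround is to delete any file smaller than the expected size (2176768927 bytes, taken from the error message) so the next run re-downloads cleanly; `drop_partial` below is an illustrative helper, not part of the repo:

```python
import os

# Sketch: remove a partially downloaded vector file so torchtext starts a
# fresh download on the next run. EXPECTED comes from the error message.
EXPECTED = 2176768927  # total bytes reported in the ContentTooShortError

def drop_partial(path, expected=EXPECTED):
    """Delete `path` if it exists but is smaller than `expected` bytes."""
    if os.path.exists(path) and os.path.getsize(path) < expected:
        os.remove(path)
        return True
    return False

# Simulate a truncated download and clean it up:
with open("glove_partial.zip", "wb") as f:
    f.write(b"x" * 10)
removed = drop_partial("glove_partial.zip")
```

Alternatively, downloading the zip manually with a resumable client (e.g. `wget -c`) and placing it in `.vector_cache/` avoids the flaky connection entirely.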
Can you release the hyperparameters for SNLI and QQP?
Thanks.
Hey, hope you are doing well!!
Excellent work and model.
Could you share a location for the pre-trained models, so that I don't have to re-train?
Thanks in advance!!
Hi,
I noticed you limit the number of words only in the train phase, not in the dev/test phase. Could that be why the re-implementation accuracies are lower than the ones reported in the paper? Or does the original do the same?
Thanks,
Amir
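For comparison, applying the same `--max-sent-len` cap to every split is a one-line change per split; the sketch below shows the consistent version (whether the original paper truncates dev/test too is exactly the open question above, so treat this as one side of that experiment):

```python
# Sketch: apply the same max-sent-len truncation to train, dev, AND test,
# so preprocessing is identical across splits.

def truncate_split(sentences, max_len):
    """Clip every tokenized sentence in a split to at most max_len tokens."""
    return [sent[:max_len] for sent in sentences]

splits = {
    "train": [["a", "b", "c", "d"]],
    "dev":   [["e", "f", "g", "h", "i"]],
    "test":  [["j", "k"]],
}
capped = {name: truncate_split(s, max_len=3) for name, s in splits.items()}
```

Running evaluation both with and without dev/test truncation would show directly how much of the accuracy gap this explains.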
mldl@mldlUB1604:~/ub16_prj/BIMPM-pytorch$ python3 train.py
loading SNLI data...
downloading snli_1.0.zip
extracting
.vector_cache/glove.840B.300d.zip: 2.18GB [10:50, 3.35MB/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2196017/2196017 [03:46<00:00, 9702.68it/s]
training start!
epoch: 1
Traceback (most recent call last):
File "train.py", line 147, in <module>
main()
File "train.py", line 137, in main
best_model = train(args, data)
File "train.py", line 67, in train
pred = model(**kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 325, in __call__
result = self.forward(*input, **kwargs)
File "/home/mldl/ub16_prj/BIMPM-pytorch/model/BIMPM.py", line 265, in forward
mv_p_full_fw = mp_matching_func(con_p_fw, con_h_fw[:, -1, :], self.mp_w1)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 366, in __getattr__
type(self).__name__, name))
AttributeError: 'BIMPM' object has no attribute 'mp_w1'
Hi, when I train on the Quora dataset I get errors like the ones below:
epoch: 1
Traceback (most recent call last):
File "train.py", line 151, in <module>
main()
File "train.py", line 141, in main
best_model = train(args, data)
File "train.py", line 70, in train
pred = model(**kwargs)
File "/media/gaoya/disk/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/gaoya/disk/Applications/pytorch/NLP/BIMPM-pytorch-master/model/BIMPM.py", line 219, in forward
p = self.word_emb(kwargs['p'])
File "/media/gaoya/disk/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/gaoya/disk/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/modules/sparse.py", line 114, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "/media/gaoya/disk/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 1467, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #3 'index'
Can someone help me fix it?
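This error means the embedding weights live on the GPU while the index tensor is still on the CPU. The standard fix is to move each batch to the model's device before the forward pass; a minimal sketch with an `nn.Embedding` standing in for BIMPM's `word_emb`:

```python
import torch
import torch.nn as nn

# Sketch of the usual fix for "Expected object of backend CUDA but got
# backend CPU": model parameters and input indices must be on the same
# device, so move the batch to the model's device before calling it.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

emb = nn.Embedding(100, 8).to(device)     # stand-in for BIMPM's word_emb
batch = torch.randint(0, 100, (4, 6))     # index batch starts on the CPU
out = emb(batch.to(device))               # .to(device) aligns the backends
```

In the training loop, that presumably means calling `.to(device)` (or `.cuda()`) on the `p`/`h` tensors in `kwargs` before `model(**kwargs)`.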