galsang / BIMPM-pytorch
Re-implementation of BIMPM (Bilateral Multi-Perspective Matching for Natural Language Sentences, Zhiguo Wang et al.) in PyTorch.
I see this code:
self.TEXT.build_vocab(self.train, self.dev, self.test, vectors=GloVe(...))
As far as I know, shouldn't we construct the vocabulary only on the train set?
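For reference, here is a minimal sketch of what train-only vocabulary construction looks like, in plain Python rather than torchtext (the `build_vocab` / `numericalize` helpers below are illustrative, not the repo's API): any dev/test word not seen during training falls back to `<unk>`.

```python
# Sketch: build the vocabulary from the train split ONLY, so that
# dev/test-only words map to <unk> instead of getting their own ids.

def build_vocab(train_sentences, specials=("<unk>", "<pad>")):
    """Map each token seen in the train split to an integer id."""
    vocab = {tok: i for i, tok in enumerate(specials)}
    for sent in train_sentences:
        for tok in sent.split():
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab

def numericalize(vocab, sentence):
    """Unknown tokens (e.g. dev/test-only words) fall back to <unk>."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in sentence.split()]

vocab = build_vocab(["a man walks", "a dog runs"])
ids = numericalize(vocab, "a cat runs")  # "cat" was never seen in train
```

With torchtext the equivalent change would presumably be passing only `self.train` to `self.TEXT.build_vocab(...)`.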
Greetings,
I encountered this error when trying to run your code without changing anything.
Do you have any ideas on how to fix it?
RuntimeError: While copying the parameter named context_LSTM.weight_ih_l0, whose dimensions in the model are torch.Size([400, 350]) and whose dimensions in the checkpoint are torch.Size([400, 300]).
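The shape mismatch (350 vs. 300 on the LSTM input) suggests the checkpoint was saved with a different input size than the current model configuration (for example, with character embeddings toggled differently). If matching the training flags is not an option, one hedged workaround is to load only the tensors whose shapes agree; the `nn.Linear` layers below are stand-ins, not BIMPM's real architecture:

```python
import torch
import torch.nn as nn

# Sketch: copy only checkpoint tensors whose shapes match the current
# model, skipping mismatched ones such as context_LSTM.weight_ih_l0.

def load_matching(model, state_dict):
    """Load shape-compatible tensors; return the names that were skipped."""
    own = model.state_dict()
    kept = {k: v for k, v in state_dict.items()
            if k in own and v.shape == own[k].shape}
    own.update(kept)
    model.load_state_dict(own)
    return sorted(set(state_dict) - set(kept))

old = nn.Linear(300, 400)   # stands in for the saved model (input 300)
new = nn.Linear(350, 400)   # stands in for the current model (input 350)
skipped = load_matching(new, old.state_dict())  # weight skipped, bias kept
```

The skipped parameters keep their fresh initialization, so they would need fine-tuning; retraining with consistent flags is the cleaner fix.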
Hi, thanks for converting the model to PyTorch. But I always get a memory error when using this tool. I checked one of the existing issues and set --max-sent-len to 32, but I still get the same error. What else can I try to solve it?
$ python train.py --data-type Quora --max-sent-len 32
loading Quora data...
100%|████████████████████████████████████████████████████████████████████████████| 2196017/2196017 [02:21<00:00, 15546.98it/s]
Traceback (most recent call last):
File "train.py", line 148, in <module>
main()
File "train.py", line 127, in main
data = Quora(args)
File "/home/vba/dev/external/BIMPM-pytorch/model/utils.py", line 70, in __init__
self.TEXT.build_vocab(self.train, self.dev, self.test, vectors=GloVe(name='840B', dim=300))
File "/home/vba/anaconda3/lib/python3.6/site-packages/torchtext/vocab.py", line 347, in __init__
super(GloVe, self).__init__(name, url=url, **kwargs)
File "/home/vba/anaconda3/lib/python3.6/site-packages/torchtext/vocab.py", line 236, in __init__
self.cache(name, cache, url=url)
File "/home/vba/anaconda3/lib/python3.6/site-packages/torchtext/vocab.py", line 327, in cache
self.vectors = torch.Tensor(vectors).view(-1, dim)
MemoryError
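The `MemoryError` happens while materializing all ~2.2M GloVe rows as one tensor, so `--max-sent-len` cannot help. One memory-friendly workaround is to stream the GloVe text file and keep only the vectors for words actually in your vocabulary; the sketch below is pure Python over a tiny in-memory stand-in for the file (newer torchtext versions may also accept a `max_vectors` argument, if available in your install):

```python
import io

# Sketch: stream a GloVe-format text file and keep only the rows whose
# word appears in `vocab`, instead of loading all vectors into memory.

def load_needed_vectors(fileobj, vocab, dim):
    """Read a GloVe-style stream, keeping only vectors for `vocab` words."""
    vectors = {}
    for line in fileobj:
        parts = line.rstrip().split(" ")
        word = parts[0]
        if word in vocab and len(parts) == dim + 1:
            vectors[word] = [float(x) for x in parts[1:]]
    return vectors

# Tiny stand-in for the (unzipped) glove.840B.300d.txt file:
fake_glove = io.StringIO("the 0.1 0.2\ncat 0.3 0.4\ndog 0.5 0.6\n")
vecs = load_needed_vectors(fake_glove, vocab={"cat", "the"}, dim=2)
```

Using a smaller pre-trained set (e.g. `glove.6B.100d` instead of `840B.300d`) is another way to shrink the footprint, at some cost in accuracy.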
RuntimeError: Cannot initialize CUDA without ATen_cuda library. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -Wl,--no-as-needed in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols. You can check if this has occurred by using ldd on your binary to see if there is a dependency on *_cuda.so library
As far as I can tell from googling, this sort of issue doesn't occur when working with a GPU, but I am currently using a CPU-only PyTorch build. Any lead on solving this issue would be helpful.
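On a CPU-only install, the usual fix is to avoid touching any CUDA path: select the device defensively and map checkpoints to it at load time. A minimal sketch (the `nn.Linear` is a stand-in for BIMPM, and the in-memory buffer stands in for a checkpoint file):

```python
import io
import torch
import torch.nn as nn

# Sketch: pick the device defensively and map any saved checkpoint onto
# it, so a CPU-only build never tries to initialize CUDA.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)        # stand-in for BIMPM
buf = io.BytesIO()                        # stand-in for a checkpoint file
torch.save(model.state_dict(), buf)
buf.seek(0)

# map_location forces GPU-saved tensors onto the chosen device at load time
state = torch.load(buf, map_location=device)
model.load_state_dict(state)
```

If the script has a `--gpu`-style flag, passing the CPU-equivalent value (and auditing any unconditional `.cuda()` calls) should have the same effect.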
The code currently has some memory issues and requires a large amount of memory to run.
It will be optimized as soon as possible.
I have trained a model and am able to get far better results than with my previous method. But now I face performance issues: for some queries, the model takes more than 6 seconds to make a prediction. Any tips for reducing the prediction time would be helpful! :)
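A few generic PyTorch inference speedups usually help here: put the model in `eval()` mode, wrap prediction in `torch.no_grad()` to skip autograd bookkeeping, and batch queries so the recurrent encoder runs once per batch rather than once per sentence. A sketch with an `nn.LSTM` standing in for the model:

```python
import torch
import torch.nn as nn

# Sketch of the usual inference speedups: eval() disables dropout,
# torch.no_grad() skips gradient bookkeeping, and batching lets the
# LSTM process all queries in one forward pass.
model = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)  # stand-in
model.eval()

queries = torch.randn(32, 10, 8)      # 32 queries, length 10, dim 8
with torch.no_grad():
    out, _ = model(queries)           # one forward pass for all 32 queries
```

Beyond this, truncating very long inputs and running on a GPU (if available) are the other common levers.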
Traceback (most recent call last):
File "train.py", line 148, in <module>
main()
File "train.py", line 127, in main
data = Quora(args)
File "/home/ner/BIMPM-pytorch/model/utils.py", line 70, in __init__
self.TEXT.build_vocab(self.train, self.dev, self.test, vectors=GloVe(name='840B', dim=300))
File "/opt/conda/envs/bimpm/lib/python3.6/site-packages/torchtext/vocab.py", line 324, in __init__
super(GloVe, self).__init__(name, url=url, **kwargs)
File "/opt/conda/envs/bimpm/lib/python3.6/site-packages/torchtext/vocab.py", line 222, in __init__
self.cache(name, cache, url=url)
File "/opt/conda/envs/bimpm/lib/python3.6/site-packages/torchtext/vocab.py", line 242, in cache
urlretrieve(url, dest, reporthook=reporthook(t))
File "/opt/conda/envs/bimpm/lib/python3.6/urllib/request.py", line 289, in urlretrieve
% (read, size), result)
urllib.error.ContentTooShortError: <urlopen error retrieval incomplete: got only 1078111416 out of 2176768927 bytes>
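The `ContentTooShortError` means the GloVe download was interrupted, and the partial file left in `.vector_cache/` can make subsequent runs fail too. A hedged workaround is to delete any file smaller than the expected size (2176768927 bytes, taken from the error message) so the next run re-downloads cleanly; `drop_partial` below is an illustrative helper, not part of the repo:

```python
import os

# Sketch: remove a partially downloaded vector file so torchtext starts a
# fresh download on the next run. EXPECTED comes from the error message.
EXPECTED = 2176768927  # total bytes reported in the ContentTooShortError

def drop_partial(path, expected=EXPECTED):
    """Delete `path` if it exists but is smaller than `expected` bytes."""
    if os.path.exists(path) and os.path.getsize(path) < expected:
        os.remove(path)
        return True
    return False

# Simulate a truncated download and clean it up:
with open("glove_partial.zip", "wb") as f:
    f.write(b"x" * 10)
removed = drop_partial("glove_partial.zip")
```

Alternatively, downloading the zip manually with a resumable client (e.g. `wget -c`) and placing it in `.vector_cache/` avoids the flaky connection entirely.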
Can you release the hyperparameters for SNLI and QQP?
Thanks.
Hey, hope you are doing well!!
Excellent work and model.
Could you share a location for the pre-trained models, so that I don't have to re-train?
Thanks in advance!!
Hi,
I noticed you limit the number of words only in the train phase, not in the dev/test phase. Could that be why the re-implementation accuracies are lower than the ones reported in the paper? Or does the original do the same?
Thanks,
Amir
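For comparison, applying the same `--max-sent-len` cap to every split is a one-line change per split; the sketch below shows the consistent version (whether the original paper truncates dev/test too is exactly the open question above, so treat this as one side of that experiment):

```python
# Sketch: apply the same max-sent-len truncation to train, dev, AND test,
# so preprocessing is identical across splits.

def truncate_split(sentences, max_len):
    """Clip every tokenized sentence in a split to at most max_len tokens."""
    return [sent[:max_len] for sent in sentences]

splits = {
    "train": [["a", "b", "c", "d"]],
    "dev":   [["e", "f", "g", "h", "i"]],
    "test":  [["j", "k"]],
}
capped = {name: truncate_split(s, max_len=3) for name, s in splits.items()}
```

Running evaluation both with and without dev/test truncation would show directly how much of the accuracy gap this explains.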
mldl@mldlUB1604:~/ub16_prj/BIMPM-pytorch$ python3 train.py
loading SNLI data...
downloading snli_1.0.zip
extracting
.vector_cache/glove.840B.300d.zip: 2.18GB [10:50, 3.35MB/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2196017/2196017 [03:46<00:00, 9702.68it/s]
training start!
epoch: 1
Traceback (most recent call last):
File "train.py", line 147, in <module>
main()
File "train.py", line 137, in main
best_model = train(args, data)
File "train.py", line 67, in train
pred = model(**kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 325, in __call__
result = self.forward(*input, **kwargs)
File "/home/mldl/ub16_prj/BIMPM-pytorch/model/BIMPM.py", line 265, in forward
mv_p_full_fw = mp_matching_func(con_p_fw, con_h_fw[:, -1, :], self.mp_w1)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 366, in __getattr__
type(self).__name__, name))
AttributeError: 'BIMPM' object has no attribute 'mp_w1'
Hi, when I train on the Quora dataset I get errors like the ones below:
epoch: 1
Traceback (most recent call last):
File "train.py", line 151, in <module>
main()
File "train.py", line 141, in main
best_model = train(args, data)
File "train.py", line 70, in train
pred = model(**kwargs)
File "/media/gaoya/disk/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/gaoya/disk/Applications/pytorch/NLP/BIMPM-pytorch-master/model/BIMPM.py", line 219, in forward
p = self.word_emb(kwargs['p'])
File "/media/gaoya/disk/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/gaoya/disk/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/modules/sparse.py", line 114, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "/media/gaoya/disk/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 1467, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #3 'index'
Can someone help me fix it?
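This error means the embedding weights live on the GPU while the index tensor is still on the CPU. The standard fix is to move each batch to the model's device before the forward pass; a minimal sketch with an `nn.Embedding` standing in for BIMPM's `word_emb`:

```python
import torch
import torch.nn as nn

# Sketch of the usual fix for "Expected object of backend CUDA but got
# backend CPU": model parameters and input indices must be on the same
# device, so move the batch to the model's device before calling it.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

emb = nn.Embedding(100, 8).to(device)     # stand-in for BIMPM's word_emb
batch = torch.randint(0, 100, (4, 6))     # index batch starts on the CPU
out = emb(batch.to(device))               # .to(device) aligns the backends
```

In the training loop, that presumably means calling `.to(device)` (or `.cuda()`) on the `p`/`h` tensors in `kwargs` before `model(**kwargs)`.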