Comments (7)
The error message you have comes from this:
import numpy as np
embeddings = [np.zeros((64, 4096)), np.zeros((64, 4096))]
embeddings = np.vstack(embeddings) # no error
embeddings = [np.zeros((64, 4096)), np.zeros((64, 4096)), np.zeros((64, 3412))]
embeddings = np.vstack(embeddings) # error
# -> ValueError: all the input array dimensions except for the concatenation axis must match exactly
For some reason, one of the elements in "embeddings" is not of size (batch_size=128, emb_dim=4096). So there must be one or more elements with a shape different from (128, 4096).
- Just before the error at line 209, could you print the shape of each element in embeddings?
for batch in embeddings:
    print(batch.shape)
to see if we can spot the element with the wrong size.
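To automate that check, a small sketch (with made-up shapes, not InferSent's actual batches) that reports only the entries whose shape differs from the first one:

```python
import numpy as np

# hypothetical batches; the last one has the wrong shape
embeddings = [np.zeros((64, 4096)) for _ in range(3)] + [np.zeros((23, 4096))]

first = embeddings[0].shape
mismatched = [(i, b.shape) for i, b in enumerate(embeddings) if b.shape != first]
print(mismatched)  # -> [(3, (23, 4096))]
```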
-
What is in "sentences"? Can you check that you don't have an empty sentence?
-
What is the length of "sentences"?
-
Could you update pytorch to a more recent version and see if you still have the issue?
from infersent.
Thanks for the quick response.
I believe I'm on the latest torch version, 0.1.12_1. Is there a later version?
The length of "sentences" is 9815, and there are no zero-length sentences in the array.
This is the output from just before line 209:
Nb words kept : 128201/130068 (98.56 %)
(1, 64, 4096)
[... the same (1, 64, 4096) shape repeated, 153 batches in total ...]
(1, 23, 4096)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-8-3d88dd6254e6> in <module>()
1 tmp = sentences[:128]
----> 2 model.encode(sentences, tokenize=False, verbose=True)
/home/brian/InferSent/encoder/models.py in encode(self, sentences, bsize, tokenize, verbose)
210 for batch in embeddings:
211 print(batch.shape)
--> 212 embeddings = np.vstack(embeddings)
213
214 # unsort
/home/brian/anaconda3/envs/py2/lib/python2.7/site-packages/numpy/core/shape_base.pyc in vstack(tup)
235
236 """
--> 237 return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
238
239 def hstack(tup):
ValueError: all the input array dimensions except for the concatenation axis must match exactly
Oh ok I get it. Can you try to change the line in models.py here: https://github.com/facebookresearch/InferSent/blob/master/encoder/models.py#L67
emb = torch.max(sent_output, 0)[0]
into:
emb = torch.max(sent_output, 0)[0].squeeze(0)
and see if this works then?
That's working now. Thanks! I wonder why this didn't show up before?
@briandw So this is an issue linked to a change of policy in pytorch functions such as max, mean, sum, etc.
Say you have a tensor of size (23, 128, 4096). If you take torch.max (or torch.mean, ...) over the first dimension, you get a tensor of size:
(128, 4096) with recent versions of pytorch
(1, 128, 4096) with old versions of pytorch
So it means your version of pytorch is too old. I will update the requirements section in the README, and add an exception in models.py to handle this case.
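The two policies can be reproduced with numpy's keepdims flag, which mirrors the old and new pytorch reduction behaviour (a sketch for illustration, not InferSent code):

```python
import numpy as np

sent_output = np.zeros((23, 128, 4096))

# recent pytorch: the reduced dimension is dropped (like keepdims=False)
new_style = sent_output.max(axis=0)
print(new_style.shape)  # -> (128, 4096)

# old pytorch: the reduced dimension is kept as size 1 (like keepdims=True)
old_style = sent_output.max(axis=0, keepdims=True)
print(old_style.shape)  # -> (1, 128, 4096)
```

With the old behaviour every batch carries a leading size-1 dimension, so np.vstack concatenates along that axis and the final, shorter batch (1, 23, 4096) no longer matches the others, hence the ValueError above.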
Thanks
Getting this error:
setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (9815,) + inhomogeneous part.
on:
embeddings = infersent.encode(sentences, bsize=128, tokenize=False, verbose=True)
print('nb sentences encoded : {0}'.format(len(embeddings)))
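This looks like the same kind of shape mismatch as above, surfacing through a newer numpy: recent numpy versions raise this "inhomogeneous shape" ValueError when asked to build an array from a ragged list, instead of silently producing an object array. A minimal sketch (names and shapes are made up) that locates the offending entries rather than stacking blindly:

```python
import numpy as np

# hypothetical ragged list of embeddings: one entry has the wrong length
rows = [np.zeros(4096), np.zeros(4096), np.zeros(100)]

# recent numpy raises the "inhomogeneous shape" ValueError on np.array(rows);
# find which entries don't match the first one's shape before stacking
bad = [i for i, r in enumerate(rows) if r.shape != rows[0].shape]
print("entries with a mismatched shape:", bad)  # -> [2]
```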