jnhwkim / cbp
Multimodal Compact Bilinear Pooling for Torch7
License: Other
Hi, I came here because of your answer in the vqa-mcb repository.
I want to use the cbp layer in Torch for VQA, but the results are very poor, and I am not sure whether this is caused by a programming error on my side.
Is there anything we need to pay attention to when using the cbp layer for VQA in Torch?
Could you share your Torch code that uses the cbp layer for VQA?
Thanks!
Does the current version only support bilinear pooling of two vectors of the same length? Is it possible to support pooling inputs of shape (n, dimA) and (n, dimB) into an output of shape (n, compact_dimension)?
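To make the question concrete, here is a minimal sketch of why compact bilinear pooling does not, in principle, require equal input lengths: each input gets its own count sketch into a shared output dimension, and the two sketches are combined by circular convolution. This is plain Python written for illustration, not the repository's implementation; all function names are hypothetical, and a direct O(d^2) circular convolution stands in for the FFT normally used.

```python
import random

def make_sketch_params(in_dim, out_dim, rng):
    """Sample a bucket h[i] in [0, out_dim) and a sign s[i] in {-1, +1} per input coordinate."""
    h = [rng.randrange(out_dim) for _ in range(in_dim)]
    s = [rng.choice((-1, 1)) for _ in range(in_dim)]
    return h, s

def count_sketch(x, h, s, out_dim):
    """Project x to out_dim by adding s[i] * x[i] into bucket h[i]."""
    out = [0] * out_dim
    for xi, hi, si in zip(x, h, s):
        out[hi] += si * xi
    return out

def circular_convolution(a, b):
    """Direct circular convolution of two equal-length vectors (stand-in for FFT)."""
    d = len(a)
    out = [0] * d
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % d] += ai * bj
    return out

def compact_bilinear(x, y, hx, sx, hy, sy, out_dim):
    """Sketch each input separately, then convolve the two sketches."""
    return circular_convolution(count_sketch(x, hx, sx, out_dim),
                                count_sketch(y, hy, sy, out_dim))

def sketch_of_outer_product(x, y, hx, sx, hy, sy, out_dim):
    """Reference: count sketch of the full outer product x y^T, computed explicitly."""
    out = [0] * out_dim
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[(hx[i] + hy[j]) % out_dim] += sx[i] * sy[j] * xi * yj
    return out

rng = random.Random(0)
out_dim = 4
x = [1, -2, 3, 0, 4]   # dimA = 5
y = [2, 1, -1]         # dimB = 3
hx, sx = make_sketch_params(len(x), out_dim, rng)
hy, sy = make_sketch_params(len(y), out_dim, rng)

pooled = compact_bilinear(x, y, hx, sx, hy, sy, out_dim)
reference = sketch_of_outer_product(x, y, hx, sx, hy, sy, out_dim)
```

Here `pooled` and `reference` agree exactly, and both have length `out_dim` even though the inputs have lengths 5 and 3. Whether the Torch7 module accepts inputs of different lengths is a separate question that this sketch does not answer.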
The gradient of the signed square root, sign(x) * sqrt(abs(x)), is 1 / (2 * sqrt(abs(x))). As x approaches zero, this gradient grows without bound.
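One possible workaround, shown here as a sketch and not as what the repository does, is to leave the forward pass unchanged and clamp abs(x) from below in the backward pass, which bounds the gradient by 1 / (2 * sqrt(eps)). The function names and the `eps` parameter are assumptions made for illustration.

```python
import math

def signed_sqrt(x):
    """Forward pass: sign(x) * sqrt(|x|)."""
    return math.copysign(math.sqrt(abs(x)), x)

def signed_sqrt_grad(x, eps=1e-8):
    """Backward pass with |x| clamped to at least eps (hypothetical safeguard).

    Without the clamp the derivative 1 / (2 * sqrt(|x|)) is unbounded near zero;
    with it the value never exceeds 1 / (2 * sqrt(eps)).
    """
    return 1.0 / (2.0 * math.sqrt(max(abs(x), eps)))

grad_far_from_zero = signed_sqrt_grad(4.0)        # 1 / (2 * 2)
grad_at_zero = signed_sqrt_grad(0.0, eps=1e-4)    # capped at 1 / (2 * sqrt(1e-4))
```

The choice of `eps` trades gradient magnitude against fidelity near zero, and a suitable value depends on the scale of the pooled features.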
Hi,
I'm getting poor performance after loading a gmodule that contains a cbp layer for classification.
It works well before I save the module (using torch.save) and load it again.
I suspect something is wrong with the parameters after loading, but I'm not sure what exactly the module parameters are. As I understand it, we only sample h and s. Should I reset the module after each iteration, or keep h and s fixed? Am I missing something that might cause the loading problem?
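If the cause is that h and s differ between training and the reloaded model (an assumption; nothing above confirms it), the layers trained on top of the pooled features would receive a different random projection and perform poorly. The sketch below, in plain Python with hypothetical function names, illustrates the general point that the sampled h and s must be kept fixed, for example by saving them or by sampling them from a fixed seed.

```python
import random

def sample_h_s(in_dim, out_dim, seed):
    """Sample count-sketch buckets h and signs s from a dedicated, seeded generator."""
    rng = random.Random(seed)
    h = [rng.randrange(out_dim) for _ in range(in_dim)]
    s = [rng.choice((-1, 1)) for _ in range(in_dim)]
    return h, s

def count_sketch(x, h, s, out_dim):
    """Project x to out_dim by adding s[i] * x[i] into bucket h[i]."""
    out = [0.0] * out_dim
    for xi, hi, si in zip(x, h, s):
        out[hi] += si * xi
    return out

x = [0.5, -1.0, 2.0, 3.0, -0.25, 1.5]
out_dim = 4

# "Training time" and "after reload" use the same seed, so h and s are identical.
h_train, s_train = sample_h_s(len(x), out_dim, seed=123)
h_reload, s_reload = sample_h_s(len(x), out_dim, seed=123)

features_train = count_sketch(x, h_train, s_train, out_dim)
features_reload = count_sketch(x, h_reload, s_reload, out_dim)
same_projection = features_train == features_reload
```

With a different seed at reload time the two feature vectors would generally differ, which is the failure mode the question is asking about.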
Thanks!
Hi everyone, thank you for the great code.
Can anyone tell me how to run the code on CPU only? I don't have a GPU on my laptop, and when I run the code I get the error "no CUDA-capable device is detected".
Thanks,
Kien.
Hi. According to the papers, CBP performs very well on recognition and VQA.
The idea of learning features in two streams and merging them into one should be a general method.
However, I tested CBP on local feature matching and the performance is very poor.
Have you tested the method on other problems and found that it did not perform well?
Or are there any special requirements to meet before I insert CBP into a network (batch size, learning rate)?
To be more precise, I use the following layers at the end of my network:
local temp_m = nn.ConcatTable()
local A_net = nn.Sequential()
......
A_net:add(nn.Linear(4096,512))
local B_net = nn.Sequential()
......
B_net:add(nn.Linear(4096,512))
temp_m:add(A_net)
temp_m:add(B_net)
model:add(temp_m)
model:add(nn.CompactBilinearPooling(dim,true))
model:add(nn.SignedSquareRoot())
model:add(nn.Normalize(2))
Thank you for your reply.
I found that this code can only handle input with 2 dimensions.
In the Berkeley paper, the cbp layer is used for attention, so the input becomes 2048×14×14 or 2048×196. With a batch size of 20, the input to the cbp layer becomes 20×2048×196, which has 3 dimensions.
I found assert(false, '# of dimensions > 2') in CompactBilinearPooling.lua.
How should the code be modified if the input has 3 or more dimensions?
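One common way to reuse a layer that pools pairs of vectors, sketched here under assumptions and not taken from the repository, is to treat every spatial location as its own sample: pull out the channel vector at each location, pool it with the corresponding vector from the other input, and collect the results per location. The example is plain Python on nested lists; `pool_over_locations` and `pool_fn` are hypothetical names, and the element-wise product only stands in for the actual compact bilinear pooling step.

```python
def pool_over_locations(a, b, pool_fn):
    """Apply a vector-pair pooling function at every location.

    a, b: nested lists shaped (batch, channels, locations).
    Returns a nested list shaped (batch, locations, out_dim).
    """
    out = []
    for sample_a, sample_b in zip(a, b):
        locations = len(sample_a[0])
        rows = []
        for p in range(locations):
            # Channel vector at location p for each of the two inputs.
            va = [channel[p] for channel in sample_a]
            vb = [channel[p] for channel in sample_b]
            rows.append(pool_fn(va, vb))
        out.append(rows)
    return out

def elementwise_product(u, v):
    """Stand-in for the 2-D pooling step; NOT compact bilinear pooling."""
    return [ui * vi for ui, vi in zip(u, v)]

# Tiny example: batch = 2, channels = 3, locations = 4.
a = [[[n + c + p for p in range(4)] for c in range(3)] for n in range(2)]
b = [[[1 + n * c * p for p in range(4)] for c in range(3)] for n in range(2)]

pooled = pool_over_locations(a, b, elementwise_product)
shape = (len(pooled), len(pooled[0]), len(pooled[0][0]))
```

In a tensor library the same idea usually amounts to transposing and reshaping (batch, channels, locations) to (batch × locations, channels), calling the 2-D layer, and reshaping the result back.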