capsule-network-tutorial's People

Contributors

higgsfield

capsule-network-tutorial's Issues

test accuracy

I don't know why the training accuracy keeps getting lower and lower as the epochs go on. Besides, the margin loss and reconstruction loss can't be calculated.

Doesn't appear to be training, even after applying every fix in the issues (open+closed)

Please take more care of your repository.

Plenty of PRs unmerged. Suggestions unaddressed. It's not clear which ones to implement and which ones to ignore.

Initial code doesn't run.

After tweaking:

train accuracy: 0.07
train accuracy: 0.17
train accuracy: 0.14
train accuracy: 0.07
train accuracy: 0.07

I don't see how you get the results your notebook claims.

This has proven to be a headache and a frustration. Clearly a great deal of effort and intelligence went into building this, so why not put in the extra 1% and leave behind a valuable, tidy, clean resource?

MLP Capsule Examples?

Awesome repo. I've been trying to implement an MLP-based caps net, where the primary capsules analyze subsets of features in groups, feeding each group into a per-capsule MLP and then pushing the results to downstream capsules.

For some reason, I am getting terrible results. Do you know what I might be doing wrong?

What makes MLP capsules different from convolutional primary capsules, and what should I be aware of? How do I optimize to get good results? I think dynamic routing is essential for establishing a hierarchy, but in a non-image context I'm unsure how the affine transformation is helpful.

Can you post code on how one might do this using MLP instead of convolutional layers?

RuntimeError: index_select() and an issue with DataParallel

First of all, thanks: it's definitely an easy-to-follow CapsNet tutorial for a beginner like me. But I found an error after running the code:

RuntimeError: index_select(): argument 'index' must be Variable, not torch.cuda.LongTensor

I solved this issue the same way as gram-ai/capsule-networks#13: in the Decoder class,

masked = masked.index_select(dim=0, index=max_length_indices.squeeze(1).data)

the ".data" should be removed.

Then I successfully trained on a single GPU following this tutorial, but when I tried to train the net on two GPUs following the PyTorch data parallelism tutorial:

if USE_CUDA:
    print("Let's use %d GPUs" % torch.cuda.device_count())
    capsule_net = nn.DataParallel(capsule_net).cuda()

it produced the error:

AttributeError: 'DataParallel' object has no attribute 'loss'

I'm confused, and if there is any good solution, please tell me, thanks!

(I use Python 2.7.12 and PyTorch 0.3.0.post4.)
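A common explanation (not from this repo) is that nn.DataParallel only forwards __call__/forward; custom methods such as loss live on the wrapped model and must be reached through .module. A minimal sketch, with TinyNet as a hypothetical stand-in for the capsule network:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    """Stand-in for the capsule network: a module with a custom loss method."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

    def loss(self, output, target):
        return F.mse_loss(output, target)

net = TinyNet()
use_parallel = torch.cuda.is_available() and torch.cuda.device_count() > 1
model = nn.DataParallel(net).cuda() if use_parallel else net

# DataParallel forwards only the forward pass; custom attributes such as
# .loss stay on the wrapped module and are reached via model.module.
inner = model.module if isinstance(model, nn.DataParallel) else model

x, y = torch.randn(3, 4), torch.randn(3, 2)
loss = inner.loss(model(x), y)
```

So in the snippet above, capsule_net.loss(...) would become capsule_net.module.loss(...) after wrapping.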

I cannot make it work with CIFAR10

I wanted to try working on CIFAR10, so I modified the channel number to 3 and the kernel dim to 24 in the conv layer. But it fails in PrimaryCaps at u = u.view(x.size(0), 32 * 6 * 6, -1), which gives me this error:

<ipython-input-4-7b4e2b87bd5c> in forward(self, x)
     14         print("PrimaryCaps {}".format(x.size()))
     15         # u = u.view(x.size(0), 32 * 6 * 6, -1)
---> 16         u = u.view(x.size(0), 32 * 6 * 6, -1)
     17         return self.squash(u)
     18

RuntimeError: invalid argument 2: size '[100 x 1152 x -1]' is invalid for input with 204800 elements at /pytorch/aten/src/TH/THStorage.c:37

Do you have any advice for me?
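For reference, the 6 x 6 in that reshape is the PrimaryCaps spatial size for 28 x 28 MNIST inputs; with the same two 9 x 9 convolutions (stride 1, then stride 2), a 32 x 32 CIFAR10 input comes out at 8 x 8 instead. A quick check of the arithmetic:

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Spatial size after a square convolution, no dilation.
    return (size + 2 * padding - kernel) // stride + 1

# MNIST: 28 -> Conv1 (9x9, stride 1) -> 20 -> PrimaryCaps (9x9, stride 2) -> 6
mnist = conv_out(conv_out(28, 9), 9, stride=2)

# CIFAR10: 32 -> 24 -> 8
cifar = conv_out(conv_out(32, 9), 9, stride=2)
```

So, assuming the standard front end is kept, the reshape would need to become u.view(x.size(0), 32 * 8 * 8, -1), and num_routes in the digit-capsule layer would have to change to 32 * 8 * 8 = 2048 to match.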

margin loss wrong

Hello, I think your margin loss is wrong: according to the paper, there is a square after the ReLU that you forgot.
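For comparison, a sketch of the margin loss as written in the paper (Eq. 4), with the square applied after each max term; v_norm and labels are illustrative names, not the repo's:

```python
import torch

def margin_loss(v_norm, labels, m_plus=0.9, m_minus=0.1, lam=0.5):
    # L_k = T_k * max(0, m+ - ||v_k||)^2 + lam * (1 - T_k) * max(0, ||v_k|| - m-)^2
    left = torch.relu(m_plus - v_norm) ** 2    # note the square
    right = torch.relu(v_norm - m_minus) ** 2  # note the square
    loss = labels * left + lam * (1.0 - labels) * right
    return loss.sum(dim=1).mean()

# Per-class capsule lengths and one-hot labels, purely illustrative:
v_norm = torch.tensor([[0.95, 0.05]])
labels = torch.tensor([[1.0, 0.0]])
```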

Softmax in routing algorithm incorrect?

Hi,
I think the softmax in the routing algorithm is being calculated over the wrong dimension.

Currently the code has:

b_ij = Variable(torch.zeros(1, self.num_routes, self.num_capsules, 1))
...
for iteration in range(num_iterations):
    c_ij = F.softmax(b_ij)

Since the dim parameter is not passed to the F.softmax call, it will pick dim=1 and compute the softmax over the self.num_routes dimension (the 1152 input capsules here). Instead, the softmax should be computed so that the coupling coefficients c_ij between each input capsule and all the capsules in the next layer sum to 1.

Thus the correct call should be:

c_ij = F.softmax(b_ij, dim=2)
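A quick check of what each choice of dim normalizes, using the same logit shape as above:

```python
import torch
import torch.nn.functional as F

# Same shape as the routing logits: (1, num_routes, num_capsules, 1)
b_ij = torch.zeros(1, 1152, 10, 1)

c_default = F.softmax(b_ij, dim=1)  # normalizes over the 1152 input capsules
c_fixed = F.softmax(b_ij, dim=2)    # normalizes over the 10 output capsules

# For one input capsule, the coupling coefficients across the 10 output
# capsules should sum to 1. That holds only for dim=2.
per_input_fixed = c_fixed[0, 0, :, 0].sum()
per_input_default = c_default[0, 0, :, 0].sum()
```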

Class PrimaryCaps initialization parameters wrong?

class PrimaryCaps(nn.Module):
    def __init__(self, num_capsules=8, in_channels=256, out_channels=32, kernel_size=9):

Per paper: "The second layer (PrimaryCapsules) is a convolutional capsule layer with 32 channels of convolutional
8D capsules (i.e. each primary capsule contains 8 convolutional units with a 9 × 9 kernel and a stride
of 2)"

This indicates that num_capsules should be 32 and out_channels should be 8.
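Both parameterizations describe the same tensor budget: 8 convolutions with 32 output channels each, or 32 channels of 8D capsules, both yield 32 * 6 * 6 = 1152 eight-dimensional capsules per sample; what differs is which axis indexes capsules versus capsule dimensions. A small shapes-only illustration (not the repo's code):

```python
import torch

batch = 2
# Tutorial layout: 8 convolutions with 32 output channels, stacked on dim 1
tutorial_u = torch.randn(batch, 8, 32, 6, 6)
# Paper layout: 32 channels of 8D capsules on the same 6x6 grid
paper_u = torch.randn(batch, 32, 8, 6, 6)

# Either layout flattens to 1152 capsules of dimension 8; only the axis
# that holds the 8D capsule components moves.
num_caps = 32 * 6 * 6
flat_tutorial = tutorial_u.permute(0, 2, 3, 4, 1).reshape(batch, num_caps, 8)
flat_paper = paper_u.permute(0, 1, 3, 4, 2).reshape(batch, num_caps, 8)
```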

weights are NaN

Hello, I copied the code into PyCharm, but I found that the weights are NaN. Why is that?
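One common cause of NaNs in CapsNet implementations (a guess, not confirmed for this repo) is the squash function dividing by a zero vector norm; adding a small epsilon keeps the gradient and output finite. A sketch:

```python
import torch

def squash_unsafe(u):
    # As often written: u / ||u|| becomes 0/0 = NaN when u is the zero vector.
    norm = u.norm(dim=-1, keepdim=True)
    return (norm ** 2 / (1.0 + norm ** 2)) * u / norm

def squash_safe(u, eps=1e-8):
    # A small epsilon under the square root keeps the division finite.
    norm_sq = (u ** 2).sum(dim=-1, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * u / torch.sqrt(norm_sq + eps)

zero = torch.zeros(1, 3)
unsafe_out = squash_unsafe(zero)  # contains NaN
safe_out = squash_safe(zero)      # stays zero
```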
