
Comments (4)

InnovArul commented on August 17, 2024

I had the same question.

The logits have shape [10, 100, 1152, 1, 16],
i.e., [num_digits, batch_size, num_prevlayer_capsules, 1, digits_features_dim].

Since the softmax is taken over dim=2, it normalizes across the 1152 primary capsules.

My reading of this is as follows:

With dim=2, each of the 10 digit capsules selects which of the 1152 primary capsules it accepts input from.
But the paper and other sources (YouTube, blogs, etc.) say that each of the 1152 primary capsules decides which of the 10 digit capsules it sends its output to. So there seems to be a mismatch.

In simple words, with the softmax on dim=2, the 1152 lower-level capsules compete with each other for a share of each digit capsule's input, rather than each one splitting its output across the 10 digit capsules.

from capsule-networks.

CoderHHX commented on August 17, 2024

I have the same question. I read some other CapsuleNet implementations in TensorFlow and PyTorch, and I think the softmax over the logits of shape [10, 100, 1152, 1, 16] should be applied along dim 0, i.e. probs = softmax(logits, dim=0), as the original paper describes.
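For reference, the coupling-coefficient softmax in the paper (Sabour et al., "Dynamic Routing Between Capsules") can be written as below. Here i indexes a lower-level (primary) capsule and j or k a higher-level (digit) capsule; mapping the sum over k onto dim 0 assumes the [num_digits, ...] layout quoted in this thread.

```latex
c_{ij} = \frac{\exp(b_{ij})}{\sum_{k} \exp(b_{ik})}
```

The sum runs over the digit capsules k for a fixed primary capsule i, so the coefficients leaving one primary capsule add up to 1.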


InnovArul commented on August 17, 2024

I think the two choices, softmax along dim=0 and along dim=2, can be read as follows:

  1. softmax along dim=0:
    Each primary capsule (= 1152) decides how much information it passes to each of the digit capsules (= 10). (According to the paper)

  2. softmax along dim=2:
    Each digit capsule (= 10) chooses how much information it takes from each of the primary capsules (= 1152). (According to this implementation)

Since 1 and 2 give more or less the same performance, I am not sure how to reason about it.
@CoderHHX Do you have any intuition?
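The two options above can be sketched on a toy example. This is a minimal pure-Python illustration, not the repository's code: the logits are reduced to a 2-D table of shape [num_digits, num_primary] (the batch and trailing dimensions from the thread are dropped), and the helper names `softmax_over_digits` and `softmax_over_primary` are made up for this sketch.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_over_digits(logits):
    """Option 1 (paper, like dim=0): normalize across digit capsules."""
    num_digits, num_primary = len(logits), len(logits[0])
    columns = [
        softmax([logits[j][i] for j in range(num_digits)])
        for i in range(num_primary)
    ]
    # Transpose back to the [num_digits, num_primary] layout.
    return [[columns[i][j] for i in range(num_primary)] for j in range(num_digits)]

def softmax_over_primary(logits):
    """Option 2 (this implementation, like dim=2): normalize across primary capsules."""
    return [softmax(row) for row in logits]

# Toy logits: 3 digit capsules (rows) x 4 primary capsules (columns).
logits = [
    [0.0, 1.0, 2.0, 0.5],
    [1.0, 0.0, 0.0, 0.5],
    [2.0, 1.0, 0.0, 0.5],
]

by_digits = softmax_over_digits(logits)
by_primary = softmax_over_primary(logits)

# Option 1: what each primary capsule sends out sums to 1.
column_sums = [sum(by_digits[j][i] for j in range(3)) for i in range(4)]
# Option 2: what each digit capsule takes in sums to 1.
row_sums = [sum(row) for row in by_primary]

print(column_sums)
print(row_sums)
```

Under option 1 every column sums to 1 and the rows generally do not; under option 2 it is the other way around.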


CoderHHX commented on August 17, 2024

@InnovArul Thanks for your reply! In my opinion, if we want to follow the original paper, we should set dim to 0. As you say, with dim set to 2 the model can achieve performance similar to the original. I think this may be because normalizing the routing weights over PrimaryCaps or over DigitCaps has an equivalent effect: either way, the capsule transformation still takes place.
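One way to probe this intuition is to compare the weighted sums that reach each digit capsule under both normalizations. The sketch below is a toy, not the repository's code: `route`, the 2 x 4 sizes, and the prediction vectors are all made up for illustration, and the squash nonlinearity and the logit updates are left out. With all-zero logits, as at the start of routing, the two results differ only by the constant factor num_primary / num_digits.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def route(logits, predictions, over):
    """Weighted sum of prediction vectors for each digit capsule.

    logits[j][i]      routing logit between digit capsule j and primary capsule i
    predictions[j][i] prediction vector from primary capsule i for digit capsule j
    over              "digits" (paper, dim=0) or "primary" (dim=2 in this thread)
    """
    num_digits, num_primary = len(logits), len(logits[0])
    if over == "digits":
        cols = [
            softmax([logits[j][i] for j in range(num_digits)])
            for i in range(num_primary)
        ]
        coeffs = [[cols[i][j] for i in range(num_primary)] for j in range(num_digits)]
    else:
        coeffs = [softmax(row) for row in logits]
    dim = len(predictions[0][0])
    return [
        [
            sum(coeffs[j][i] * predictions[j][i][d] for i in range(num_primary))
            for d in range(dim)
        ]
        for j in range(num_digits)
    ]

# Hypothetical toy data: 2 digit capsules, 4 primary capsules, 2-D predictions.
predictions = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]],
    [[0.0, 2.0], [1.0, 1.0], [0.0, 0.0], [1.0, 3.0]],
]
zero_logits = [[0.0] * 4 for _ in range(2)]  # uniform logits, as before any update

sums_paper = route(zero_logits, predictions, over="digits")
sums_impl = route(zero_logits, predictions, over="primary")

print(sums_paper)
print(sums_impl)
```

In this uniform case the two outputs point in the same direction and differ only in length, which a later squash step would rescale anyway. Once the logits are updated and stop being uniform, the two normalizations are no longer related by a single constant.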

