capsule-network-tutorial's People

Contributors

higgsfield

capsule-network-tutorial's Issues

test accuracy

I don't know why the training accuracy keeps getting lower and lower as the epochs go on. Besides, the margin loss and reconstruction loss can't be calculated.

Doesn't appear to be training, even after applying every fix in the issues (open+closed)

Please take more care of your repository.

Plenty of PRs unmerged. Suggestions unaddressed. It's not clear which ones to implement and which ones to ignore.

Initial code doesn't run.

After tweaking:

train accuracy: 0.07
train accuracy: 0.17
train accuracy: 0.14
train accuracy: 0.07
train accuracy: 0.07

I don't see how you get the results your notebook claims.

This has proven to be a headache and a frustration. Clearly a great deal of effort and intelligence went into building this, so why not put in the extra 1% and leave behind a valuable, tidy, clean resource?

MLP Capsule Examples?

Awesome repo. I've been trying to implement an MLP-based caps net, where the primary capsules analyze subsets of features in groups, feeding each group into a per-capsule MLP and then pushing the results to downstream capsules.

For some reason, I am getting terrible results. Do you know what I might be doing wrong?

What makes MLP capsules different from convolutional primary capsules, and what should I be aware of? How do I optimize to get good results? I think dynamic routing is essential for establishing a hierarchy, but in a non-image context I'm unsure how the affine transformation is helpful.

Can you post code on how one might do this using MLP instead of convolutional layers?

RuntimeError: index_select() and an issue with DataParallel

First of all, thanks: it's definitely an easy-to-follow CapsNet tutorial for a beginner like me. But I found an error after running the code:

RuntimeError: index_select(): argument 'index' must be Variable, not torch.cuda.LongTensor

I solved this issue the same way as gram-ai/capsule-networks#13: in the Decoder class,

masked = masked.index_select(dim=0, index=max_length_indices.squeeze(1).data)

the ".data" should be removed.

Then I successfully trained on a single GPU following this tutorial, but when I tried to train the net on two GPUs following the PyTorch data parallelism tutorial:

if USE_CUDA:
    print("Let's use %d GPUs" % torch.cuda.device_count())
    capsule_net = nn.DataParallel(capsule_net).cuda()

it produced the error:

AttributeError: 'DataParallel' object has no attribute 'loss'

I'm confused, and if there is any good solution, please tell me, thanks!

(I use Python 2.7.12 and PyTorch 0.3.0.post4.)
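A common explanation (not from this repo) is that nn.DataParallel only forwards __call__/forward; custom methods such as loss live on the wrapped model and must be reached through .module. A minimal sketch, with TinyNet as a hypothetical stand-in for the capsule network:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    """Stand-in for the capsule network: a module with a custom loss method."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

    def loss(self, output, target):
        return F.mse_loss(output, target)

net = TinyNet()
use_parallel = torch.cuda.is_available() and torch.cuda.device_count() > 1
model = nn.DataParallel(net).cuda() if use_parallel else net

# DataParallel forwards only the forward pass; custom attributes such as
# .loss stay on the wrapped module and are reached via model.module.
inner = model.module if isinstance(model, nn.DataParallel) else model

x, y = torch.randn(3, 4), torch.randn(3, 2)
loss = inner.loss(model(x), y)
```

So in the snippet above, capsule_net.loss(...) would become capsule_net.module.loss(...) after wrapping.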

I cannot make it work with CIFAR10

I wanted to try working on CIFAR10, so I modified the channel number to 3 and the kernel dim to 24 in the conv layer. But it fails in PrimaryCaps at u = u.view(x.size(0), 32 * 6 * 6, -1), which gives me this error:

<ipython-input-4-7b4e2b87bd5c> in forward(self, x)
     14         print("PrimaryCaps {}".format(x.size()))
     15         # u = u.view(x.size(0), 32 * 6 * 6, -1)
---> 16         u = u.view(x.size(0), 32 * 6 * 6, -1)
     17         return self.squash(u)
     18

RuntimeError: invalid argument 2: size '[100 x 1152 x -1]' is invalid for input with 204800 elements at /pytorch/aten/src/TH/THStorage.c:37

Do you have any advice for me?
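For reference, the 6 x 6 in that reshape is the PrimaryCaps spatial size for 28 x 28 MNIST inputs; with the same two 9 x 9 convolutions (stride 1, then stride 2), a 32 x 32 CIFAR10 input comes out at 8 x 8 instead. A quick check of the arithmetic:

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Spatial size after a square convolution, no dilation.
    return (size + 2 * padding - kernel) // stride + 1

# MNIST: 28 -> Conv1 (9x9, stride 1) -> 20 -> PrimaryCaps (9x9, stride 2) -> 6
mnist = conv_out(conv_out(28, 9), 9, stride=2)

# CIFAR10: 32 -> 24 -> 8
cifar = conv_out(conv_out(32, 9), 9, stride=2)
```

So, assuming the standard front end is kept, the reshape would need to become u.view(x.size(0), 32 * 8 * 8, -1), and num_routes in the digit-capsule layer would have to change to 32 * 8 * 8 = 2048 to match.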

margin loss wrong

Hello, I think your margin loss is wrong: according to the paper, there is a square after the ReLU that you forgot.
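For comparison, a sketch of the margin loss as written in the paper (Eq. 4), with the square applied after each max term; v_norm and labels are illustrative names, not the repo's:

```python
import torch

def margin_loss(v_norm, labels, m_plus=0.9, m_minus=0.1, lam=0.5):
    # L_k = T_k * max(0, m+ - ||v_k||)^2 + lam * (1 - T_k) * max(0, ||v_k|| - m-)^2
    left = torch.relu(m_plus - v_norm) ** 2    # note the square
    right = torch.relu(v_norm - m_minus) ** 2  # note the square
    loss = labels * left + lam * (1.0 - labels) * right
    return loss.sum(dim=1).mean()

# Per-class capsule lengths and one-hot labels, purely illustrative:
v_norm = torch.tensor([[0.95, 0.05]])
labels = torch.tensor([[1.0, 0.0]])
```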

Softmax in routing algorithm incorrect?

Hi,
I think the softmax in the routing algorithm is being calculated over the wrong dimension.

Currently the code has:

b_ij = Variable(torch.zeros(1, self.num_routes, self.num_capsules, 1))
...
for iteration in range(num_iterations):
    c_ij = F.softmax(b_ij)

Since the dim parameter is not passed to the F.softmax call, it will pick dim=1 and compute the softmax over the self.num_routes dimension (the 1152 input capsules here). Instead, the softmax should be computed so that the coupling coefficients c_ij between each input capsule and all the capsules in the next layer sum to 1.

Thus the correct call should be:

c_ij = F.softmax(b_ij, dim=2)
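A quick check of what each choice of dim normalizes, using the same logit shape as above:

```python
import torch
import torch.nn.functional as F

# Same shape as the routing logits: (1, num_routes, num_capsules, 1)
b_ij = torch.zeros(1, 1152, 10, 1)

c_default = F.softmax(b_ij, dim=1)  # normalizes over the 1152 input capsules
c_fixed = F.softmax(b_ij, dim=2)    # normalizes over the 10 output capsules

# For one input capsule, the coupling coefficients across the 10 output
# capsules should sum to 1. That holds only for dim=2.
per_input_fixed = c_fixed[0, 0, :, 0].sum()
per_input_default = c_default[0, 0, :, 0].sum()
```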

Class PrimaryCaps initialization parameters wrong?

class PrimaryCaps(nn.Module):
    def __init__(self, num_capsules=8, in_channels=256, out_channels=32, kernel_size=9):

Per paper: "The second layer (PrimaryCapsules) is a convolutional capsule layer with 32 channels of convolutional
8D capsules (i.e. each primary capsule contains 8 convolutional units with a 9 × 9 kernel and a stride
of 2)"

This indicates that num_capsules should be 32 and out_channels should be 8.
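Both parameterizations describe the same tensor budget: 8 convolutions with 32 output channels each, or 32 channels of 8D capsules, both yield 32 * 6 * 6 = 1152 eight-dimensional capsules per sample; what differs is which axis indexes capsules versus capsule dimensions. A small shapes-only illustration (not the repo's code):

```python
import torch

batch = 2
# Tutorial layout: 8 convolutions with 32 output channels, stacked on dim 1
tutorial_u = torch.randn(batch, 8, 32, 6, 6)
# Paper layout: 32 channels of 8D capsules on the same 6x6 grid
paper_u = torch.randn(batch, 32, 8, 6, 6)

# Either layout flattens to 1152 capsules of dimension 8; only the axis
# that holds the 8D capsule components moves.
num_caps = 32 * 6 * 6
flat_tutorial = tutorial_u.permute(0, 2, 3, 4, 1).reshape(batch, num_caps, 8)
flat_paper = paper_u.permute(0, 1, 3, 4, 2).reshape(batch, num_caps, 8)
```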

weights are NaN

Hello, I copied the code into PyCharm, but I found that the weights are NaN. Why is that?
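One common cause of NaNs in CapsNet implementations (a guess, not confirmed for this repo) is the squash function dividing by a zero vector norm; adding a small epsilon keeps the gradient and output finite. A sketch:

```python
import torch

def squash_unsafe(u):
    # As often written: u / ||u|| becomes 0/0 = NaN when u is the zero vector.
    norm = u.norm(dim=-1, keepdim=True)
    return (norm ** 2 / (1.0 + norm ** 2)) * u / norm

def squash_safe(u, eps=1e-8):
    # A small epsilon under the square root keeps the division finite.
    norm_sq = (u ** 2).sum(dim=-1, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * u / torch.sqrt(norm_sq + eps)

zero = torch.zeros(1, 3)
unsafe_out = squash_unsafe(zero)  # contains NaN
safe_out = squash_safe(zero)      # stays zero
```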
