CapsNet or Capsule Networks

Sabour et al. 2017: "A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or object part."

Capsule Networks are built from these capsules and have demonstrated state-of-the-art performance on the MNIST handwritten digit recognition dataset. They improve on convolutional neural networks (CNNs) for the following two reasons:

  • They preserve the spatial relationships between features, whereas convolutional neural networks lose this information during subsampling/pooling.
  • They are more robust to geometric transformations of the learned features, such as rotation and translation, and hence generalise better.

In a Capsule Network, several layers are nested inside a single layer, which is called a capsule. Capsule networks differ from convolutional neural networks in (1) how they are connected, i.e. the architecture of the network, and (2) how information is routed through the network. Both are briefly described below.

Squashing - Architecture of the network and their activation

In a CNN, activation functions such as ReLU (rectified linear unit), tanh (hyperbolic tangent) and sigmoid are used to decide which of the layers are important enough to be "activated". That is, an activation function is applied to each layer of the CNN to determine whether, for a particular feature in the input image, that layer makes an important contribution to the next layer. As the network becomes deeper, it becomes computationally expensive to keep all the information about each layer, so an activation function that keeps only the main contributors is desirable. In these cases, ReLU has been shown to be more efficient than tanh or sigmoid because it is computationally less expensive.
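To make the comparison concrete, here is a minimal NumPy sketch of the three activation functions named above. The helper names `relu` and `sigmoid` and the sample input are illustrative assumptions, not part of any CapsNet implementation.

```python
import numpy as np

def relu(x):
    # Keeps positive values and zeroes out the rest: one comparison per element.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Maps every value into (0, 1); needs an exponential per element.
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])  # hypothetical pre-activations

relu_out = relu(x)        # [0.0, 0.0, 0.0, 0.5, 2.0]
tanh_out = np.tanh(x)     # values in (-1, 1)
sigmoid_out = sigmoid(x)  # values in (0, 1)
```

ReLU needs only a comparison per element, while tanh and sigmoid each evaluate exponentials, which is one way to read the efficiency claim above.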

In a CapsNet, a number of layers are grouped together into a capsule. Instead of applying a non-linear function to each layer inside the capsule, a "squashing" function is applied to the output of the capsule.
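This README does not spell out the squashing function. The sketch below assumes the form given in Sabour et al. 2017, v = (||s||² / (1 + ||s||²)) · (s / ||s||), which shrinks short vectors towards zero length and long vectors to a length just below 1 while keeping their direction. The function name `squash` and the small `eps` term for numerical safety are assumptions of this example.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squared length of each capsule's output vector.
    sq_norm = np.sum(np.square(s), axis=axis, keepdims=True)
    # Scale factor in [0, 1): near 0 for short vectors, near 1 for long ones.
    scale = sq_norm / (1.0 + sq_norm)
    # Unit direction times the scale; eps avoids division by zero.
    return scale * s / np.sqrt(sq_norm + eps)

short_vec = squash(np.array([0.1, 0.0]))   # stays very short
long_vec = squash(np.array([30.0, 40.0]))  # length close to, but below, 1

short_len = np.linalg.norm(short_vec)
long_len = np.linalg.norm(long_vec)
```

Because the output length lies between 0 and 1, it can be read as the probability that the entity represented by the capsule is present, while the direction of the vector carries the instantiation parameters.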

Routing:

In a CNN, max pooling determines the flow of information between the layers. As discussed above, pooling loses information about the spatial relationships between the features in an image, because each output of a max pooling layer is a scalar quantity.
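A small illustration of that loss, as a sketch: the hypothetical helper `max_pool_2x2` below applies non-overlapping 2x2 max pooling (it assumes even height and width) to two toy feature maps whose strong activations sit at different positions.

```python
import numpy as np

def max_pool_2x2(image):
    # Split the image into non-overlapping 2x2 blocks and keep each block's maximum.
    h, w = image.shape
    blocks = image.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Two toy feature maps: same activations, different positions inside each block.
a = np.array([[9, 0, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 7]])
b = np.array([[0, 0, 0, 0],
              [0, 9, 0, 0],
              [0, 0, 7, 0],
              [0, 0, 0, 0]])

pooled_a = max_pool_2x2(a)  # [[9, 0], [0, 7]]
pooled_b = max_pool_2x2(b)  # [[9, 0], [0, 7]]
```

Both inputs pool to the same output, so the next layer cannot tell where inside each block the feature was found.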

In a CapsNet, the output of each capsule is a vector quantity, which preserves the spatial relationships between the features in an image. A new routing mechanism has been proposed to handle the flow of information between the capsules: the information from each capsule is routed to the most similar capsule in the next layer. A simple analogy is a hierarchical tree of the layers.
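The routing step can be sketched as follows, assuming the mechanism meant here is the routing-by-agreement procedure of Sabour et al. 2017. The prediction vectors `u_hat` are hypothetical hand-picked values (in the paper they come from learned weight matrices), and the names `route`, `softmax` and `squash` are choices made for this example.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    sq_norm = np.sum(np.square(s), axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, num_iters=3):
    # u_hat[i, j] is input capsule i's prediction for output capsule j.
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))            # routing logits, start uniform
    for _ in range(num_iters):
        c = softmax(b, axis=1)                 # each input splits its vote over outputs
        s = np.einsum("ij,ijd->jd", c, u_hat)  # weighted sum of predictions per output
        v = squash(s)                          # output capsule vectors
        b = b + np.einsum("ijd,jd->ij", u_hat, v)  # reward predictions that agree with v
    return v, c

# Three input capsules, two output capsules, 2-D vectors (hypothetical values).
# All inputs agree about output 0 and disagree about output 1.
u_hat = np.array([[[2.0, 0.0], [1.0, 0.0]],
                  [[2.0, 0.0], [-1.0, 0.0]],
                  [[2.0, 0.0], [0.0, 1.0]]])

v, c = route(u_hat)
```

With these values the coupling coefficients `c[:, 0]` grow towards 1 over the iterations, so each input capsule sends most of its output to the capsule whose result agrees with its own prediction.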

Links for CapsNet implementation

Links for some slides and papers

Links for descriptive websites

Links for discussion forums

Links for informative videos

Links for CapsNet applications
