LiSHT

This repository contains a Keras implementation of the paper "LiSHT: Non-Parametric Linearly Scaled Hyperbolic Tangent Activation Function for Neural Networks" (arXiv:1901.05894).

Roy, S. K., Manna, S., Dubey, S. R., and Chaudhuri, B. B., 2019. LiSHT: Non-Parametric Linearly Scaled Hyperbolic Tangent Activation Function for Neural Networks. arXiv preprint arXiv:1901.05894.

Description

The activation function is one of the key components of a neural network: it enables deep training by introducing non-linearity into the learning process. However, because of zero-hard rectification, some existing activation functions such as ReLU and Swish fail to utilize negative input values and may suffer from the dying gradient problem. It is therefore important to look for an activation function that is free from these problems. As a remedy, the paper proposes a new non-parametric function, the Linearly Scaled Hyperbolic Tangent (LiSHT), for neural networks (NNs). LiSHT scales the non-linear hyperbolic tangent (Tanh) function by a linear function in an attempt to tackle the dying gradient problem.
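To make the dying gradient point concrete, here is a minimal sketch in plain Python (no Keras) that compares the derivative of ReLU with the derivative of x * tanh(x), the form LiSHT takes in the paper. The function names `relu_grad` and `lisht_grad` are illustrative and are not taken from this repository.

```python
import math


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def lisht_grad(x: float) -> float:
    """Derivative of x * tanh(x), i.e. tanh(x) + x * (1 - tanh(x)^2)."""
    t = math.tanh(x)
    return t + x * (1.0 - t * t)


# Print both derivatives on a few sample inputs, negative and positive.
for x in (-3.0, -1.0, -0.1, 0.1, 1.0, 3.0):
    print(f"x={x:+.1f}  relu'={relu_grad(x):.4f}  lisht'={lisht_grad(x):+.4f}")
```

For every negative input the ReLU derivative is exactly zero, so no gradient flows back through that unit, whereas the derivative of x * tanh(x) is non-zero for every x other than 0.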

What is a LiSHT?

Most neural networks work by interleaving linear projections and simple (fixed) activation functions, like the ReLU function, f(x) = max(0, x).

LiSHT is instead a non-parametric activation function that scales the Tanh function by its own input: f(x) = x · tanh(x).
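As a rough illustration, assuming the paper's definition LiSHT(x) = x · tanh(x), the sketch below applies ReLU and LiSHT to the output of a single linear projection. It uses only the Python standard library; the helper `dense` and the sample weights are hypothetical and are not part of this repository's Keras code.

```python
import math


def relu(x: float) -> float:
    """ReLU: passes positive inputs through and clamps the rest to zero."""
    return max(0.0, x)


def lisht(x: float) -> float:
    """LiSHT as defined in the paper: the input scaled by its own tanh."""
    return x * math.tanh(x)


def dense(inputs, weights, bias, activation):
    """One unit of a dense layer: activation(w . x + b)."""
    z = sum(w * v for w, v in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical inputs and weights, chosen so the pre-activation is negative.
x = [0.5, -1.2]
w = [1.0, 2.0]
b = 0.1

print("ReLU output: ", dense(x, w, b, relu))
print("LiSHT output:", dense(x, w, b, lisht))
```

With a negative pre-activation, ReLU outputs zero while LiSHT returns a positive value, because x · tanh(x) is symmetric about zero and never negative. In Keras, the same expression could presumably be written with backend tensor operations and passed as a custom activation callable, but that usage is an assumption here and is not shown in this README.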

Pre-Activation ResNet

Experimental Data

The effectiveness of the proposed LiSHT activation function is evaluated on six benchmark datasets: Car Evaluation, Iris, MNIST, CIFAR-10, CIFAR-100, and Twitter140.

Results

The classification performance of MLP for different activations over the Car Evaluation, Iris and MNIST datasets.

Dataset    Activation  Train. loss  Train. acc. (%)  Val. loss  Val. acc. (%)
Car Eval.  Tanh        0.0341       98.84            0.0989     96.40
Car Eval.  Sigmoid     0.0253       98.77            0.1110     96.24
Car Eval.  ReLU        0.0285       99.10            0.0769     97.40
Car Eval.  Swish       0.0270       99.13            0.0790     97.11
Car Eval.  LiSHT       0.0250       99.28            0.0663     97.98
Iris       Tanh        0.0937       97.46            0.0898     96.26
Iris       Sigmoid     0.0951       97.83            0.0913     96.23
Iris       ReLU        0.0983       98.33            0.0886     96.41
Iris       Swish       0.0953       98.50            0.0994     96.34
Iris       LiSHT       0.0926       98.67            0.0862     97.33
           Tanh        1.1534       58.86            1.3759     51.74
           Sigmoid     1.1319       59.51            1.3693     52.12
           ReLU        1.1776       57.49            1.3731     51.85
           Swish       1.1468       58.65            1.3705     51.83
           LiSHT       1.1216       59.13            1.3661     52.16
MNIST      Tanh        0.0138       99.56            0.0987     98.26
MNIST      Sigmoid     0.0064       99.60            0.0928     98.43
MNIST      ReLU        0.0192       99.51            0.1040     98.48
MNIST      Swish       0.0159       99.58            0.1048     98.45
MNIST      LiSHT       0.0127       99.68            0.0915     98.60

The classification accuracy (%) of ResNet for different activations over the MNIST and CIFAR-10/100 datasets.

Dataset    ResNet depth  Tanh   ReLU   Swish  LiSHT
MNIST      20            99.48  99.56  99.53  99.59
CIFAR-10   164           89.74  91.15  91.60  92.92
CIFAR-100  164           68.80  72.84  74.45  75.32

The classification accuracy (%) of LSTM for different activations over the Twitter140 dataset.

Dataset     Tanh   ReLU   Swish  LiSHT
Twitter140  82.27  82.47  82.22  82.47

Citation

If you use this code or a derivative thereof in your research, we would appreciate a citation to the original paper:

@article{roy2019lisht,
    title={LiSHT: Non-Parametric Linearly Scaled Hyperbolic Tangent Activation Function for Neural Networks},
    author={Roy, Swalpa Kumar and Manna, Suvojit and Dubey, Shiv Ram and Chaudhuri, Bidyut B},
    journal={arXiv preprint arXiv:1901.05894},
    year={2019}
}

License

The code is released under the MIT License. See the attached LICENSE file.
