
lsh_deeplearning's Introduction

Scalable and Sustainable Deep Learning via Randomized Hashing

Look for major updates around ICLR 2018 deadline this November!

Abstract

Current deep learning architectures are growing larger in order to learn from complex datasets. These architectures require giant matrix multiplication operations to train millions of parameters. Conversely, there is another growing trend to bring deep learning to low-power, embedded devices. The matrix operations, associated with the training and testing of deep networks, are very expensive from a computational and energy standpoint. We present a novel hashing-based technique to drastically reduce the amount of computation needed to train and test neural networks. Our approach combines two recent ideas, Adaptive Dropout and Randomized Hashing for Maximum Inner Product Search (MIPS), to select the nodes with the highest activation efficiently. Our new algorithm for deep learning reduces the overall computational cost of the forward and backward propagation steps by operating on significantly fewer nodes. As a consequence, our algorithm uses only 5% of the total multiplications, while keeping within 1% of the accuracy of the original model on average. A unique property of the proposed hashing-based back-propagation is that the updates are always sparse. Due to the sparse gradient updates, our algorithm is ideally suited for asynchronous, parallel training, leading to near-linear speedup, as the number of cores increases. We demonstrate the scalability and sustainability (energy efficiency) of our proposed algorithm via rigorous experimental evaluations on several datasets.
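The core retrieval step described above — hashing weight vectors and querying with the input to select high-activation nodes — can be sketched with sign random projections (SimHash). This is a minimal illustrative sketch, not code from this repository; all class and helper names are assumptions.

```java
import java.util.*;

// Minimal sketch: sign-random-projection (SimHash) LSH used to select the
// hidden units most likely to have large activations for a given input.
// Illustrative only; names and structure are not taken from this repository.
public class SrpNodeSelection {
    static final int DIM = 8;    // input dimension
    static final int BITS = 4;   // hash bits per table
    static final Random RNG = new Random(42);

    // Random hyperplanes shared by weight vectors and inputs (symmetric SRP).
    static final double[][] PLANES = new double[BITS][DIM];
    static {
        for (double[] p : PLANES)
            for (int d = 0; d < DIM; d++) p[d] = RNG.nextGaussian();
    }

    // SimHash: one bit per hyperplane, set when the dot product is positive.
    static int hash(double[] v) {
        int code = 0;
        for (int b = 0; b < BITS; b++) {
            double dot = 0;
            for (int d = 0; d < DIM; d++) dot += PLANES[b][d] * v[d];
            if (dot > 0) code |= (1 << b);
        }
        return code;
    }

    public static void main(String[] args) {
        // Hash every hidden unit's weight vector into a table: code -> node ids.
        int nodes = 32;
        double[][] weights = new double[nodes][DIM];
        Map<Integer, List<Integer>> table = new HashMap<>();
        for (int n = 0; n < nodes; n++) {
            for (int d = 0; d < DIM; d++) weights[n][d] = RNG.nextGaussian();
            table.computeIfAbsent(hash(weights[n]), k -> new ArrayList<>()).add(n);
        }

        // At query time, hash the input and retrieve only colliding nodes;
        // only these nodes participate in forward and backward propagation.
        double[] input = new double[DIM];
        for (int d = 0; d < DIM; d++) input[d] = RNG.nextGaussian();
        List<Integer> active = table.getOrDefault(hash(input), Collections.emptyList());
        System.out.println("active nodes: " + active.size() + " of " + nodes);
    }
}
```

Because an input only collides with weight vectors pointing in a similar direction, the retrieved set is a small fraction of all nodes, which is where the computational savings come from.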

Future Work

  1. Implement a general LSH framework for GPUs with API support for TensorFlow, PyTorch, and MXNet
  2. Build Scalable One-Shot Learning using Locality-Sensitive Hashing [https://github.com/RUSH-LAB/LSH_Memory]
  3. Demonstrate Tera-Scale machine learning on a single machine using our algorithm, tailored for sparse, high-dimensional datasets (See Netflix VectorFlow)

References

  1. Scalable and Sustainable Deep Learning via Randomized Hashing (KDD 2017 - Oral)
  2. Efficient Class of LSH-Based Samplers
  3. Learning to Remember Rare Events
  4. Netflix VectorFlow

lsh_deeplearning's People

Contributors

jeff-shi, rdspring1


lsh_deeplearning's Issues

Asymmetric LSH in the code?

Hello,

I was trying to understand your code after reading the paper. I noticed that the paper says asymmetric LSH functions are used (Sec. 5.1). However, in the code, random projection is used when building the hash tables, and when retrieving nodes for forward propagation, the hash values of the data are used directly for near-neighbor search to find the "near" nodes (i.e., weights). I guess the following code does this job.

    // Query all m_L hash tables with the input's hash codes, merge the
    // retrieved node ids into a frequency histogram, and return the nodes
    // whose collision counts pass the threshold.
    public Set<Integer> histogramLSH(int[] hashes)
    {
        assert(hashes.length == m_L);

        Histogram hist = new Histogram();
        for (int idx = 0; idx < m_L; ++idx)
        {
            // Collect the nodes stored in this table's matching bucket
            if (m_Tables.get(idx).containsKey(hashes[idx]))
            {
                hist.add(m_Tables.get(idx).get(hashes[idx]));
            }
        }
        // Keep only the most frequently colliding nodes, up to the size limit
        return hist.thresholdSet(m_nn_sizeLimit);
    }

So, my question is: where is the asymmetric LSH function actually used? Reference [38] mentions that some parameters need to be set for asymmetric LSH, such as m=3, U=0.83, r=2.5, but I can't find where these are used or set in this code.
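For context, the asymmetric transforms the parameters m and U belong to (the L2-ALSH scheme of reference [38]) look roughly as follows. This is an illustrative sketch of that published scheme, not code from this repository; after applying P to the stored vectors and Q to the queries, ordinary L2 LSH solves maximum inner product search.

```java
import java.util.*;

// Sketch of the L2-ALSH preprocessing transforms (the scheme that the
// parameters m and U configure). Illustrative only; not from this repo.
public class AlshTransforms {
    static final int M_TERMS = 3;     // m: number of appended norm powers
    static final double U = 0.83;     // shrink factor so all norms stay below 1

    // P(x): scale x so ||x|| <= U, then append ||x||^2, ||x||^4, ..., ||x||^(2^m)
    static double[] preprocessItem(double[] x, double maxNorm) {
        double[] out = Arrays.copyOf(x, x.length + M_TERMS);
        double scale = U / maxNorm;
        for (int i = 0; i < x.length; i++) out[i] = x[i] * scale;
        double normSq = 0;
        for (int i = 0; i < x.length; i++) normSq += out[i] * out[i];
        double power = normSq;                // ||Ux/M||^2
        for (int j = 0; j < M_TERMS; j++) {
            out[x.length + j] = power;
            power = power * power;            // square to get the next term
        }
        return out;
    }

    // Q(q): normalize the query and append m constant 1/2 entries
    static double[] preprocessQuery(double[] q) {
        double[] out = Arrays.copyOf(q, q.length + M_TERMS);
        double norm = 0;
        for (double v : q) norm += v * v;
        norm = Math.sqrt(norm);
        for (int i = 0; i < q.length; i++) out[i] = q[i] / norm;
        Arrays.fill(out, q.length, out.length, 0.5);
        return out;
    }

    public static void main(String[] args) {
        double[] p = preprocessItem(new double[]{0.6, 0.8}, 1.0);
        double[] q = preprocessQuery(new double[]{1.0, 0.0});
        System.out.println(Arrays.toString(p));
        System.out.println(Arrays.toString(q));
    }
}
```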

Thanks,

Python Implementations?

I came across this repo after reading your paper. Are you aware of any Python-based implementations?

Batched Computation

Hi Ryan,

The LSH-based forward and backward computation is a great idea, and your paper makes a nice contribution!

However, you have to sample from the LSH buckets (i.e., different weight vectors w_i) for each input, because different inputs may retrieve different weight vectors. Is the implementation here batched? Do you have any suggestions on how to implement your algorithm in a batched setting?

Allen
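One common way to frame the batching question above: since each input selects its own active set, a batch can take the union of the per-example active sets, run one dense sub-matrix multiply over that union, and mask each row back down to its own set. The sketch below illustrates only the set bookkeeping with a stand-in lookup; all names are hypothetical and none of this is from the repository.

```java
import java.util.*;

// Hypothetical sketch of one batching strategy for LSH-based forward
// propagation: union the per-example active node sets, then run a single
// dense sub-GEMM over the union. Names are illustrative, not from this repo.
public class BatchedLshForward {
    // Stand-in for the per-input LSH bucket lookup: here just a fixed mapping.
    static Set<Integer> retrieveActiveNodes(int exampleId) {
        // e.g. example 0 hits nodes {1, 3}, example 1 hits {3}, example 2 {3, 5}
        return new TreeSet<>(Arrays.asList(1 + 2 * (exampleId % 3), 3));
    }

    public static void main(String[] args) {
        int batchSize = 4;
        List<Set<Integer>> perExample = new ArrayList<>();
        Set<Integer> union = new TreeSet<>();
        for (int i = 0; i < batchSize; i++) {
            Set<Integer> active = retrieveActiveNodes(i);
            perExample.add(active);
            union.addAll(active);   // union defines the columns of the sub-GEMM
        }
        System.out.println("union of active nodes: " + union);
        // A dense multiply restricted to W[:, union] would follow; row i of the
        // result is then masked down to perExample.get(i) before the nonlinearity.
    }
}
```

The trade-off is that the union grows with batch size, so the savings shrink for large batches; keeping batches small (or grouping inputs with similar hash codes) preserves more of the sparsity.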
