CUDA Implementation of t-SNE

Introduction

Disclaimer: This project is a work in progress and is not yet stable.

Wrapper for a CUDA implementation of Barnes-Hut t-Distributed Stochastic Neighbor Embedding (t-SNE). t-SNE is a technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets.

This package exposes a CUDA extension of the C++ implementation of t-SNE with the Barnes-Hut algorithm written by Laurens van der Maaten. The CUDA extension was developed by George Dimitriadis.

The Barnes-Hut t-SNE algorithm runs in two parts.

  • First part: a data structure for nearest-neighbour search is built and used to compute the input probabilities. This computation is independent for each point in the dataset, so it can run in parallel. It is the most time-consuming part, and it is where the GPU is used to reduce execution time. This part calls Python with Accelerate to handle the CUDA interfacing.

  • Second part: the embedding is optimized using gradient descent. This part runs in C++ on a single core.
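To make the first part more concrete, here is a minimal CPU-only sketch of the per-point probability computation. It is not the package's code: it swaps the tree-based neighbour search for a brute-force search, and the function names (`nearest_neighbours`, `calibrate_row`, `input_similarities`) are hypothetical. It assumes the convention from the Barnes-Hut t-SNE paper of keeping 3 × perplexity neighbours per point.

```python
import math

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_neighbours(X, i, k):
    """Brute-force k nearest neighbours of point i as (squared distance, index) pairs.
    Assumption: stands in for the tree-based search used by Barnes-Hut t-SNE."""
    dists = [(squared_distance(X[i], X[j]), j) for j in range(len(X)) if j != i]
    return sorted(dists)[:k]

def calibrate_row(d2, perplexity, tol=1e-5, max_iter=200):
    """Binary-search the Gaussian precision beta so that the entropy of the
    row of probabilities equals log(perplexity)."""
    target = math.log(perplexity)
    beta, lo, hi = 1.0, 0.0, math.inf
    d_min = min(d2)  # shifting by the minimum leaves the normalised row unchanged
    for _ in range(max_iter):
        w = [math.exp(-beta * (d - d_min)) for d in d2]
        total = sum(w)
        p = [x / total for x in w]
        entropy = -sum(x * math.log(x) for x in p if x > 0.0)
        if abs(entropy - target) < tol:
            break
        if entropy > target:   # row too flat: increase the precision
            lo = beta
            beta = beta * 2.0 if hi == math.inf else 0.5 * (beta + hi)
        else:                  # row too peaked: decrease the precision
            hi = beta
            beta = 0.5 * (beta + lo)
    return p

def input_similarities(X, perplexity):
    """Return, for each point, a list of (neighbour index, probability) pairs."""
    k = int(3 * perplexity)
    rows = []
    for i in range(len(X)):  # each iteration is independent of the others
        neigh = nearest_neighbours(X, i, k)
        probs = calibrate_row([d for d, _ in neigh], perplexity)
        rows.append([(j, p) for (_, j), p in zip(neigh, probs)])
    return rows
```

No iteration of the loop in `input_similarities` reads the result of another, which is the property that makes this step a good fit for a GPU.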

Benchmark

The Street View House Numbers (SVHN) dataset was used for the benchmark. It contains 73,257 images (32x32 RGB) obtained from house numbers in Google Street View images. A convolutional network was used to extract the input features for the t-SNE algorithm. The code used can be found here.

The benchmark compares the Rtsne package, which wraps the original C++ implementation of BH t-SNE, with the cuda.tsne package.

The following machines have been used for the benchmark:

  • p2.xlarge with Intel Xeon E5-2686 v4 (Broadwell) processor and NVIDIA K80 GPU (2,496 parallel processing cores and 12GiB of GPU memory).

  • m4.large with Intel Xeon E5-2686 v4 (Broadwell).

Step                  Rtsne       cuda.tsne
Building tree         3370 sec    237 sec
Learning embedding    1315 sec    1604 sec
Total                 4685 sec    1841 sec
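As a quick reading aid, the speed-up per step can be derived from the timings in the table above. The snippet below only does that arithmetic; the `timings` dictionary and the `speedup` helper are illustrative names, and the ratios are only as comparable as the two benchmark machines are.

```python
# Timings in seconds, copied from the benchmark table.
timings = {
    "Building tree": {"Rtsne": 3370, "cuda.tsne": 237},
    "Learning embedding": {"Rtsne": 1315, "cuda.tsne": 1604},
    "Total": {"Rtsne": 4685, "cuda.tsne": 1841},
}

def speedup(step):
    """Ratio of Rtsne time to cuda.tsne time (values above 1 favour cuda.tsne)."""
    t = timings[step]
    return t["Rtsne"] / t["cuda.tsne"]

for step in timings:
    print(f"{step}: {speedup(step):.2f}x")
# Building tree: 14.22x
# Learning embedding: 0.82x
# Total: 2.54x
```

The tree-building step accounts for roughly 72% of the Rtsne run (3370 of 4685 seconds), which is why speeding up that step alone cuts the total time by more than half.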

The following parameters were used: number of dimensions=2, perplexity=50, theta=0.5, eta=200, exaggeration=12 and iterations=1000.
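To show where several of these parameters enter the optimization, here is a small sketch of t-SNE gradient descent in pure Python. It is an illustration under stated assumptions, not the package's implementation: it uses the exact gradient (the case that Barnes-Hut approximates when theta > 0), plain gradient descent without momentum, and a hypothetical `stop_exaggeration` iteration count. `P` is assumed to be a symmetric matrix with a zero diagonal that sums to 1.

```python
import math
import random

def affinities(Y):
    """Unnormalised Student-t affinities w_ij in the embedding, and their sum Z."""
    n = len(Y)
    W = [[0.0 if i == j else
          1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(Y[i], Y[j])))
          for j in range(n)] for i in range(n)]
    return W, sum(map(sum, W))

def kl_divergence(P, Y):
    """KL(P || Q), the cost that t-SNE minimises."""
    W, Z = affinities(Y)
    return sum(p * math.log(p * Z / W[i][j])
               for i, row in enumerate(P) for j, p in enumerate(row) if p > 0.0)

def gradient(P, Y, exaggeration=1.0):
    """Exact gradient of the cost; the input probabilities are scaled by `exaggeration`."""
    W, Z = affinities(Y)
    n, dims = len(Y), len(Y[0])
    grad = [[0.0] * dims for _ in range(n)]
    for i in range(n):
        for j in range(n):
            c = 4.0 * (exaggeration * P[i][j] - W[i][j] / Z) * W[i][j]
            for d in range(dims):
                grad[i][d] += c * (Y[i][d] - Y[j][d])
    return grad

def optimise(P, dims=2, eta=200.0, exaggeration=12.0, iterations=1000,
             stop_exaggeration=250, seed=0):
    """Plain gradient descent; `stop_exaggeration` is an assumed default."""
    rng = random.Random(seed)
    Y = [[rng.gauss(0.0, 1e-4) for _ in range(dims)] for _ in range(len(P))]
    for it in range(iterations):
        # Early exaggeration: inflate P during the first iterations only.
        factor = exaggeration if it < stop_exaggeration else 1.0
        grad = gradient(P, Y, factor)
        Y = [[y - eta * g for y, g in zip(yi, gi)] for yi, gi in zip(Y, grad)]
    return Y
```

In this sketch `dims` sets the size of the embedding, `eta` is the learning rate, `exaggeration` multiplies the input probabilities during the early iterations, and `iterations` bounds the loop. A learning rate of 200 suits datasets with many points, where individual probabilities are tiny; on a toy matrix with a handful of points a much smaller `eta` is needed for the plain update to stay stable.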

Installation

Ubuntu 16.04

The machine must have CUDA 8 installed along with the NVIDIA drivers.

Download and launch the Anaconda installer. Note that the Python 3 version (Anaconda3) is required.

$ wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh
$ bash Anaconda3-5.0.1-Linux-x86_64.sh

Create a virtual environment.

$ conda create --name my_env python=3
$ source activate my_env

Install Accelerate, then install cuda.tsne from GitHub (this requires the devtools R package).

$ conda install accelerate
$ R -e "devtools::install_github('edoffagne/cuda.tsne')"
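Before or after running these steps, it can help to confirm that the required tools are visible on the `PATH`. The following is a small, optional sketch using only the Python standard library; the helper names (`find_tools`, `parse_release`, `cuda_release`) are hypothetical and not part of cuda.tsne. It assumes that `nvcc --version` prints a line containing `release X.Y`.

```python
import re
import shutil
import subprocess

def find_tools(names=("nvcc", "nvidia-smi", "conda", "R")):
    """Map each tool name to its path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}

def parse_release(text):
    """Extract the 'X.Y' release number from `nvcc --version` output, if present."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None

def cuda_release():
    """Return the CUDA toolkit release reported by nvcc, or None if nvcc is missing."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_release(result.stdout)

if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found'}")
    print("CUDA release:", cuda_release() or "unknown")
```

A missing `nvcc` or a release other than 8.0 suggests checking the CUDA installation before installing the package.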

References

  • van der Maaten, L.J.P., 2014. Accelerating t-SNE using Tree-Based Algorithms. Journal of Machine Learning Research, 15, pp.3221-3245.

  • van der Maaten, L.J.P. & Hinton, G.E., 2008. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research, 9, pp.2579-2605.
