
tsne_visual's Introduction

tsne_visual (version 0.2) - for OSX and Linux

A Python 3 package for running, visualizing, and producing animations of t-distributed Stochastic Neighbor Embedding (t-SNE), implemented in C++. The code is a modified version of bhtsne, taken from Laurens van der Maaten's repository. The package implements t-SNE as a class following the scikit-learn syntax.

Installing and running (in a few steps):

  • Clone or download this repository
  • Open file
  • Compile C++ code and install the package with the following commands:

Installing

I suggest you install the code using pip from an Anaconda Python 3 environment. From that environment:

git clone https://github.com/alexandreday/tsne_visual.git
cd tsne_visual
g++ cpp/sptree.cpp cpp/tsne.cpp -o tsne_visual/bh_tsne -O2
pip install .

That's it, you're good to go! You can now import tsne_visual from anywhere. See the following example for a quick start.

Example script for MNIST:

For an example, see example/example.py, which applies t-SNE to the MNIST data set (provided in example/MNIST/). The syntax is very similar to scikit-learn's. It should produce a scatter plot of the two-dimensional embedding.
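The sketch below illustrates the scikit-learn-style workflow that the example follows: PCA first, then Barnes-Hut t-SNE. It is an assumption-laden stand-in, not the package's own API. It uses scikit-learn's PCA and TSNE classes rather than the tsne_visual class, the function name `embed` is hypothetical, and the default values (40 PCA components, perplexity 30, theta 0.5) are taken from the run log quoted in the issue further down this page.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE  # stand-in: NOT the tsne_visual class


def embed(X, n_pca=40, perplexity=30.0, theta=0.5, seed=0):
    """Reduce X with PCA, then embed it in 2D with Barnes-Hut t-SNE.

    `embed` is a hypothetical helper written for this illustration.
    """
    # PCA first, to drop low-variance dimensions before running t-SNE.
    X_pca = PCA(n_components=n_pca, random_state=seed).fit_transform(X)
    # scikit-learn names the Barnes-Hut trade-off parameter `angle`;
    # the bhtsne log calls the same quantity `theta`.
    model = TSNE(
        n_components=2,
        perplexity=perplexity,
        angle=theta,
        method="barnes_hut",
        init="random",
        random_state=seed,
    )
    return model.fit_transform(X_pca)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 64))  # synthetic stand-in for MNIST features
    Y = embed(X)
    print(Y.shape)
```

The returned array has one row per input sample and two columns, which is the shape the plotting step in example/example.py indexes as `xtsne[pos,0]` and `xtsne[pos,1]`.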

Requirements:

  • g++ compiler (for the C++ code)
  • Python3.x
  • ffmpeg software (optional - for producing animations)
  • scikit-learn package
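Before compiling, it can help to confirm that the requirements above are present. The following is a minimal sketch using only the Python standard library. The function name `check_requirements` is hypothetical, and it assumes the tools are found on PATH under the names `g++` and `ffmpeg`.

```python
import importlib.util
import shutil
import sys


def check_requirements(required=("g++",), optional=("ffmpeg",)):
    """Return (missing_required, missing_optional) executables not on PATH."""
    missing_required = [t for t in required if shutil.which(t) is None]
    missing_optional = [t for t in optional if shutil.which(t) is None]
    return missing_required, missing_optional


def has_sklearn():
    """Return True if the scikit-learn package can be imported."""
    return importlib.util.find_spec("sklearn") is not None


if __name__ == "__main__":
    if sys.version_info < (3,):
        raise SystemExit("Python 3.x is required")
    missing_required, missing_optional = check_requirements()
    for tool in missing_required:
        print(f"missing (required): {tool}")
    for tool in missing_optional:
        print(f"missing (optional, only needed for animations): {tool}")
    if not has_sklearn():
        print("missing (required): scikit-learn")
```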

Why this package?

During the course of a research project I ended up using t-SNE quite a bit for large datasets (N > 20000). I wanted something easy to use (i.e. written in Python, with easy access to all of the t-SNE parameters) but also very fast (i.e. with C/C++ speed). I also wanted to produce animations of the t-SNE embedding as a function of the iterations. To achieve all this I ended up combining code from multiple sources and writing a bit of code myself. I thought this might be useful for other people too.

Uninstalling

pip3 uninstall tsne_visual

tsne_visual's Issues

bool index error happens

(base) C:\Users\ad204\OneDrive\Documents\tsne_visual\example>python example.py
--> Running t-SNE on MNIST (with n=10000), then plotting the result
--> First doing some PCA (n_components=40) to clear irrelevant dimensions
-----------> Starting t-SNE <------------
DONE
Gathering data from: Read the 10000 x 40 data matrix successfully!
Using no_dims = 2, perplexity = 30.000000, theta = 0.500000, n_iter = 1000
Computing input similarities...
switch is wrong - ".tmp_dim".
C:\Users\ad204\OneDrive\Documents\tsne_visual\example\KL_score_dim not found.
C:\Users\ad204\OneDrive\Documents\tsne_visual\example\tSNE_dim not found.
Finally let's plot the results
Traceback (most recent call last):
File "example.py", line 36, in <module>
plt.scatter(xtsne[pos,0], xtsne[pos,1],
IndexError: boolean index did not match indexed array along dimension 0; dimension is 10035 but corresponding boolean dimension is 10000

Is it related to the NumPy version?
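Recent NumPy versions raise this IndexError whenever a boolean mask and the array it indexes differ in length along the indexed axis, so the message points to a row-count mismatch between the loaded embedding and the labels. The log also reports that the output files were not found, so the array that reached the plot may not be the result of this run. That is a guess, not a confirmed diagnosis. A minimal sketch of a guard that reports the mismatch before plotting follows. The helper name `check_embedding_rows` is hypothetical and not part of tsne_visual.

```python
import numpy as np


def check_embedding_rows(embedding, labels):
    """Raise a descriptive error if embedding and labels disagree on row count."""
    n_emb = embedding.shape[0]
    n_lab = labels.shape[0]
    if n_emb != n_lab:
        raise ValueError(
            f"embedding has {n_emb} rows but there are {n_lab} labels; "
            "check that the t-SNE run completed and wrote its output"
        )


if __name__ == "__main__":
    # Shapes taken from the traceback above: 10035 embedding rows, 10000 labels.
    xtsne = np.zeros((10035, 2))
    y = np.zeros(10000, dtype=int)
    try:
        check_embedding_rows(xtsne, y)
    except ValueError as err:
        print(err)
```

Calling the guard right after loading the embedding, and before building any boolean mask such as `pos`, turns the indexing failure into a message that names both sizes.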
