
spark-ml-streaming

Join the chat at https://gitter.im/freeman-lab/spark-ml-streaming

Visualize streaming machine learning in Spark


About

This Python app generates data, analyzes it in Spark Streaming, and visualizes the results with Lightning. The analyses use streaming machine learning algorithms included with Spark as of version 1.2. The demos are designed for local use, but the same algorithms can run at scale on a cluster with millions of records.

How to use

To run these demos, you need:

  • A working installation of Spark
  • A running Lightning server
  • An installation of Python with standard scientific computing libraries (NumPy, SciPy, scikit-learn)

With those three things in place, install using:

pip install spark-ml-streaming

Then set SPARK_HOME to your Spark installation, and start an executable:

streaming-kmeans -l <lightning_host>

Here, lightning_host is the address of your Lightning server. After the executable starts, your browser will open, and you should see data appear shortly.

Try running with different settings. For example, to run a 1-d version with 4 clusters and a half-life of 10 points:

streaming-kmeans -p <temporary_path> -l <lightning_host> -nc 4 -nd 1 -hl 10 -tu points

Here, temporary_path is the directory where data will be written and read. If it is not specified, the current temporary directory is used (see Python's tempfile.gettempdir()).
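The fallback described above can be sketched in a few lines of Python. This is only an illustration of the behavior; resolve_data_path is a hypothetical helper, not a function from the package.

```python
import tempfile


def resolve_data_path(path=None):
    """Return the given path, or the system temporary directory if none is given.

    Hypothetical helper: it mirrors the fallback described in the text,
    not the package's actual implementation.
    """
    return path if path is not None else tempfile.gettempdir()


if __name__ == "__main__":
    # Shows which directory would be used when -p is omitted.
    print(resolve_data_path())
```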

2D data will make a scatter plot and 1D data will make a line plot. You can set the number of dimensions with -nd.
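To give a feel for what the half-life setting does, here is a minimal pure-Python sketch of a decayed update for a single 1-d cluster center. It assumes the update rule documented for Spark's streaming k-means (old weight is discounted, then merged with the new batch) and simplifies it to one cluster; the function names are hypothetical, and the demo itself runs the Spark implementation, not this code.

```python
import math


def decay_factor(half_life):
    """Per-unit decay chosen so a weight halves after `half_life` units."""
    return math.exp(math.log(0.5) / half_life)


def update_center(center, weight, batch, half_life, time_unit="points"):
    """Apply one decayed update to a single 1-d cluster center.

    center    -- current center value
    weight    -- accumulated weight of the data seen so far
    batch     -- list of new points assigned to this cluster
    time_unit -- "points" decays once per new point, "batches" once per batch

    Simplifying assumption: only one cluster exists, so every new point
    counts toward the "points" discount.
    """
    a = decay_factor(half_life)
    m = len(batch)
    discount = a ** m if time_unit == "points" else a
    old_weight = weight * discount
    new_weight = old_weight + m
    if m == 0 or new_weight == 0:
        # Nothing to merge: the center stays put, only the weight decays.
        return center, old_weight
    batch_mean = sum(batch) / m
    new_center = (center * old_weight + batch_mean * m) / new_weight
    return new_center, new_weight


if __name__ == "__main__":
    # Ten new points at 1.0 arrive; with a half-life of 10 points the
    # old weight of 10 is halved before the batch is merged in.
    print(update_center(0.0, 10.0, [1.0] * 10, half_life=10))
```

A shorter half-life makes the centers track recent data more quickly, while a longer one gives past data more influence.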

To see all options type:

streaming-kmeans -h
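If you want to launch the demo from a script, the command line can be assembled programmatically. The sketch below uses only the flags shown above (-l, -p, -nc, -nd, -hl, -tu); build_command and spark_home_is_set are hypothetical helpers, and the host address in the example is a placeholder.

```python
import os


def build_command(lightning_host, path=None, n_clusters=None,
                  n_dims=None, half_life=None, time_unit=None):
    """Build the argument list for the streaming-kmeans executable.

    Only options that are explicitly given are added, so the
    executable's own defaults apply to everything else.
    """
    cmd = ["streaming-kmeans", "-l", lightning_host]
    optional = [("-p", path), ("-nc", n_clusters), ("-nd", n_dims),
                ("-hl", half_life), ("-tu", time_unit)]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def spark_home_is_set(env=None):
    """Check that SPARK_HOME is defined and non-empty."""
    env = os.environ if env is None else env
    return bool(env.get("SPARK_HOME"))


if __name__ == "__main__":
    # Placeholder host; replace it with the address of your Lightning server.
    cmd = build_command("http://localhost:3000", n_clusters=4, n_dims=1,
                        half_life=10, time_unit="points")
    print(" ".join(cmd))
    if not spark_home_is_set():
        print("SPARK_HOME is not set; set it before starting the demo.")
```

The resulting list can be passed to subprocess.call once SPARK_HOME is set and the Lightning server is running.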

Build

The demo relies on a Scala package included pre-built inside python/mlstreaming/lib. To rebuild it, use sbt:

cd scala
sbt package

spark-ml-streaming's People

Contributors

felixcheung, freeman-lab, gitter-badger
