Git Product home page Git Product logo

map_crowd_reduce's Introduction

Map Crowd Reduce

This project is a “SETI-at-home-like infrastructure for massively distributed cpu-intensive jobs based on HTML5 WebWorkers and node.js for distributing the tasks”. (quote from Pedro Teixeira’s blog).

This project made into the final SAPO Codebits 2010 programming contest, but unfortunately didn’t win :)

The code was built in an afternoon with the help of some friends. It was my first real-world JavaScript code, so there’s a lot to cleanup and improve :)

Quick Overview

The project consists of a node.js server that created a HTTP server on port 3000.

To start a new computation, you must open the main page, and supply 3 or 4 things:

A payload (optional)

You can supply a file payload that will be chuncked by the next function.

A split/segmenter JS function


  function s(input) {
    // optionally read the input and generate a list of segments to be mapped

    return [ ... ];
  }

This function will run on the server side, and will generate a list of jobs to map.

A map JS function


  function map(args) {
    // do heavy computation with args

    return result;
  }

This function will run on client side. Please generate as many as you want :)

A reduce JS function


  function reduce(arg) {
    // arg is [result, result, result, result]

    // reduce them :)
    return final_result;
  }

This function will run on the client side.

After you define the functions, the system gives you an URL that you can share with anyone wanting to contribute with CPU to the computation.

The client side functions are run on a WebWorker so it doesn’t block the main UI thread.

WebSockets are used for maintaining communication between server and clients.

Requirements / installation

On the client you need a modern browser (latest version of Firefox, Chrome and Webkit should work).

On the sever you will need latest node.js stable and npm, and the following node modules:

  • express
  • connect-form
  • uuid
  • haml
  • socket.io

Examples

With the code there are 2 prewritten examples:

  • find the number of primes between 0 and 60000 (brute force)

this uses a brute-force method of testing if each number is a prime, from 0 to 60000. it runs very fast and can be used to debug.

  • find a MD5 collision with the MD5 on all words with 5 or less alfa words

this lasts 15 minutes on a single machine. when I tested it on the main stage, 50 people connected their browsers at it lasted like 20 seconds (#win).

To run the examples you just open the root path of the web server and click on the button example. It will fill all the JS boxes with the prewritten code.

Limitations

The main limitation is how we handle and store the input and results. Right now, when you upload the payload, node.js loads it into memory to pass them to the s function.

Also, if your s function generates billion of jobs, the things could not work because all the data and arguments are kept in memory.

This also means that right now, if you close the server, you loose the current computation.

Security Considerations

There are two main security problems at the moment. First of all, the s function is run on the server side without any security check. I admit that I have no idea on if and how I could do that on node.js. Any hints (and patches) are appreciated :)

The second security problem is a feature :) At the end of the computation, the result of the reduce function is sent to all connected clients and injected in the DOM. This means that the reduce function can generate cool stuff like canvas with a beautiful fractal, but it also means that a malicious user can inject some nasty HTML on the client side.

Further work

As this was built as a quick hack, and by people with no experience on JS other that quick hacks on the browser, the code has a lot to improve. I’m specially looking into better modularization/encapsulation of the code.

As I’m writing this, some friends of mine are looking to build other cool examples, like building a distributed Mandelbrot fractal!

Also, I’m trying to get my feet wet with WebGL, because I think this could be improved by making some heavily number crunching on the client GPU :)

map_crowd_reduce's People

Contributors

microft avatar rubenfonseca avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.