Spark Recommender

Scalable recommendation system written in Scala using the Apache Spark framework.

Implemented algorithms include:

  • k-nearest neighbors
  • k-nearest neighbors with clustering
  • k-nearest neighbors with a cluster tree
  • Alternating Least Squares (ALS) from Spark's MLlib

This first version was created during the eClub Summer Camp 2014 at Czech Technical University.
See reportAndDocumentation.pdf for the benchmark results and documentation.

Build

Spark Recommender is built with Simple Build Tool (SBT). Run the command:

sbt assembly

This creates the JAR file in the target/scala-2.10/ directory.

Run

The application can be run using the spark-submit script.

cd target/scala-2.10/

$SPARK_HOME/bin/spark-submit --class Boot --master local --driver-memory 2G --executor-memory 6G SparkRecommender-assembly-0.1.jar (+ parameters of the recommender)

See the Spark documentation for details on the parameters of spark-submit.

Parameters of the recommender

  • Setting up API

    • --interface <arg> Network interface on which the API is served (default = localhost)
    • --port <arg> Port on which the API is served (default = 8080)
  • Setting the dataset

    • --data <arg> Type of dataset
    • --dir <arg> Directory containing files of dataset

    Supported datasets: movieLens, netflix, netflixInManyFiles

  • Setting the algorithm

    • --method <arg> Algorithm
    • -pkey=value [key=value]... Parameters for algorithm

    Provided algorithms: kNN, kMeansClusteredKnn, clusterTreeKnn, als

  • Other

    • --products <arg> Maximum number of recommended products (default = 10)
    • --help Shows help
    • --version Shows version

See the documentation for parameters of a particular algorithm.

Example

$SPARK_HOME/bin/spark-submit --class Boot --master local --driver-memory 2G \
--executor-memory 6G SparkRecommender-assembly-0.1.jar \
--data movieLens --dir /mnt/share/movieLens/ \
--method kNN -p numberOfNeighbors=5

For convenience, there is an example-run script that sets some defaults. When run with the netflix dataset, it expects the following files to be located in --dir:

  • ratings.txt
  • movie_titles.txt

./example-run --data netflix --dir /mnt/share/datasets/netflix \
 --method kNN -p numberOfNeighbors=5 --port 9090

API

Request

The API supports two operations:

  • Recommend from user ID

      host:port/recommend/fromuserid/?id=<userID, Int>
    

    Example:

      http://localhost:8080/recommend/fromuserid/?id=97
    
  • Recommend from ratings

       host:port/recommend/fromratings/?rating=<productID, Int>,<rating, Double>
    

    Example:

       http://localhost:8080/recommend/fromratings/?rating=98,4&rating=176,5&rating=616,5
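Both request forms can also be built programmatically. The following is a minimal Python sketch, assuming the service runs at the default localhost:8080 shown in the examples. The helper names from_user_id_url and from_ratings_url are hypothetical and not part of the project. The sketch only builds URLs and sends no requests.

```python
from urllib.parse import urlencode

# Assumption: the API was started with the default --interface and --port.
BASE_URL = "http://localhost:8080"


def from_user_id_url(user_id: int, base_url: str = BASE_URL) -> str:
    """Build the URL for the 'recommend from user ID' operation."""
    return f"{base_url}/recommend/fromuserid/?{urlencode({'id': user_id})}"


def from_ratings_url(ratings, base_url: str = BASE_URL) -> str:
    """Build the URL for the 'recommend from ratings' operation.

    ratings is an iterable of (product_id, rating) pairs. Each pair becomes
    one repeated 'rating=<productID>,<rating>' query parameter.
    """
    params = [("rating", f"{product_id},{rating}") for product_id, rating in ratings]
    # safe="," keeps the comma unescaped, matching the request format above.
    return f"{base_url}/recommend/fromratings/?{urlencode(params, safe=',')}"


print(from_user_id_url(97))
print(from_ratings_url([(98, 4), (176, 5), (616, 5)]))
```

With these inputs the two printed URLs are the same as the two example requests above.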
    

Response

The API returns the recommended products as JSON objects.

The JSON object for a single recommendation looks like this:

{
    "product" : <productID, Int>,
    "rating" : <predicted rating for this product, Double>,
    "name" : "<name of product>"
}

Example recommendation of three products:

{"recommendations":[
    {"product":312,"rating":5.0,"name":"High Fidelity (2000)"},
    {"product":494,"rating":5.0,"name":"Monty Python's The Meaning of Life: Special Edition (1983)"},
    {"product":516,"rating":4.0,"name":"Monsoon Wedding (2001)"}
]}
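A client can decode this response with any JSON library. The following is a minimal Python sketch that uses the example response above as its input. The Recommendation type and the parse_recommendations helper are hypothetical names chosen for illustration, not part of the project.

```python
import json
from typing import List, NamedTuple


class Recommendation(NamedTuple):
    """One recommended product, mirroring the fields of the JSON object."""
    product: int
    rating: float
    name: str


def parse_recommendations(body: str) -> List[Recommendation]:
    """Decode an API response body into a list of Recommendation tuples."""
    payload = json.loads(body)
    return [
        Recommendation(int(item["product"]), float(item["rating"]), item["name"])
        for item in payload["recommendations"]
    ]


# Example response copied from the text above.
body = """{"recommendations":[
    {"product":312,"rating":5.0,"name":"High Fidelity (2000)"},
    {"product":494,"rating":5.0,"name":"Monty Python's The Meaning of Life: Special Edition (1983)"},
    {"product":516,"rating":4.0,"name":"Monsoon Wedding (2001)"}
]}"""

for rec in parse_recommendations(body):
    print(rec.product, rec.rating, rec.name)
```

The sketch assumes every entry carries all three fields. A client that must tolerate missing fields would need extra checks.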
