SparrowRecSys

SparrowRecSys (short for Sparrow Recommendation System) is a movie recommendation system. The name comes from the saying "a sparrow may be small, but it has all its vital organs": the project is small but complete. It is a mixed-language project built with Maven, and it contains the different modules of a recommendation system, built with TensorFlow, Spark, and Jetty Server.

Environment

  • Java 8
  • Scala 2.11
  • Python 3.6+
  • TensorFlow 2.0+

Project data

The project data comes from the open-source movie data set MovieLens. The data set bundled with the project is a trimmed-down version of MovieLens that retains only 1,000 movies and the related comments and user data. To get the full data set, download it from the official MovieLens website; the MovieLens 20M Dataset is recommended.

SparrowRecSys technology

The SparrowRecSys technical architecture follows the classic industrial-grade deep learning recommendation system architecture. It includes modules for offline data processing, model training, near-line stream processing, online model serving, and front-end display of recommendation results. The architecture is divided into three main parts: data processing, model, and frontend.

  • Data Processing Section
    • User Information: User data includes user actions, social relationships, and attribute tags.
    • Item Information: Item data includes item attributes, tags, and third-party information.
    • Context Information: Contextual data includes time, location, and other contextual parameters.
    • Data Processing Platforms:
      • Flink: Used for real-time data processing.
      • Spark: Used for offline data processing.
      • Redis: Used for storing user, item, and context features.
    • Feature Engineering:
      • User Features: User actions, social relationships, attribute tags.
      • Item Features: Item attributes, tags, third-party information.
      • Context Features: Time, location, and other contextual parameters.
      • Techniques: Normalization, binarization, non-linear transformations, ID features, one-hot encoding, embedding, feature combination.
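
Three of the feature engineering techniques listed above (normalization, binarization, and one-hot encoding) can be illustrated with a minimal sketch in plain Python. The function names, the toy ratings, and the 3.5 threshold are hypothetical and chosen only for illustration; they are not taken from the SparrowRecSys code, which does this kind of processing in Spark.

```python
from typing import Dict, List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values linearly into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def binarize(values: Sequence[float], threshold: float) -> List[int]:
    """Map each value to 1 if it is at or above the threshold, else 0."""
    return [1 if v >= threshold else 0 for v in values]


def one_hot(categories: Sequence[str]) -> List[List[int]]:
    """One-hot encode categorical values using a sorted vocabulary."""
    vocab: Dict[str, int] = {c: i for i, c in enumerate(sorted(set(categories)))}
    encoded = []
    for c in categories:
        row = [0] * len(vocab)
        row[vocab[c]] = 1
        encoded.append(row)
    return encoded


# Hypothetical toy data: three ratings and one genre per movie.
ratings = [0.5, 2.75, 5.0]
genres = ["Drama", "Action", "Drama"]

normalized = min_max_normalize(ratings)   # values rescaled into [0, 1]
liked = binarize(ratings, threshold=3.5)  # 1 marks a "liked" rating
genre_vectors = one_hot(genres)           # columns follow the sorted vocabulary
```

Sorting the vocabulary keeps the one-hot column order stable between runs, which matters when the same encoding has to be reproduced at serving time.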

  • Model Part
    • Recommendation System Model and Online Serving:
      • Cold Start Strategy
      • Recall Layer: Embedding, collaborative filtering, multi-dimensional tags, social relationships, freshness update.
      • Ranking Layer: Temporal and sequential models, LR (Logistic Regression), FM (Factorization Machines), MLR (Mixed Logistic Regression), deep learning models.
      • Filling Strategy Algorithm: Diversity, novelty, popularity, flow control, freshness.
      • Exploration and Exploitation: Interaction with the candidate item database.
    • Model Serving:
      • MLeap: Model deployment.
      • TensorFlow Serving: Model serving.
    • Model Training:
      • Platforms: Spark MLlib, TensorFlow.
      • Offline evaluation: Metrics include AUC, Recall, RMSE.
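
The offline evaluation metrics named above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the project's evaluation code: the function names are hypothetical, AUC is computed by brute-force pair counting (fine for small samples only), and "Recall" is interpreted here as recall over the top k recommended items.

```python
import math
from typing import Sequence, Set


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall_at_k(recommended: Sequence[int], relevant: Set[int], k: int) -> float:
    """Share of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between true and predicted ratings."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy inputs.
auc_value = auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
recall_value = recall_at_k([10, 20, 30, 40], {20, 40, 50}, k=3)
rmse_value = rmse([4.0, 2.0], [3.0, 3.0])
```

AUC and recall evaluate ranking quality, while RMSE applies when the model predicts a rating value, so which metric is appropriate depends on how the model output is used.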

  • Frontend Part
    • Implementation: Based on HTML and JavaScript with AJAX functionalities.
    • Recommendation Item List: Display of recommended items.

Deep learning models implemented in SparrowRecSys

  • Word2vec (Item2vec)
  • DeepWalk (Random Walk based Graph Embedding)
  • Embedding MLP
  • Wide&Deep
  • Neural CF
  • Two Towers
  • DeepFM
  • DIN (Deep Interest Network)
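
To give a feel for the random-walk step behind DeepWalk in the list above, here is a minimal sketch in plain Python. It assumes that the item graph is built from consecutive items in user watch sequences and that each step is sampled in proportion to the transition count; both are assumptions of this sketch, and the function names and toy sequences are hypothetical. The resulting walks would then be fed to a Word2vec-style model to learn item embeddings, which is not shown here.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def build_transition_graph(sequences: Sequence[Sequence[int]]) -> Dict[int, Dict[int, int]]:
    """Count directed item-to-item transitions across all user sequences."""
    graph: Dict[int, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            graph[src][dst] += 1
    return graph


def random_walk(graph: Dict[int, Dict[int, int]], start: int,
                length: int, rng: random.Random) -> List[int]:
    """Walk up to `length` items, sampling each step in proportion to edge counts."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1])
        if not neighbors:
            break  # dead end: the current item has no outgoing transitions
        items = list(neighbors.keys())
        weights = list(neighbors.values())
        walk.append(rng.choices(items, weights=weights, k=1)[0])
    return walk


# Hypothetical watch sequences of movie IDs, one list per user.
sequences = [[1, 2, 3], [2, 3, 4], [1, 3, 4]]
graph = build_transition_graph(sequences)

rng = random.Random(42)  # fixed seed so the sketch is reproducible
walks = [random_walk(graph, start, 4, rng)
         for start in sorted(graph) for _ in range(2)]
```

Walks can end early when they reach an item with no outgoing transitions, so downstream code should not assume a fixed walk length.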

Contributors

birdviewhome, wzhe06, zcxia23, sunningilya7, bfgf52, gekfreeman, yiksanchan, dependabot[bot], sniperdarksider, happyvictorwu, zhengjxu, v-wx-v
