MovieLens Capstone Project

Completed for the Harvard Data Science Professional Certificate

Repo Contents

This repository includes the following items:

  • An R script
  • An R Markdown (.Rmd) version of the analysis
  • A knitted PDF version of the .Rmd file

General Info

The MovieLens recommendation site was launched in 1997 by GroupLens Research, a research lab at the University of Minnesota. Today, the MovieLens database is widely used for research and education. In total, there are approximately 11 million ratings and 8,500 movies. Users rate movies on a scale from half a star up to 5 stars.

The goal of this project is to train a machine learning algorithm on one subset of the data and use it to predict movie ratings in a held-out validation set. Prediction quality is measured by root mean squared error (RMSE). The key steps are:

  • Perform exploratory analysis on the data set to identify useful predictor variables

  • Generate a naive model to define a baseline RMSE and reference point for additional methods

  • Generate linear models using average movie ratings (movie effects) and average user ratings (user effects)

  • Use matrix factorization to achieve an RMSE below the desired threshold

  • Present results and report conclusions
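The naive baseline and the movie and user effect models from the steps above can be sketched as follows. This is a minimal illustration in Python, not the project's R code: the toy ratings, the helper `mean_residual_by`, and all variable names are invented for the example, and unseen users or movies are assumed to get an effect of zero.

```python
import math
from collections import defaultdict

# Toy (user, movie, rating) triples, invented purely for illustration.
train = [
    ("u1", "m1", 5.0), ("u1", "m2", 4.0), ("u1", "m3", 3.0),
    ("u2", "m1", 4.0), ("u2", "m2", 3.0),
    ("u3", "m1", 3.0), ("u3", "m3", 1.0),
]
test = [("u2", "m3", 2.0), ("u3", "m2", 2.5), ("u4", "m1", 4.0)]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_residual_by(rows, key_index, residual):
    """Average the residual of each row, grouped by user or movie id."""
    totals, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        totals[row[key_index]] += residual(row)
        counts[row[key_index]] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Naive model: predict the global mean rating for every (user, movie) pair.
mu = sum(r for _, _, r in train) / len(train)

# Movie effect b_i: average of (rating - mu) for each movie.
b_i = mean_residual_by(train, 1, lambda row: row[2] - mu)

# User effect b_u: average of (rating - mu - b_i) for each user.
b_u = mean_residual_by(train, 0, lambda row: row[2] - mu - b_i[row[1]])

actual = [r for _, _, r in test]
naive_pred = [mu for _ in test]
# Assumption: users or movies absent from the training set get effect 0.
movie_pred = [mu + b_i.get(m, 0.0) for _, m, _ in test]
movie_user_pred = [mu + b_i.get(m, 0.0) + b_u.get(u, 0.0) for u, m, _ in test]

results = {
    "naive": rmse(actual, naive_pred),
    "movie": rmse(actual, movie_pred),
    "movie + user": rmse(actual, movie_user_pred),
}
for name, value in results.items():
    print(f"{name:>12}: RMSE = {value:.5f}")
```

On this toy data each added effect lowers the test RMSE, which mirrors the role of the naive model as a reference point. The numbers it prints say nothing about the RMSEs reported for the actual MovieLens data.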

Results

The linear models did not reach a sufficiently low RMSE, but matrix factorization achieved the desired threshold of RMSE < 0.86490. The final test set RMSE was 0.83461 and the final validation set RMSE was 0.83396. Results are presented by method in the summary table below.

[Image: final summary table of RMSE by method]
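For readers unfamiliar with matrix factorization, the idea behind the final model can be sketched as below. The README does not say which implementation or library the project used, so this is a generic stochastic gradient descent version in Python, not the author's method. The toy ratings, the function names, the hyperparameters (`k`, `lr`, `reg`, `epochs`), and the small L2 penalty are all assumptions made for illustration.

```python
import math
import random

# Toy (user_index, movie_index, rating) triples, invented for illustration.
ratings = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 4.5), (1, 1, 4.0),
    (2, 1, 1.5), (2, 2, 5.0),
    (3, 0, 1.0), (3, 2, 4.5),
]
n_users, n_movies = 4, 3


def train_mf(rows, n_users, n_movies, k=2, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Fit user factors P and movie factors Q with plain SGD.

    A rating is approximated as mu + dot(P[u], Q[i]), where mu is the
    global mean. All hyperparameter defaults are hypothetical.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(n_movies)]
    mu = sum(r for _, _, r in rows) / len(rows)
    data = list(rows)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - (mu + sum(p * q for p, q in zip(P[u], Q[i])))
            for f in range(k):
                p_uf, q_if = P[u][f], Q[i][f]
                # Gradient step on squared error with an assumed L2 penalty.
                P[u][f] += lr * (err * q_if - reg * p_uf)
                Q[i][f] += lr * (err * p_uf - reg * q_if)
    return mu, P, Q


def predict(mu, P, Q, u, i):
    """Predicted rating of movie i by user u."""
    return mu + sum(p * q for p, q in zip(P[u], Q[i]))


def rmse(rows, mu, P, Q):
    """Root mean squared error of the model on the given triples."""
    squared = [(r - predict(mu, P, Q, u, i)) ** 2 for u, i, r in rows]
    return math.sqrt(sum(squared) / len(squared))


# epochs=0 returns the random starting factors, for a before/after comparison.
before = rmse(ratings, *train_mf(ratings, n_users, n_movies, epochs=0))
mu, P, Q = train_mf(ratings, n_users, n_movies)
after = rmse(ratings, mu, P, Q)

print(f"training RMSE before: {before:.5f}, after: {after:.5f}")
print(f"predicted rating for user 1, movie 2: {predict(mu, P, Q, 1, 2):.2f}")
```

The sketch only reports training error on nine made-up ratings. A real evaluation would fit on the training subset and compute RMSE on held-out data, as the project does with its test and validation sets.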

Additional Notes

Certificate info can be found here

Contributors

  • edenaxe
