Movie Recommender System

Today, I’m excited to share my journey of building a movie recommender system using the MovieLens dataset. In this project, I experimented with several techniques, including collaborative filtering, matrix factorization, and AutoRec. After rigorous testing and tweaking, I found that the deep learning implementation of matrix factorization delivered the best results. It outperformed the other methods by a significant margin.

Data

I used a small subset of the MovieLens dataset available from TensorFlow Datasets. It contains user ratings for many movies, which makes it a good fit for building a recommender system. To manage the data efficiently, I parsed it into three dictionaries:

  1. user2movie: Maps each user to the movies they have rated.
  2. movie2user: Maps each movie to the users who have rated it.
  3. usermovie2ratings: Stores the actual ratings that users have given to movies.
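
A minimal sketch of how these three dictionaries could be built, assuming the ratings arrive as (user, movie, rating) triples. Only the dictionary names come from the project; the `build_lookups` helper and the sample triples are hypothetical.

```python
from collections import defaultdict


def build_lookups(ratings):
    """Build the three lookup dictionaries from (user, movie, rating) triples."""
    user2movie = defaultdict(list)   # user id -> movies that user rated
    movie2user = defaultdict(list)   # movie id -> users who rated that movie
    usermovie2ratings = {}           # (user id, movie id) -> rating
    for user, movie, rating in ratings:
        user2movie[user].append(movie)
        movie2user[movie].append(user)
        usermovie2ratings[(user, movie)] = rating
    return dict(user2movie), dict(movie2user), usermovie2ratings


# Hypothetical sample data, not taken from MovieLens.
sample = [(0, 10, 4.0), (0, 11, 3.0), (1, 10, 5.0)]
user2movie, movie2user, usermovie2ratings = build_lookups(sample)
print(user2movie)  # {0: [10, 11], 1: [10]}
```

Keying the ratings by a (user, movie) tuple keeps lookups constant-time without storing the full user-movie matrix.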

For the AutoRec model, I created a sparse representation of the user-movie matrix and the ratings matrix. AutoRec needs the ratings in this form because it is an autoencoder: it takes a vector of ratings as input and reconstructs it, predicting the missing ratings along the way.

For Matrix Factorization, the process was even more straightforward. Since I used Keras embeddings, which look up a vector for each integer ID, I didn’t need to transform the data in any specific way.

With the data organized and prepped, I was ready to dive into the modeling phase.

Model Building and Training

Building and training the movie recommender system was an experiment in itself. I explored three different models: Collaborative Filtering, Matrix Factorization, and AutoRec. Each model brought its own challenges and learning experiences. Here's a detailed look at how each model performed.

Collaborative Filtering

User-based Collaborative Filtering: This model focused on finding similarities between users to predict ratings. It was relatively quick to train and produced decent results.

  • Train MSE: 0.6882
  • Test MSE: 1.0906
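
The user-based idea can be sketched in plain Python. This is a generic textbook formulation (Pearson-weighted deviations from each user's mean), which is an assumption and not necessarily the exact variant used in this project. The `pearson` and `predict` helpers and the toy ratings are hypothetical.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {movie: rating}.
ratings = {
    "a": {"m1": 5.0, "m2": 3.0, "m3": 4.0},
    "b": {"m1": 4.0, "m2": 2.0, "m3": 3.0, "m4": 5.0},
    "c": {"m1": 1.0, "m2": 5.0, "m4": 2.0},
}


def mean(user):
    """Average of all ratings given by one user."""
    vals = ratings[user].values()
    return sum(vals) / len(vals)


def pearson(u, v):
    """Pearson-style correlation over the movies both users rated."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    du = [ratings[u][m] - mean(u) for m in common]
    dv = [ratings[v][m] - mean(v) for m in common]
    denom = sqrt(sum(x * x for x in du)) * sqrt(sum(y * y for y in dv))
    return sum(x * y for x, y in zip(du, dv)) / denom if denom else 0.0


def predict(user, movie):
    """Predict a rating as the user's mean plus weighted neighbor deviations."""
    num = den = 0.0
    for other in ratings:
        if other == user or movie not in ratings[other]:
            continue
        w = pearson(user, other)
        num += w * (ratings[other][movie] - mean(other))
        den += abs(w)
    base = mean(user)
    return base + num / den if den else base


prediction = predict("a", "m4")
```

Here user "b" rates like "a" and user "c" rates the opposite way, so both neighbors push the prediction for "m4" above "a"'s own average. The item-based variant swaps the roles of users and movies.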

Item-based Collaborative Filtering: Contrary to expectations, this model took longer to train than the user-based approach and performed worse. The literature suggests item-item models can be faster and more accurate, but my results differed.

  • Train MSE: 2.2444
  • Test MSE: 4.4829

Matrix Factorization

Simple Matrix Factorization: Implemented using TensorFlow and Keras, this model was trained for 10 epochs with a batch size of 32, using Mean Squared Error (MSE) as the loss function and Adam as the optimizer. Training and validation loss decreased steadily at first, then plateaued and rose slightly towards the end.

[Figure: MSE plot (mse_2)]
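
To show what the factorization itself does, here is a framework-free sketch. It swaps the Keras embeddings and Adam optimizer for hand-written per-rating SGD in plain Python, and it assumes a global-mean offset and L2 regularization that the project may not use. The `train_mf` helper, its hyperparameters, and the toy triples are all hypothetical.

```python
import random


def train_mf(triples, n_users, n_movies, k=2, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Fit rating(u, m) ~ mu + dot(P[u], Q[m]) by SGD on squared error."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in triples) / len(triples)  # global mean rating
    # Small random latent vectors, one per user and one per movie.
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_movies)]

    def predict(u, m):
        return mu + sum(p * q for p, q in zip(P[u], Q[m]))

    history = []  # training MSE after each epoch
    for _ in range(epochs):
        order = triples[:]
        rng.shuffle(order)
        for u, m, r in order:
            err = r - predict(u, m)
            for f in range(k):
                pu, qm = P[u][f], Q[m][f]
                P[u][f] += lr * (err * qm - reg * pu)
                Q[m][f] += lr * (err * pu - reg * qm)
        history.append(sum((r - predict(u, m)) ** 2 for u, m, r in triples) / len(triples))
    return P, Q, mu, history


# Hypothetical toy data: (user index, movie index, rating).
triples = [(0, 0, 5), (0, 1, 4), (0, 2, 1), (1, 0, 4),
           (1, 2, 2), (2, 0, 1), (2, 1, 2), (2, 2, 5)]
P, Q, mu, history = train_mf(triples, n_users=3, n_movies=3)
```

The rows of `P` and `Q` play the same role as the rows of the user and movie embedding tables in a Keras model: each rating is predicted from the dot product of one user vector and one movie vector.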

Matrix Factorization with Deep Learning: This approach extended the matrix factorization model with deep learning techniques such as dense layers, dropout, batch normalization, and ReLU activation. Trained with the Stochastic Gradient Descent (SGD) optimizer using a learning rate of 0.01 and momentum of 0.9, the model showed a balanced improvement over the epochs.

[Figure: MSE plot (mse_3)]

Matrix Factorization with Residual Networks: Combining the simplicity of matrix factorization with the power of deep learning, I introduced residual connections to improve the learning capability. This hybrid model also used the SGD optimizer with similar parameters and showed promising results.

[Figure: MSE plot (mse_4)]

AutoRec

For the AutoRec model, I used a sparse representation of the user-movie matrix. I built custom train and test data generators that used masking so the model focused only on relevant data points, which turned out to be crucial. I also implemented a custom loss function and trained the model with the SGD optimizer (learning rate of 0.08, momentum of 0.9) for 10 epochs.

[Figure: MSE plot (mse_5)]
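
The masking idea can be sketched without any framework. This assumes each user's ratings are expanded into a dense row with a 0/1 mask marking the observed entries, and that the custom loss averages squared error over observed entries only. The project's actual encoding and loss may differ, and `to_row` and `masked_mse` are hypothetical helpers.

```python
def to_row(user_ratings, n_movies):
    """Turn {movie_index: rating} into a dense row plus a 0/1 mask."""
    row = [0.0] * n_movies
    mask = [0.0] * n_movies
    for movie, rating in user_ratings.items():
        row[movie] = rating
        mask[movie] = 1.0
    return row, mask


def masked_mse(target, predicted, mask):
    """Mean squared error over observed entries only (mask == 1)."""
    observed = sum(mask)
    if observed == 0:
        return 0.0
    total = sum(m * (t - p) ** 2 for t, p, m in zip(target, predicted, mask))
    return total / observed


row, mask = to_row({0: 4.0, 3: 2.0}, n_movies=5)
reconstruction = [3.0, 1.0, 2.5, 2.0, 4.0]  # hypothetical model output
loss = masked_mse(row, reconstruction, mask)
print(loss)  # 0.5
```

Without the mask, the three unrated movies would be treated as ratings of 0 and would dominate the loss, pulling the reconstruction towards zero.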

Each model brought its own strengths and weaknesses, but the deep learning-enhanced matrix factorization models consistently performed well, striking a balance between accuracy and training time. This detailed experimentation provided valuable insights into the nuances of building effective recommender systems.

Conclusion

My goal was to build a robust movie recommender system. Experimenting with Collaborative Filtering, Matrix Factorization, and AutoRec gave me valuable insights into the strengths and limitations of each. In my experiments, Matrix Factorization enhanced with deep learning produced the most accurate recommendations. Given its accuracy and efficiency, it is a strong candidate for deployment on a server for real-time recommendations and worth considering for real-world movie recommendation systems.
