- Status: Completed
In this repository, I compare different recommendation systems and assess which ones perform better in different scenarios. Recommendation Systems (RS) provide suggestions that support decision-making, such as what product to purchase, what music to listen to, or what online news to read. They are particularly useful when an individual needs to choose an item from a potentially overwhelming number of items that a service may offer. There are many types of RS:
- Content-Based Filtering: Recommendations based on product attributes and their similarities.
- Collaborative Filtering (CF): Uses the 'wisdom of the crowd' to match recommendations to users.
  - Memory-based CF: Relies directly on historical rating data to produce recommendations.
    - User-based filtering: Based on the ratings of users with similar tastes.
    - Item-based filtering: Based on items that users have rated similarly to the ones the target user liked.
  - Model-based CF: Learns underlying patterns in the data to predict the best recommendations.
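To make the two memory-based variants above concrete, here is a minimal sketch using only NumPy. The `ratings` matrix, the function names, and the choice to treat unrated entries as zeros when computing cosine similarity are all illustrative assumptions, not the code used in this repository.

```python
import numpy as np

# Hypothetical toy ratings: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def cosine_similarity_matrix(m):
    """Pairwise cosine similarity between the rows of m."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = m / norms
    return unit @ unit.T

def predict_user_based(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    user_sim = cosine_similarity_matrix(ratings)
    rated = ratings[:, item] > 0   # users who rated the item
    rated[user] = False            # exclude the target user
    weights = user_sim[user, rated]
    if weights.sum() == 0:
        return 0.0                 # no usable neighbours
    return float(weights @ ratings[rated, item] / weights.sum())

def predict_item_based(ratings, user, item):
    """Similarity-weighted average of the user's own ratings on other items."""
    item_sim = cosine_similarity_matrix(ratings.T)
    rated = ratings[user] > 0      # items the user has rated
    rated[item] = False            # exclude the target item
    weights = item_sim[item, rated]
    if weights.sum() == 0:
        return 0.0                 # no usable neighbours
    return float(weights @ ratings[user, rated] / weights.sum())

# User 0 likes items 0 and 1 and dislikes item 3, which resembles item 2,
# so both variants predict a low rating for item 2.
user_pred = predict_user_based(ratings, user=0, item=2)
item_pred = predict_item_based(ratings, user=0, item=2)
```

Both functions share the same similarity helper; the only difference is whether similarity is computed between rows (users) or columns (items) of the rating matrix.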
The objective of this project is to build a recommender system that suggests movies to a user. To do so, we relied on an extensive movies dataset from the University of California Irvine (UCI) and built recommender systems using three approaches: Content-based Filtering, Collaborative Filtering (model-based), and Collaborative Filtering (memory-based).
- Importing Data and libraries;
- Exploratory Data Analysis;
- Development and testing of the Content-based recommender system;
- Development of Collaborative Filtering systems.
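As an illustration of the content-based step listed above, the following sketch ranks movies by the cosine similarity of their attribute vectors. The movie titles, the genre list, and the `recommend_similar` helper are hypothetical placeholders; the actual features used in this project may differ.

```python
import numpy as np

# Hypothetical catalogue: each movie is a binary vector over these genres.
genres = ["action", "comedy", "drama", "sci-fi"]
movies = {
    "Movie A": [1, 0, 0, 1],
    "Movie B": [1, 0, 0, 0],
    "Movie C": [0, 1, 1, 0],
    "Movie D": [0, 0, 1, 0],
}

def recommend_similar(title, movies, top_n=2):
    """Return the top_n movies whose attribute vectors are closest to `title`."""
    names = list(movies)
    features = np.array([movies[n] for n in names], dtype=float)
    # Normalise each row; assumes every movie has at least one attribute set.
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    scores = unit @ unit[names.index(title)]
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [(n, float(s)) for n, s in ranked if n != title][:top_n]

similar_to_a = recommend_similar("Movie A", movies)
```

No user ratings are involved here: the ranking depends only on item attributes, which is why this approach still works for items that nobody has rated yet.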
List of technologies:
To conclude, every recommender system has its pros and cons, and which one suits best depends on the problem and the data available. Collaborative Filtering usually performs better, but requires higher computational capacity (and sometimes more data). In this case, the model-based approach with SVD matrix factorization achieved a lower RMSE than the memory-based approach with cosine similarity. The content-based approach is simpler, though its performance cannot be measured in the same way, and it is less likely to outperform the other two.
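For readers unfamiliar with the metric and the factorization mentioned above, here is a rough sketch of RMSE and of a rank-k SVD reconstruction in NumPy. It is an assumption-laden illustration: it uses a plain truncated SVD on a mean-imputed matrix, which is not necessarily the SVD variant used in this project, and the `toy` matrix is made up.

```python
import numpy as np

def rmse(predicted, actual):
    """Root mean squared error between two equally sized sequences."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def svd_predictions(ratings, k):
    """Rank-k reconstruction of a ratings matrix whose missing entries are NaN."""
    mask = ~np.isnan(ratings)
    global_mean = ratings[mask].mean()
    filled = np.where(mask, ratings, global_mean)  # impute missing entries
    centered = filled - global_mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return global_mean + (u[:, :k] * s[:k]) @ vt[:k]

# Hypothetical toy matrix: rows are users, columns are movies, NaN = unrated.
toy = np.array([
    [5, 4, np.nan, 1],
    [4, 5, 1, np.nan],
    [1, np.nan, 5, 4],
    [np.nan, 1, 4, 5],
])
known = ~np.isnan(toy)

low_rank = svd_predictions(toy, k=2)
train_error = rmse(low_rank[known], toy[known])  # error on the known ratings
```

In a real comparison, the RMSE would be computed on ratings held out from training rather than on the known entries, so that the model-based and memory-based approaches are scored on data neither has seen.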