Market Basket Analysis is a set of techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that frequently occur together in transactions. In other words, it lets retailers identify relationships between the items that people buy.
One market basket analysis technique is the Apriori algorithm, an algorithm for frequent item set mining and association rule learning over transactional databases. It proceeds by identifying the frequent individual items in the database and extending them to larger and larger item sets, as long as those item sets appear sufficiently often in the database.
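The level-by-level growth described above can be sketched in plain Python. This is a minimal illustration, not the code used in this kernel: the `apriori` function, the `min_support` threshold, and the toy `baskets` data are all assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support=0.5):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Level 1: frequent individual items.
    items = {item for t in transactions for item in t}
    current = keep_frequent({frozenset([i]) for i in items})

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = keep_frequent(candidates)
    return frequent


# Hypothetical toy transactions, not the GroceryStoreDataSet.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "butter"],
    ["bread", "milk", "butter"],
]
frequent_itemsets = apriori(baskets, min_support=0.6)
for itemset, sup in sorted(frequent_itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), sup)
```

With this toy data, every single item and every pair clears the 0.6 threshold, while the three-item set appears in only two of five baskets and is dropped.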
A recommender system is an algorithm whose aim is to provide the most relevant information to a user by discovering patterns in a dataset. The algorithm predicts how a user would rate items and shows the items they are likely to rate highly. You see recommendation in action when Amazon suggests items to you or when Netflix recommends certain movies. Recommender systems are also used by music streaming applications such as Spotify and Deezer to suggest music that you might like.
Below is a very simple illustration of how recommender systems work in the context of an e-commerce site.
This kernel demonstrates recommendation engine techniques through several case studies. It covers three main categories of recommender system:
Content-based filtering, also referred to as cognitive filtering, recommends items based on a comparison between the content of the items and a user profile. The content of each item is represented as a set of descriptors or terms, typically the words that occur in a document. The user profile is represented with the same terms and built up by analyzing the content of items which have been seen by the user.
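A minimal sketch of this idea, assuming plain word counts as the item descriptors and cosine similarity as the comparison (the table further down lists cosine similarity for the content-based case). The item texts, the `seen` list, and the helper names are hypothetical and are not taken from this kernel.

```python
import math
from collections import Counter


def term_vector(text):
    # Represent an item by the counts of the words that occur in it.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical item descriptions.
items = {
    "space_doc": "rocket launch orbit space mission",
    "cooking_doc": "recipe oven bake flour sugar",
    "astro_doc": "telescope orbit space galaxy",
    "baking_doc": "bake bread flour yeast oven",
}
seen = ["space_doc"]  # items the user has already viewed

# Build the user profile from the terms of the items the user has seen.
profile = Counter()
for name in seen:
    profile.update(term_vector(items[name]))

# Score every unseen item against the profile and rank by similarity.
scores = {name: cosine(profile, term_vector(text))
          for name, text in items.items() if name not in seen}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], round(scores[ranked[0]], 3))
```

Here `astro_doc` ranks first because it shares the terms "orbit" and "space" with the profile, while the cooking and baking items share none. A real pipeline would usually weight terms (for example with TF-IDF) instead of using raw counts.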
Collaborative filtering (CF) is the process of filtering information by collecting human judgments (ratings), a form of automated "word of mouth". It is a technique commonly used to build personalized recommendations on the Web.
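To make the "word of mouth" idea concrete, here is a small user-based neighborhood sketch: a missing rating is predicted as the similarity-weighted average of other users' ratings. Note that this is a simpler technique than the SVD model listed in the table further down, and the `ratings` data and function names are invented for illustration.

```python
import math

# Hypothetical user -> {item: rating} data.
ratings = {
    "alice": {"Matrix": 5, "Titanic": 1, "Inception": 4},
    "bob": {"Matrix": 4, "Titanic": 2, "Inception": 5,
            "Interstellar": 5, "Notebook": 1},
    "carol": {"Matrix": 1, "Titanic": 5, "Interstellar": 2, "Notebook": 5},
}


def similarity(u, v):
    # Cosine similarity over the items both users have rated.
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict(user, item):
    # Similarity-weighted average of other users' ratings for the item.
    num = den = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        weight = similarity(user, other)
        num += weight * their_ratings[item]
        den += weight
    return num / den if den else None


for title in ("Interstellar", "Notebook"):
    print(title, round(predict("alice", title), 2))
```

Alice's ratings resemble Bob's more than Carol's, so the prediction for "Interstellar" (which Bob rated highly) comes out above the one for "Notebook". `predict` returns `None` when nobody else has rated the item.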
A hybrid filtering algorithm is a combination of content-based filtering and collaborative filtering.
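One common way to combine the two is a weighted blend of their scores. The sketch below assumes both score sets are already normalized to the range 0 to 1. The `hybrid_scores` function, the `alpha` weight, and the sample scores are hypothetical, and this kernel does not specify which combination scheme it uses.

```python
def hybrid_scores(content_scores, collab_scores, alpha=0.5):
    """Blend two score dicts: alpha * content + (1 - alpha) * collaborative.

    Items missing from one source contribute 0.0 from that source.
    """
    all_items = set(content_scores) | set(collab_scores)
    return {
        item: alpha * content_scores.get(item, 0.0)
        + (1 - alpha) * collab_scores.get(item, 0.0)
        for item in all_items
    }


# Hypothetical normalized scores from the two recommenders.
content = {"A": 0.9, "B": 0.2}
collab = {"A": 0.1, "B": 0.8, "C": 0.6}

blended = hybrid_scores(content, collab, alpha=0.7)
best = max(blended, key=blended.get)
print(best, round(blended[best], 2))
```

Raising `alpha` favors the content-based signal, which can help for new users with few ratings. Lowering it favors the collaborative signal.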
Technologies Used
- Libraries: numpy, pandas, matplotlib, math, statsmodels, scipy, nltk, ast, sklearn, surprise, pyodbc
- Web scraping: bs4, requests
- Notepad, OpenOffice
- Jupyter Notebook
The table below lists the recommender algorithms currently available in the repository.
Type | Algorithm | Dataset |
---|---|---|
Market Basket Analysis | Apriori | GroceryStoreDataSet |
Content Based | Cosine Similarity | Web Scraping |
Collaborative | Singular Value Decomposition (SVD) | rating_small.csv |
Hybrid | 0 | |
A/B testing | Z-test | ab-test kaggle |
The data story behind each recommender algorithm is described in the corresponding practice case in this kernel.