Section Recap
Introduction
In this section, you learned about clustering. Here are the key takeaways from this section.
Objectives
You will be able to:
- Understand and explain what was covered in this section
- Understand and explain why this section will help you become a data scientist
Key Takeaways
The key takeaways from this section include:
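- There are two main types of clustering algorithms: non-hierarchical clustering, such as K-Means, and hierarchical agglomerative clustering
- You can quantify the performance of a clustering algorithm using metrics such as the variance ratio (also known as the Calinski-Harabasz score), where a higher value indicates more compact, better-separated clusters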
- When working with the K-Means clustering algorithm, it is useful to create elbow plots, which chart a performance metric against candidate values of K, to find an optimal value for K
- When using hierarchical agglomerative clustering, different linkage criteria can be used to determine which clusters should be merged and at what point
- Dendrograms and clustergrams are very useful visual tools in hierarchical agglomerative clustering
- Advantages of K-Means clustering include easy implementation and speed, while a main disadvantage is that it isn't always straightforward to pick the "right" value for K
- Advantages of hierarchical agglomerative clustering include easy visualization and intuitiveness, while a main disadvantage is that the result depends heavily on the chosen distance metric
- Supervised and unsupervised learning can be combined effectively; applications include look-alike models in market segmentation and semi-supervised learning
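
As a reminder of what the variance ratio measures, here is a minimal sketch that computes it by hand using only the Python standard library. The function names (`centroid`, `sq_dist`, `variance_ratio`) and the toy data are illustrative assumptions, not code from this section; in practice you would more likely call scikit-learn's `calinski_harabasz_score`.

```python
from statistics import fmean


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(fmean(coords) for coords in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def variance_ratio(points, labels):
    """Between-cluster dispersion over within-cluster dispersion,
    each divided by its degrees of freedom (k - 1 and n - k)."""
    n = len(points)
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if not 1 < k < n:
        raise ValueError("need more than 1 and fewer than n clusters")

    overall = centroid(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        center = centroid(members)
        between += len(members) * sq_dist(center, overall)
        within += sum(sq_dist(p, center) for p in members)

    if within == 0:
        # Choice made for this sketch only: perfectly tight clusters
        # are reported as an infinite ratio.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: two visibly separate groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible groups
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the groups together

print(variance_ratio(points, good_labels))
print(variance_ratio(points, bad_labels))
```

The labeling that matches the two visible groups scores far higher than the mixed labeling, which is the behavior you rely on when comparing candidate values of K.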