Clustering Intuition

peterwashington, Nov 01 2021
Unsupervised

The intuition for clustering is simple, and a few examples make it concrete. Imagine you are a professor with a set of student scores for an exam, and you want a quantitative way to choose letter grade cutoffs. A clustering algorithm can find where the numeric scores naturally bunch together, and the gaps between those groups suggest where the cutoffs belong. For example, picture the scores plotted on a number line.

If you want 6 different letter grades for your students, you could input “6” as the number of clusters, and the algorithm might return six groups of scores whose boundaries serve as the cutoffs.
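To make the grade example concrete, here is a minimal sketch in Python. It assumes k-means as the clustering algorithm (the post does not name one), the exam scores are invented for illustration, and the helper names `kmeans_1d` and `cutoffs` are hypothetical. The starting centers are picked in a deliberately simple, deterministic way so the result is repeatable.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster numbers into k groups (k >= 2) with the basic k-means loop."""
    data = sorted(values)
    # Simplistic deterministic start: k values spread evenly through the sorted data.
    # Practical implementations usually choose starting centers at random.
    centers = [data[i * (len(data) - 1) // (k - 1)] for i in range(k)]
    groups = []
    for _ in range(iterations):
        # Assignment step: each score joins the group with the nearest center.
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group.
        new_centers = [sum(g) / len(g) if g else centers[j]
                       for j, g in enumerate(groups)]
        if new_centers == centers:  # nothing moved, so we are done
            break
        centers = new_centers
    return centers, groups


def cutoffs(centers):
    """Place a boundary halfway between each pair of neighboring centers."""
    ordered = sorted(centers)
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


# Invented exam scores, for illustration only.
scores = [95, 93, 97, 85, 87, 84, 75, 77, 74, 65, 63, 66, 55, 53, 56, 40, 42, 38]

centers, groups = kmeans_1d(scores, k=6)
boundaries = cutoffs(centers)

for group in groups:
    print(group)
print([round(b, 1) for b in boundaries])
```

With six clusters there are five boundaries, one between each pair of neighboring groups. On messier real data the groups will rarely be this tidy, and the result can depend on how the starting centers are chosen.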

Now a two-dimensional example: you’re a music streaming service, and you want to sort your users into advertising groups based on the percentage of their listening that is pop rock and the percentage that is pop rap. Each user becomes a point on a two-dimensional plot, and a clustering algorithm might group together the users whose points lie close to one another.

(For this example to work mathematically, we assume that users can also listen to other categories of music that we don’t consider when clustering, so the two percentages need not add up to 100%.)
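The streaming example can be sketched the same way. Again this assumes k-means, which the post does not specify; the user data, the choice of three groups, and the function name `kmeans` are all made up for illustration. Each user is a pair (pop rock %, pop rap %), and "closeness" is ordinary straight-line distance.

```python
import math


def kmeans(points, k, iterations=100):
    """Group points (tuples of equal length) into k clusters (k >= 2)."""
    # Simplistic deterministic start: k points spread evenly through the input list.
    # Practical implementations usually choose starting centers at random.
    centers = [points[i * (len(points) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point gets the index of its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(centers[j])
        if new_centers == centers:  # nothing moved, so we are done
            break
        centers = new_centers
    return labels, centers


# Invented users: (percent pop rock, percent pop rap).
# The pairs do not sum to 100 because other genres are ignored.
users = [(70, 5), (65, 10), (75, 8),
         (10, 60), (5, 70), (12, 65),
         (30, 30), (35, 28), (28, 33)]

labels, centers = kmeans(users, k=3)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Here the three groups are mostly-rock listeners, mostly-rap listeners, and users who listen to a moderate amount of both. The label numbers themselves carry no meaning; they only say which users ended up together.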

In real-world applications, data points are typically clustered on many more than two input variables.
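Adding variables does not change the idea, only the length of each data point. As a rough sketch, assuming Euclidean distance and some already-computed cluster centers (the centers, genre list, and the function name `nearest_center` below are hypothetical), assigning a user described by four percentages looks just like the two-variable case:

```python
import math

# Hypothetical cluster centers over four genres:
# (pop rock %, pop rap %, classical %, jazz %)
centers = {
    "rock fans": (60, 10, 5, 5),
    "rap fans": (10, 65, 0, 5),
    "classical and jazz fans": (5, 0, 50, 40),
}


def nearest_center(point, centers):
    """Return the name of the center closest to point (Euclidean distance)."""
    return min(centers, key=lambda name: math.dist(point, centers[name]))


# An invented user described by the same four percentages.
new_user = (8, 2, 45, 35)
print(nearest_center(new_user, centers))  # classical and jazz fans
```

`math.dist` accepts points of any length, so the same function would work unchanged with ten or a hundred variables per user.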