Principal Component Analysis (PCA) Intuition

peterwashington, Nov 02 2021
Dimensionality Reduction

One of the simplest methods of dimensionality reduction is Principal Component Analysis, or PCA. The intuition behind how it works is straightforward.
Let’s say we want to apply PCA to this two-dimensional dataset:

The goal of PCA in this case is to represent the data points in 1 dimension instead of 2. Conceptually, PCA does this by finding a new axis such that the variance of the data along that axis is as large as possible. In the dataset above, this would be the following new axis (called principal component 1):
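To make "the axis that maximizes the variance" concrete, here is a minimal sketch in plain Python. The dataset is made up for illustration (it is not the data from the figure), and the function names `first_principal_axis` and `variance_along` are hypothetical helpers, not part of any library. For 2D data, the angle of the maximum-variance direction has a closed form in terms of the covariance entries, so no linear algebra library is needed here.

```python
import math

# Hypothetical 2D dataset, made up for illustration.
points = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

def first_principal_axis(points):
    """Return (mean, unit direction) of the maximum-variance axis for 2D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Closed-form angle of the direction along which the variance is largest (2D only).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(theta), math.sin(theta))

def variance_along(points, mean, direction):
    """Sample variance of the points measured along a unit direction."""
    n = len(points)
    scores = [(x - mean[0]) * direction[0] + (y - mean[1]) * direction[1]
              for x, y in points]
    return sum(s * s for s in scores) / (n - 1)

mean, pc1 = first_principal_axis(points)
print("principal component 1 direction:", pc1)
print("variance along it:", variance_along(points, mean, pc1))
```

Trying `variance_along` with any other unit direction should give a value no larger than the one along `pc1`, which is exactly the property that defines principal component 1.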

To see the data plotted in principal component space, we project the points onto the principal component axis and then plot those points on the axis. This is easier to visualize than to explain. We first draw lines, perpendicular to the principal component axis, from each data point to the principal component line:

We then move the points onto the line:
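The "draw a perpendicular line, then move the point onto the axis" step is an orthogonal projection. The sketch below assumes the axis is already known and is described by a point on it (`origin`) and a unit vector (`direction`); both values, and the helper name `project_onto_axis`, are hypothetical and chosen only for illustration.

```python
import math

def project_onto_axis(point, origin, direction):
    """Orthogonally project a 2D point onto the line through `origin`
    along the unit vector `direction`.

    Returns the projected point and its signed position t along the line.
    """
    dx, dy = point[0] - origin[0], point[1] - origin[1]
    # Dot product with the unit direction gives the position along the axis.
    t = dx * direction[0] + dy * direction[1]
    projected = (origin[0] + t * direction[0], origin[1] + t * direction[1])
    return projected, t

# Hypothetical axis: the 45-degree line through (0, 0).
origin = (0.0, 0.0)
direction = (1 / math.sqrt(2), 1 / math.sqrt(2))

point = (2.0, 0.0)
projected, t = project_onto_axis(point, origin, direction)

# The connector from the point to its projection is perpendicular to the axis,
# so its dot product with the axis direction is (numerically) zero.
connector = (projected[0] - point[0], projected[1] - point[1])
dot = connector[0] * direction[0] + connector[1] * direction[1]
print(projected, t, dot)
```

The value `t` is the point's position along the principal component line, which is what ends up being plotted in 1D.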

We can rotate the line to see the visualization in 1D principal component space in a standard way:
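Rotating the line to make it horizontal can also be sketched in a few lines. The example assumes a hypothetical axis through `origin` at an angle `theta`, and a handful of made-up points that already lie on that axis. Rotating them by `-theta` about `origin` lays them flat, and the resulting x-values are the 1D coordinates.

```python
import math

def rotate(point, origin, angle):
    """Rotate a 2D point counterclockwise by `angle` radians about `origin`.

    The result is expressed relative to `origin` (so `origin` maps to (0, 0)).
    """
    dx, dy = point[0] - origin[0], point[1] - origin[1]
    c, s = math.cos(angle), math.sin(angle)
    return (c * dx - s * dy, s * dx + c * dy)

# Hypothetical axis: through (3, 3) at 45 degrees.
origin = (3.0, 3.0)
theta = math.radians(45)

# Made-up points lying on the axis, at these signed positions along it.
positions = [-2.0, -0.5, 1.0, 2.5]
on_axis = [(origin[0] + t * math.cos(theta), origin[1] + t * math.sin(theta))
           for t in positions]

# Rotating by -theta turns the axis into the horizontal axis.
flattened = [rotate(p, origin, -theta) for p in on_axis]
coords_1d = [x for x, _ in flattened]
print(coords_1d)
```

After the rotation every y-value is zero up to rounding, so nothing is lost by keeping only the x-values: the data now lives in one dimension.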

The math behind PCA does exactly what we visualized above, expressed in the language of linear algebra, so understanding it requires a working knowledge of that subject. We will cover the math behind PCA in a separate blog post.