6.3.2. Embedding

Modern datasets rarely come neatly packaged with just a few features. In practice, we often work with tens, hundreds, or even thousands of dimensions. While more features can provide richer information, they also introduce significant challenges.

High-dimensional data creates three major difficulties:

  • Visualization becomes impractical. Beyond three dimensions, we cannot directly see structure. Patterns hidden in a 100-column dataset are difficult to interpret intuitively.

  • The curse of dimensionality emerges. As dimensionality increases, distances between points become less informative. Algorithms that rely on geometry, such as clustering or k-nearest neighbors, begin to degrade.

  • Redundancy increases. Many features are correlated or carry overlapping information, leading to unnecessary computational cost and noise.
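The distance effect in the second point is easy to observe directly. The sketch below (an illustration, not part of the original text) samples random points in the unit hypercube and measures the contrast between the farthest and nearest neighbor of a query point; as the dimension grows, that contrast collapses, so "nearest" stops meaning much:

```python
import numpy as np

rng = np.random.default_rng(0)

def distance_contrast(dim, n_points=500):
    """(max - min) / min over distances from a random query point
    to random points in the unit hypercube [0, 1]^dim."""
    points = rng.random((n_points, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

for dim in (2, 10, 100, 1000):
    # Contrast shrinks as dimension grows: distances concentrate
    # around a common value, making neighbors nearly indistinguishable.
    print(f"dim={dim:5d}  contrast={distance_contrast(dim):.2f}")
```

In low dimensions the nearest point is far closer than the farthest one; in 1000 dimensions all 500 distances cluster tightly around the same value.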

Dimensionality reduction addresses these issues by transforming data from a high-dimensional space into a lower-dimensional representation. The goal is not merely compression, but compression that preserves meaningful structure.
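To make "compression that preserves structure" concrete, here is a minimal sketch (assuming a synthetic dataset, not one from the text) where 50 observed features are driven by only 3 latent directions. A principal-component projection, computed here via the SVD, recovers a 3-dimensional representation that retains almost all of the variance:

```python
import numpy as np

rng = np.random.default_rng(42)

# 200 samples in 50 dimensions, but almost all variance
# lives in 3 latent directions plus a little noise.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 50))

# PCA via SVD: project centered data onto the top-k right singular vectors.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 3
X_reduced = Xc @ Vt[:k].T  # 200 x 3 representation

# Fraction of total variance captured by the k components.
explained = (S[:k] ** 2).sum() / (S ** 2).sum()
print(f"variance retained by {k} components: {explained:.1%}")
```

The 50-dimensional dataset shrinks to 3 columns while losing almost no information, which is exactly the kind of structure-preserving compression described above.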

Two Broad Goals

Different dimensionality reduction techniques preserve different aspects of the data. Most methods emphasize one or both of the following objectives:

  • Variance preservation: Capture the directions along which the data varies most. This is useful for noise reduction, feature extraction, and preparing inputs for supervised learning models.

  • Neighborhood preservation: Maintain local relationships so that points that are close in the original space remain close in the reduced space. This is especially valuable for visualization and discovering cluster structure.

Understanding these differences is essential. Some methods are best suited for preparing data for downstream models, while others are designed primarily for exploration and visual insight.
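Neighborhood preservation can itself be measured: for each point, compare its k nearest neighbors before and after the reduction and count the overlap. The sketch below (a simple illustration on synthetic data; the overlap metric is one common choice, not the only one) applies a variance-preserving PCA projection to two well-separated clusters and checks how much local neighbor structure survives:

```python
import numpy as np

rng = np.random.default_rng(1)

def knn_indices(X, k):
    """Indices of each point's k nearest neighbors (excluding itself)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

# Two well-separated Gaussian clusters in 20 dimensions.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 20)),
    rng.normal(8.0, 1.0, size=(50, 20)),
])

# Reduce to 2D with PCA (a variance-preserving method).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X2 = Xc @ Vt[:2].T

# Fraction of each point's original k-NN that remain k-NN in 2D.
k = 10
overlap = np.mean([
    len(set(a) & set(b)) / k
    for a, b in zip(knn_indices(X, k), knn_indices(X2, k))
])
print(f"fraction of {k}-NN preserved after reduction: {overlap:.2f}")
```

The coarse cluster structure survives (every point's 2D neighbors come from its own cluster), but the fine within-cluster neighbor ordering is only partially preserved. This is why methods built explicitly around neighborhood preservation are preferred when the goal is visualization rather than feature extraction.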