Supervised Learning

6.2. Supervised Learning

Supervised learning is one of the central paradigms in machine learning. In this setting, we train a model using examples where the correct answers are already known. The model observes input data along with the corresponding outputs and learns the relationship between them.

You can think of supervised learning as learning from labeled examples. Each example tells the model not only what the input looks like, but also what the correct prediction should be.

What Is Supervised Learning?

In supervised learning, we work with three core components:

- Features (X): The input variables used for prediction. Examples: house size, location, age of property.
- Labels (y): The output or target variable we want to predict. Examples: house price, disease diagnosis.
- Goal: Learn a function \( f: X \rightarrow y \) that maps inputs to outputs as accurately as possible.

The term “supervised” refers to the presence of labeled training data. During training, the model is shown both the inputs and the correct outputs, allowing it to adjust its internal parameters to reduce prediction error.
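As a minimal sketch of what "adjusting internal parameters to reduce prediction error" can look like, the following example fits a one-feature linear model with gradient descent. The data values, the function names (`predict`, `mean_squared_error`, `train`), the learning rate, and the step count are all illustrative assumptions, not part of any particular library.

```python
# Hypothetical labeled examples: house size (in 100 m^2) -> price (in $100k).
X = [0.5, 1.0, 1.5, 2.0, 2.5]   # features
y = [1.6, 2.5, 3.6, 4.4, 5.5]   # labels (the known correct answers)


def predict(w, b, x):
    """Linear model with two internal parameters: weight w and bias b."""
    return w * x + b


def mean_squared_error(w, b, X, y):
    """Average squared difference between predictions and labels."""
    return sum((predict(w, b, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(X)


def train(X, y, learning_rate=0.05, steps=2000):
    """Repeatedly nudge w and b in the direction that lowers the error."""
    w, b = 0.0, 0.0
    n = len(X)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(w, b, xi) - yi) * xi for xi, yi in zip(X, y)) / n
        grad_b = sum(2 * (predict(w, b, xi) - yi) for xi, yi in zip(X, y)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


error_before = mean_squared_error(0.0, 0.0, X, y)
w, b = train(X, y)
error_after = mean_squared_error(w, b, X, y)
print(f"w={w:.2f}, b={b:.2f}, error before={error_before:.2f}, after={error_after:.4f}")
```

The labels are what make this "supervised": every update is driven by the gap between the model's prediction and the known correct output.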

Two Main Types of Supervised Learning

Supervised learning problems fall into two primary categories, depending on the type of target variable.

1. Regression: Predicting Numbers

Regression is used when the target variable is continuous, meaning it can take any numerical value within a range, such as a price or a temperature.

Examples include:
- Predicting house prices
- Forecasting stock prices
- Estimating customer lifetime value
- Predicting temperature

Example algorithms:
- Linear Regression
- Ridge
- Lasso
- Decision Tree Regressor
- Random Forest Regressor

In regression, the model’s output is a real number.
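
To make the idea of a real-valued output concrete, here is a small sketch of simple linear regression using the closed-form least-squares formulas. The function name `fit_simple_linear_regression` and the sunshine/temperature numbers are hypothetical and chosen only for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours of sunshine -> afternoon temperature in degrees Celsius.
hours = [2.0, 4.0, 6.0, 8.0, 10.0]
temps = [15.0, 18.0, 21.0, 24.0, 27.0]

slope, intercept = fit_simple_linear_regression(hours, temps)

# The prediction for an unseen input is a real number, not a category.
predicted_temp = slope * 7.0 + intercept
print(predicted_temp)  # 22.5
```

Because this toy data lies exactly on a line, the fit recovers it exactly; real data would leave some residual error.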

2. Classification: Predicting Categories

Classification is used when the target variable is discrete, meaning it represents categories or classes.

Examples include:
- Email spam detection (spam or not spam)
- Disease diagnosis (healthy or sick)
- Customer churn prediction (will churn or will not churn)
- Image classification (cat, dog, bird)

Example algorithms:
- Logistic Regression
- K-Nearest Neighbors
- Decision Trees
- Naive Bayes
- Support Vector Machines
- Random Forest

In classification, the model predicts which class an input belongs to.
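
As an illustration, the sketch below implements a bare-bones K-Nearest Neighbors classifier using only the Python standard library. The function name `knn_predict`, the two email features (number of links, number of exclamation marks), and the data points are hypothetical assumptions made for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k closest examples wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labeled emails: (number of links, number of exclamation marks).
train_points = [(0, 0), (1, 0), (0, 1), (1, 1), (6, 5), (7, 4), (5, 6), (8, 7)]
train_labels = ["not spam"] * 4 + ["spam"] * 4

print(knn_predict(train_points, train_labels, (6, 6)))  # spam
print(knn_predict(train_points, train_labels, (1, 2)))  # not spam
```

Unlike the regression case, the output here is one of a fixed set of class labels rather than a number on a continuous scale.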

Supervised learning forms the foundation for many practical machine learning systems. Whether predicting numbers or assigning categories, the core idea remains the same: learn a mapping from inputs to outputs using labeled examples.