6.1.1. What Does It Mean to Model?#

Here, let’s explore what a model actually is and why models are central to machine learning. You might have heard terms like “machine learning model” or “predictive model” without fully understanding what they mean. Let’s build that intuition from the ground up.

6.1.1.1. Why We Need Models#

Imagine you’re running an online store and want to predict which products a customer might buy based on their browsing history. Or perhaps you’re a doctor trying to diagnose a disease based on symptoms and test results. In both cases, you’re trying to find patterns in data to make predictions or decisions.

Traditionally, we might approach such problems with hand-written rules. Consider estimating a house price:

  1. Start with a base price of $100,000

  2. Add $100 for every square foot

  3. Add $15,000 for each bedroom

  4. If the house is downtown, multiply the price by 1.5

  5. If the house is in the suburbs, multiply the price by 1.2

Example: for a 1,500-square-foot, 3-bedroom house downtown:

  • Base: $100,000

  • Size: 1,500 × $100 = $150,000

  • Bedrooms: 3 × $15,000 = $45,000

  • Subtotal: $295,000

  • Downtown multiplier: $295,000 × 1.5 = $442,500
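The rules above translate directly into a short function (a sketch; the function name is ours):

```python
def rule_based_price(square_feet, bedrooms, location):
    """Estimate a house price with hand-written rules."""
    price = 100_000              # 1. base price
    price += 100 * square_feet   # 2. $100 per square foot
    price += 15_000 * bedrooms   # 3. $15,000 per bedroom
    if location == "downtown":   # 4. downtown multiplier
        price *= 1.5
    elif location == "suburbs":  # 5. suburban multiplier
        price *= 1.2
    return price

print(rule_based_price(1500, 3, "downtown"))  # 442500.0
```

Every number and branch here had to be chosen by a person, which is exactly the limitation discussed next.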

This approach has serious limitations:

  1. Hard to capture complex patterns - Real relationships are rarely this simple

  2. Requires domain expertise - You need to know all the rules upfront

  3. Doesn’t adapt - Rules don’t improve with more data

  4. Misses hidden patterns - Human experts can’t see every pattern in the data

This is where machine learning models come in. Instead of writing explicit rules, we let the model learn patterns from data.

6.1.1.2. Models as Functions: The Core Concept#

At its heart, a model is simply a mathematical function that maps inputs to outputs:

\[f(x) = y\]

Where:

  • x represents your input features (square feet, bedrooms, location)

  • y represents your target output (house price)

  • f is the model that learns the mapping

The key difference from traditional programming:

  • Traditional programming: You design the function f explicitly

  • Machine learning: The algorithm discovers f by learning from examples

Let’s see a concrete example. Don’t worry about understanding all the code details yet; we’re just demonstrating how a model works.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Generate synthetic data
np.random.seed(42)
square_feet = np.array([800, 1000, 1200, 1500, 1800, 2000, 2200, 2500, 2800, 3000])
prices = 50000 + 150 * square_feet + np.random.normal(0, 20000, size=10)

X = square_feet.reshape(-1, 1)
y = prices

# Train the model
model = LinearRegression()
model.fit(X, y)

# Make predictions for houses the model has never seen
new_house_sqft = np.array([[1600], [2400]])
predictions = model.predict(new_house_sqft)

# Report what the model learned
print(f"Learned function: price = {model.intercept_:.0f} + {model.coef_[0]:.0f} × square_feet")
print(f"Predicted prices: {predictions.round(0)}")

The model learned the function (coefficients rounded): price ≈ 61,050 + 149 × square_feet

For a 1600 sq ft house, it predicts $299,272, and for a 2400 sq ft house, $418,384.


plt.figure(figsize=(10, 6))
plt.scatter(square_feet, prices, color='blue', s=100, alpha=0.6, label='Training Data')
plt.plot(square_feet, model.predict(X), color='red', linewidth=2, label='Learned Model')
plt.scatter(new_house_sqft, predictions, color='green', s=100, marker='x', linewidths=3, label='Predictions')

plt.xlabel('Square Feet', fontsize=12)
plt.ylabel('Price ($)', fontsize=12)
plt.title('Model as a Function: Mapping Square Feet to Price', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

What happened:

  1. We gave the model examples (square feet and prices)

  2. The model discovered the pattern automatically

  3. Now it can predict prices for houses it’s never seen

This is learning from data.

6.1.1.3. The Machine Learning Workflow#

Every machine learning project follows this general workflow:

1. Collect Data#

You need examples showing both inputs (features) and outputs (target values for supervised learning).

2. Train the Model#

The model analyzes the data to discover patterns. This process is called training or fitting.

model.fit(X_train, y_train)  # Model learns patterns

3. Make Predictions#

Once trained, the model can make predictions on new, unseen data. This is called inference or prediction.

predictions = model.predict(X_new)  # Model applies learned patterns

4. Evaluate Performance#

Check how well the model performs on data it hasn’t seen before.

score = model.score(X_test, y_test)  # R² on held-out data

Let’s see this complete workflow:

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, r2_score

# Generate synthetic data
np.random.seed(42)
n_samples = 100
square_feet = np.random.uniform(800, 3500, n_samples)
prices = 50000 + 150 * square_feet + np.random.normal(0, 30000, size=n_samples)

X = square_feet.reshape(-1, 1)
y = prices

# Split data into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate model on held-out data
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Absolute Error: ${mae:,.0f}")
print(f"R² Score: {r2:.3f}")

Model Performance:

  • Mean Absolute Error: $17,740

  • R² Score: 0.961

This means predictions are off by about $17,740 on average.

6.1.1.4. Parameters vs. Hyperparameters#

When working with models, you’ll encounter two types of values:

Parameters#

Values that the model learns from the data during training.

In our linear model, the parameters are:

  • Intercept: The baseline price

  • Coefficient: How much price increases per square foot

These are discovered automatically by the learning algorithm, giving us:

  • Intercept: $57,855

  • Coefficient: $145.54 per sq ft
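In scikit-learn, these learned parameters are exposed as attributes of the fitted model (a minimal sketch on freshly generated synthetic data, so the exact numbers will differ slightly from the ones above):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic housing data, as in the earlier examples
np.random.seed(42)
square_feet = np.random.uniform(800, 3500, 100)
prices = 50000 + 150 * square_feet + np.random.normal(0, 30000, size=100)

model = LinearRegression().fit(square_feet.reshape(-1, 1), prices)

# Parameters discovered during training
print(f"Intercept:   ${model.intercept_:,.0f}")
print(f"Coefficient: ${model.coef_[0]:.2f} per sq ft")
```

Note that we never set these values anywhere; `fit` estimated them from the examples.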

Hyperparameters#

These are values that you set before training to control how the model learns.

Examples:

  • The type of model (linear, tree, neural network)

  • Complexity settings (tree depth, number of layers)

  • Learning rate (how fast the model adjusts)

# Hyperparameters (you choose these)
from sklearn.tree import DecisionTreeRegressor

model = DecisionTreeRegressor(
    max_depth=5,           # Hyperparameter: How deep the tree can grow
    min_samples_split=10   # Hyperparameter: Minimum samples to split
)

Key distinction:

  • Parameters: The model learns these (weights, coefficients)

  • Hyperparameters: You set these (model architecture, training settings)

6.1.1.5. Types of Models#

Models can be broadly categorized by how they represent patterns:

Parametric Models#

Have a fixed form with a specific number of parameters.

Examples:

  • Linear Regression: y = mx + b (2 parameters: m, b)

  • Logistic Regression

  • Neural Networks (fixed architecture)

Advantages:

  • Fast to train

  • Easy to interpret

  • Work well with less data

Disadvantages:

  • Strong assumptions about data shape

  • Limited flexibility

from sklearn.linear_model import LinearRegression

# Train model
model_parametric = LinearRegression()
model_parametric.fit(X_train, y_train)

This model will always have the form y = mx + b: no matter how much data we give it, it’s constrained to a line. For our example, the learned model is y ≈ 145.5x + 57,855, with just 2 parameters (slope and intercept).

Non-Parametric Models#

Can grow in complexity with the amount of data.

Examples:

  • k-Nearest Neighbors

  • Decision Trees

  • Kernel SVMs

Advantages:

  • High flexibility

  • Few assumptions

  • Can capture complex patterns

Disadvantages:

  • Need more data

  • Can be slower

  • Risk of overfitting
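A quick sketch with k-Nearest Neighbors illustrates the non-parametric idea: the “model” is essentially the stored training data, and a prediction is the average of the k closest examples (the data here is synthetic; `n_neighbors` is a hyperparameter):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

np.random.seed(42)
X = np.random.uniform(800, 3500, 100).reshape(-1, 1)
y = 50000 + 150 * X.ravel() + np.random.normal(0, 30000, size=100)

# No fixed equation is learned; the training points themselves are kept
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)
print(knn.predict([[1600]]))  # average price of the 5 nearest houses
```

Because the model keeps the training points, its complexity grows with the dataset, unlike the two-parameter line above.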

6.1.1.6. Training vs. Inference#

It’s important to understand these two distinct phases:

Training (Learning Phase)#

  • The model analyzes examples

  • Adjusts internal parameters

  • Discovers patterns

  • Computationally expensive - can take hours or days

# Training: Model learns (slow)
model.fit(X_train, y_train)  # Could take minutes to hours

Inference (Prediction Phase)#

  • The model applies learned patterns

  • Makes predictions on new data

  • Parameters are frozen - no learning

  • Fast - milliseconds to seconds

# Inference: Model predicts (fast)
predictions = model.predict(X_new)  # Typically milliseconds

This distinction is crucial:

  • You train once (or periodically)

  • You predict many times (potentially millions of times)
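The asymmetry shows up directly in code: one call to `fit`, then as many calls to `predict` as you like, with no further learning in between (a small sketch; on a toy linear model both phases are fast, but the one-fit, many-predicts pattern is the point):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

np.random.seed(42)
X = np.random.uniform(800, 3500, 1000).reshape(-1, 1)
y = 50000 + 150 * X.ravel() + np.random.normal(0, 30000, size=1000)

model = LinearRegression().fit(X, y)   # training: happens once

# Inference: repeated, with the parameters frozen
for sqft in (1200, 1800, 2400):
    price = model.predict([[sqft]])[0]
    print(f"{sqft} sq ft -> ${price:,.0f}")
```

In production, the fitted model is typically saved to disk after training and loaded wherever predictions are needed.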

6.1.1.7. When to Use Models#

Models are powerful tools, but they are not always the right solution. Use models when:

  1. Patterns are complex – The relationships in the data are too intricate for fixed rules.

  2. Sufficient data is available – You have enough labeled or historical examples to learn from.

  3. Patterns evolve over time – The system can be retrained as new data arrives.

  4. Approximate predictions are acceptable – Some level of error is tolerable.

Avoid using models when:

  • A simple, deterministic rule can solve the problem reliably.

  • Data is scarce or low quality.

  • The system requires near-perfect accuracy, such as in safety-critical settings.

  • The cost of incorrect predictions is prohibitively high.