What Does It Mean to Model?

6.1.1. What Does It Mean to Model?#

Here, let’s explore what a model actually is and why models are central to machine learning. You might have heard terms like “machine learning model” or “predictive model” without fully understanding what they mean. Let’s build that intuition from the ground up.

6.1.1.1. Why We Need Models#

Imagine you’re running an online store and want to predict which products a customer might buy based on their browsing history. Or perhaps you’re a doctor trying to diagnose a disease based on symptoms and test results. In both cases, you’re trying to find patterns in data to make predictions or decisions.

Traditionally, we might approach these problems using explicit rules:

Start with a base price of $100,000
Add $100 for every square foot
Add $15,000 for each bedroom
If the house is downtown, multiply the price by 1.5
If the house is in the suburbs, multiply the price by 1.2

Example: For a 1,500 square foot, 3-bedroom house in downtown:

Base: $100,000
Size: 1,500 × $100 = $150,000
Bedrooms: 3 × $15,000 = $45,000
Subtotal: $295,000
Downtown multiplier: $295,000 × 1.5 = $442,500

This approach has serious limitations:

Hard to capture complex patterns - Real relationships are rarely this simple
Requires domain expertise - You need to know all the rules upfront
Doesn’t adapt - Rules don’t improve with more data
Misses hidden patterns - Human experts can’t see every pattern in the data

This is where machine learning models come in. Instead of writing explicit rules, we let the model learn patterns from data.

6.1.1.2. Models as Functions: The Core Concept#

At its heart, a model is simply a mathematical function that maps inputs to outputs:

\[f(x) = y\]

Where:

x represents your input features (square feet, bedrooms, location)
y represents your target output (house price)
f is the model that learns the mapping

The key difference from traditional programming:

Traditional programming: You design the function f explicitly
Machine learning: The algorithm discovers f by learning from examples

Let’s see a concrete example. Don’t worry about understanding all the code details yet, we’re just demonstrating how a model works.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Generate synthetic data
np.random.seed(42)
square_feet = np.array([800, 1000, 1200, 1500, 1800, 2000, 2200, 2500, 2800, 3000])
prices = 50000 + 150 * square_feet + np.random.normal(0, 20000, size=10)

X = square_feet.reshape(-1, 1)
y = prices

# Train the model
model = LinearRegression()
model.fit(X, y)

# Make predictions
new_house_sqft = np.array([[1600], [2400]])
predictions = model.predict(new_house_sqft)

The model learned the function: price = 61050 + 149 × square_feet

For a 1600 sq ft house, it predicts $299,272, and for a 2400 sq ft house, $418,384.

../../../_images/aa5f1a51531cdf854c88ea081e14d81bc7b5e6956254d8f1afcb0244b6569731.png

What happened:

We gave the model examples (square feet and prices)
The model discovered the pattern automatically
Now it can predict prices for houses it’s never seen

This is learning from data.

6.1.1.3. The Machine Learning Workflow#

Every machine learning project follows this general workflow:

1. Collect Data#

You need examples showing both inputs (features) and outputs (target values for supervised learning).

2. Train the Model#

The model analyzes the data to discover patterns. This process is called training or fitting.

model.fit(X_train, y_train)  # Model learns patterns

3. Make Predictions#

Once trained, the model can make predictions on new, unseen data. This is called inference or prediction.

predictions = model.predict(X_new)  # Model applies learned patterns

4. Evaluate Performance#

Check how well the model performs on data it hasn’t seen before.

score = model.score(X_test, y_test)  # Measure accuracy

Let’s see this complete workflow:

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, r2_score

# Generate synthetic data
np.random.seed(42)
n_samples = 100
square_feet = np.random.uniform(800, 3500, n_samples)
prices = 50000 + 150 * square_feet + np.random.normal(0, 30000, size=n_samples)

X = square_feet.reshape(-1, 1)
y = prices

# Split data into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate model
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

Model Performance:

Mean Absolute Error: $ 17,740
R² Score: 0.961

This means predictions are off by $ 17,740 on average.

6.1.1.4. Parameters vs. Hyperparameters#

When working with models, you’ll encounter two types of values:

Parameters#

Values that the model learns from the data during training.

In our linear model, the parameters are:

Intercept: The baseline price
Coefficient: How much price increases per square foot

These are discovered automatically by the learning algorithm, giving us:

Intercept: $57,855
Coefficient: $145.54 per sq ft

Hyperparameters#

These are values that you set before training to control how the model learns.

Examples:

The type of model (linear, tree, neural network)
Complexity settings (tree depth, number of layers)
Learning rate (how fast the model adjusts)

# Hyperparameters (you choose these)
from sklearn.tree import DecisionTreeRegressor

model = DecisionTreeRegressor(
    max_depth=5,           # Hyperparameter: How deep the tree can grow
    min_samples_split=10   # Hyperparameter: Minimum samples to split
)

Key distinction:

Parameters: The model learns these (weights, coefficients)
Hyperparameters: You set these (model architecture, training settings)

6.1.1.5. Types of Models#

Models can be broadly categorized by how they represent patterns:

Parametric Models#

Have a fixed form with a specific number of parameters.

Examples:

Linear Regression: y = mx + b (2 parameters: m, b)
Logistic Regression
Neural Networks (fixed architecture)

Advantages:

Fast to train
Easy to interpret
Work well with less data

Disadvantages:

Strong assumptions about data shape
Limited flexibility

from sklearn.linear_model import LinearRegression

# Train model
model_parametric = LinearRegression()
model_parametric.fit(X_train, y_train)

LinearRegression()

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

This model will always have the form: y = mx + b. No matter how much data we give it, it’s constrained to a line. For our example, the model is: 145.5x + 57855.0 with just 2 parameters (slope and intercept).

Non-Parametric Models#

Can grow in complexity with the amount of data.