CIS 6930 Spring 26

Data Engineering at the University of Florida

Assignment 3: Fairness Audit

Due: Monday, March 30, 2026 at 11:59 PM
Points: 60 (50 implementation + 10 reflection)
Submission: GitHub repository + Canvas link


Overview

In this assignment, you will conduct a fairness audit of an AI system. You will train a RandomForestClassifier on a real dataset, compute fairness metrics, analyze where bias enters the pipeline, and write a reflection connecting your findings to the social implications of algorithmic bias.

This assignment combines the technical metrics from Day 23 with the social analysis from Day 24.


Learning Objectives

By completing this assignment, you will:

  1. Compute and interpret multiple fairness metrics on a real dataset
  2. Identify sources of bias in an ML pipeline
  3. Apply bias mitigation techniques and measure their effects
  4. Articulate the trade-offs between competing fairness definitions
  5. Connect technical metrics to social consequences

Setup

1. Create Your Repository from the Template

  1. Go to the starter template: cegme/cis6930sp26-assignment3-starter
  2. Click “Use this template” → “Create a new repository”
  3. Name your repository cis6930sp26-assignment3
  4. Set it to Private
  5. Clone your new repository:
git clone https://github.com/YOUR_USERNAME/cis6930sp26-assignment3.git
cd cis6930sp26-assignment3

# Install dependencies
uv sync

2. Verify Setup

Run the tests to confirm everything is installed. All tests should fail with NotImplementedError:

uv run pytest

3. The Dataset

You will work with the Adult Income Dataset (UCI ML Repository), which predicts whether an individual earns more than $50K/year. This dataset has known biases related to gender and race.

The starter code (data.py) loads and preprocesses the dataset for you.

Features include: age, education, occupation, hours-per-week, marital-status, etc.
Protected attributes: sex, race
Target: income (>50K or <=50K)


What Is Provided

The following files are already implemented for you: data.py (loads and preprocesses the Adult dataset), mitigate.py (bias mitigation), and main.py (runs all steps).

You can also use fairlearn metrics in your code if you find them helpful, but you must implement the 6 metrics yourself in fairness_metrics.py.


Steps to Complete the Assignment

Follow these steps in order. Each step builds on the previous one.

Step 1: Implement fairness metrics (20 points)

File: assignment3/fairness_metrics.py

Start here because these functions have no dependencies on other modules and the tests are the most straightforward.

Implement all 6 metrics:

  1. statistical_parity_difference — Difference in positive prediction rates between groups
  2. disparate_impact_ratio — Ratio of positive prediction rates between groups
  3. equal_opportunity_difference — Difference in true positive rates between groups
  4. average_odds_difference — Average of TPR and FPR differences between groups
  5. predictive_parity_difference — Difference in precision between groups
  6. theil_index — Measures inequality in correct predictions across individuals

Use the _split_by_group helper (provided) to separate arrays into privileged and unprivileged groups.
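For orientation, the first two metrics (plus the Theil index, using one common definition, the generalized-entropy form with benefits b_i = ŷ_i − y_i + 1, as used e.g. by AIF360) can be sketched with plain numpy boolean masks. This sketch bypasses the provided _split_by_group helper and assumes binary 0/1 prediction arrays and a string-labeled protected attribute — adapt it to the starter's actual signatures:

```python
import numpy as np

def statistical_parity_difference(y_pred, protected, privileged):
    """P(pred=1 | unprivileged) - P(pred=1 | privileged); 0 is fairest."""
    y_pred, protected = np.asarray(y_pred), np.asarray(protected)
    return y_pred[protected != privileged].mean() - y_pred[protected == privileged].mean()

def disparate_impact_ratio(y_pred, protected, privileged):
    """Unprivileged positive rate / privileged positive rate; 1 is fairest."""
    y_pred, protected = np.asarray(y_pred), np.asarray(protected)
    priv_rate = y_pred[protected == privileged].mean()
    unpriv_rate = y_pred[protected != privileged].mean()
    return unpriv_rate / priv_rate if priv_rate > 0 else np.inf  # guard division by zero

def theil_index(y_true, y_pred):
    """Generalized entropy (alpha=1) over benefits b_i = y_pred_i - y_true_i + 1."""
    b = np.asarray(y_pred) - np.asarray(y_true) + 1.0
    ratio = b / b.mean()
    # 0 * log(0) is taken as 0 in the entropy sum
    safe = np.where(ratio > 0, ratio, 1.0)
    return float(np.mean(np.where(ratio > 0, ratio * np.log(safe), 0.0)))
```

A perfectly accurate classifier gives every individual the same benefit (b_i = 1), so its Theil index is 0; any mix of false positives and false negatives pushes it above 0.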

Verify: uv run pytest tests/test_fairness_metrics.py

Step 2: Build the audit (10 points)

File: assignment3/audit.py

Implement run_audit(y_true, y_pred, protected_attributes) which:

  1. For each protected attribute (e.g., sex, race), computes all 5 group fairness metrics using your functions from Step 1
  2. Computes the Theil index once (it is not group-specific)
  3. Checks each metric against FAIR_THRESHOLDS using the provided is_fair() helper
  4. Returns a nested dictionary (see the docstring for the exact structure)

The return value must follow this structure:

{
    "sex": {
        "statistical_parity_difference": {"value": -0.19, "fair": False},
        "disparate_impact_ratio": {"value": 0.27, "fair": False},
        # ... other metrics
    },
    "race": {
        # ... same metrics for race
    },
    "theil_index": {"value": 0.12, "fair": False},
}
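One way to assemble that structure — a sketch only, with a single metric inlined, a stand-in FAIR_THRESHOLDS table, and a hypothetical is_fair(name, value) signature; the starter's actual thresholds, helper signature, and privileged-group encoding may differ:

```python
import numpy as np

# Stand-in thresholds for this sketch; use the provided FAIR_THRESHOLDS in your code.
FAIR_THRESHOLDS = {"statistical_parity_difference": 0.1}

def is_fair(metric_name, value):
    """Hypothetical helper: a difference metric is fair when close to 0."""
    return abs(value) <= FAIR_THRESHOLDS[metric_name]

def statistical_parity_difference(y_pred, groups, privileged):
    y_pred, groups = np.asarray(y_pred), np.asarray(groups)
    return y_pred[groups != privileged].mean() - y_pred[groups == privileged].mean()

def run_audit(y_true, y_pred, protected_attributes, privileged):
    # protected_attributes: {attr_name: per-row group labels}; privileged: {attr_name: label}
    report = {}
    for attr, groups in protected_attributes.items():
        value = float(statistical_parity_difference(y_pred, groups, privileged[attr]))
        report[attr] = {
            "statistical_parity_difference": {
                "value": value,
                "fair": is_fair("statistical_parity_difference", value),
            },
            # ... repeat the same pattern for the other four group metrics
        }
    # theil_index is computed once, outside the per-attribute loop (omitted here)
    return report
```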

Verify: uv run pytest tests/test_audit.py

Try it: uv run python -m assignment3.audit

Step 3: Run the mitigation comparison (5 points)

The mitigation code is provided in mitigate.py. Run it and record the before/after metrics:

uv run python -m assignment3.mitigate

Compare the baseline audit results with the mitigated results. Fill in the mitigation table in your README.md.
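The mitigation itself is provided, but for intuition about what a post-processing method such as threshold adjustment does, here is a minimal sketch (the function name and per-group threshold dict are illustrative, not the starter's API):

```python
import numpy as np

def apply_group_thresholds(scores, groups, thresholds):
    """Post-processing sketch: a different decision threshold per group.

    scores: predicted probabilities; groups: per-row group label;
    thresholds: {group_label: cutoff}, chosen to equalize a target metric.
    """
    scores, groups = np.asarray(scores, dtype=float), np.asarray(groups)
    y_pred = np.zeros(len(scores), dtype=int)
    for label, cutoff in thresholds.items():
        mask = groups == label
        y_pred[mask] = (scores[mask] >= cutoff).astype(int)
    return y_pred
```

Lowering the cutoff for the disadvantaged group raises its positive rate (improving statistical parity) at the cost of changing its precision — exactly the kind of trade-off your before/after table should surface.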

Step 4: Run the full pipeline (5 points)

Confirm all tests pass and the full pipeline runs:

uv run pytest                          # All tests pass
uv run python -m assignment3.main      # Full pipeline runs

Step 5: Fill in your results (0 points, but required)

Update README.md with your audit results tables (baseline and mitigated).

Step 6: Write the reflection (10 points)

File: REFLECTION.md (minimum 500 words total)

  1. Metric Conflicts (3 points): Show a specific example from your audit where improving one fairness metric worsened another. Explain why this happens using the impossibility theorem.

  2. Social Context (4 points): The Adult Income dataset was collected from the 1994 Census. Discuss how historical and structural factors (e.g., occupational segregation, educational access, wealth gaps) are encoded in this dataset. Reference at least one reading from class (Birhane, Mitchell, or Aspen Digital).

  3. Metric Selection (3 points): If this model were deployed for real loan decisions, which fairness metric would you prioritize and why? What harms does your chosen metric fail to capture? Who benefits and who bears the cost of your choice?


Running the System

# Run tests
uv run pytest

# Train baseline model and view accuracy
uv run python -m assignment3.model

# Run fairness audit
uv run python -m assignment3.audit

# Apply mitigation and compare
uv run python -m assignment3.mitigate

# Run all steps
uv run python -m assignment3.main

Grading

Implementation (50 points)

| Component | Points | Description |
|-----------|--------|-------------|
| Fairness metrics (fairness_metrics.py) | 20 | All 6 metrics implemented correctly |
| Audit report (audit.py) | 10 | Returns correct dictionary structure with metrics and fair flags |
| Mitigation comparison | 5 | Before/after results recorded in README |
| Tests pass | 5 | All provided tests pass |
| README results filled in | 10 | Audit tables and key findings completed |

Reflection (10 points)

| Component | Points | Description |
|-----------|--------|-------------|
| Metric conflicts | 3 | Concrete example with impossibility theorem explanation |
| Social context | 4 | Structural factors analysis with reading references |
| Metric selection | 3 | Justified choice with trade-off analysis |

**Total: 60 points**


README.md

Your README must include:

1. Setup Instructions

How to install dependencies and run the system.

2. Audit Results

Present your baseline audit results:

## Baseline Audit Results

### Protected Attribute: Sex

| Metric | Value | Fair? |
|--------|-------|-------|
| Statistical Parity Difference | -0.XX | Yes/No |
| Disparate Impact Ratio | 0.XX | Yes/No |
| Equal Opportunity Difference | -0.XX | Yes/No |
| Average Odds Difference | -0.XX | Yes/No |
| Predictive Parity Difference | 0.XX | Yes/No |
| Theil Index | 0.XX | Yes/No |

### Protected Attribute: Race
[Same table format]

3. Mitigation Results

Show before/after comparison:

## Mitigation Results

**Method used:** [Reweighing / Threshold Adjustment / Other]

| Metric | Before | After | Improved? |
|--------|--------|-------|-----------|
| Statistical Parity Difference | -0.XX | -0.XX | Yes/No |
| ...

4. Key Findings

Summarize your most important findings in 2-3 paragraphs.


COLLABORATORS.md

Document all collaboration and AI assistance (required).


Project Structure

cis6930sp26-assignment3/
├── assignment3/
│   ├── __init__.py
│   ├── data.py               # Provided: Loads Adult dataset
│   ├── model.py              # Trains RandomForestClassifier (baseline)
│   ├── fairness_metrics.py   # Step 1: Implement 6 fairness metrics
│   ├── audit.py              # Step 2: Build audit report
│   ├── mitigate.py           # Provided: Bias mitigation (Step 3)
│   └── main.py               # Provided: Runs all steps
├── tests/
│   ├── test_fairness_metrics.py
│   ├── test_model.py
│   ├── test_audit.py
│   └── test_mitigate.py
├── data/                     # Auto-downloaded by starter code
├── REFLECTION.md             # Step 6: Written reflection
├── COLLABORATORS.md
├── README.md                 # Step 5: Fill in results
├── .gitignore
└── pyproject.toml

Submission

  1. Create a private repository named cis6930sp26-assignment3
  2. Add cegme as an Admin collaborator
  3. Tag your final submission:
    git tag v1.0
    git push origin v1.0
    
  4. Submit the repository URL to Canvas

Tips

  1. Start with fairness_metrics.py (Step 1) — it has no dependencies and the tests are straightforward
  2. Use numpy boolean indexing — computing metrics per group is clean with y_pred[protected_attr == 'Male']
  3. Handle edge cases — division by zero when a group has no positive predictions
  4. Verify against AIF360 — optionally install aif360 to cross-check your metric implementations
  5. The reflection matters — it is worth 10 points and requires engagement with the course readings
  6. Run tests after each step — do not move to the next step until the current tests pass
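Tips 2 and 3 in miniature — a sketch, where "Male" is just an example label from the dataset:

```python
import numpy as np

def safe_positive_rate(y_pred, mask):
    """Positive prediction rate for rows selected by a boolean mask.

    Returns np.nan instead of crashing when the selected group is empty."""
    selected = np.asarray(y_pred)[np.asarray(mask)]
    return selected.mean() if selected.size > 0 else np.nan

y_pred = np.array([1, 0, 1, 0])
sex = np.array(["Male", "Male", "Female", "Female"])
male_rate = safe_positive_rate(y_pred, sex == "Male")    # boolean indexing per group
empty_rate = safe_positive_rate(y_pred, sex == "Other")  # empty group -> nan, not an error
```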

Resources


Academic Integrity

This is an individual assignment. You may discuss concepts with classmates, but all code and the reflection must be your own. Document all collaboration in COLLABORATORS.md.