CIS 6930 Spring 26

Data Engineering at the University of Florida

Assignment 1.5: MCP on HiPerGator (In-Class)

Date: Wednesday, February 11, 2026
Due: Before next class (Friday, February 13, 2026 at 8:30 AM)
Points: 12 (Infrastructure Assignment)
Submission: Canvas (GitHub repo URL + reflection)


Overview

In this hands-on activity, you will set up and run an MCP server on HiPerGator that processes data from a Hugging Face dataset. The activity is run as a live walkthrough during class: the instructor will demonstrate each step, share configuration details, and help troubleshoot issues in real time.

This exercise directly prepares you for Assignment 1: MCP Data Pipeline by ensuring you can run MCP servers on HiPerGator’s compute nodes.


Before Class

Complete these steps before class:

  1. Create your GitHub repository:
    • Create a new private repository named cis6930sp26-assignment1.5
    • Add cegme as an Admin collaborator on your repository
    • Initialize with a README
  2. Verify HiPerGator access: SSH into HiPerGator and confirm you can log in
    ssh YOUR_GATORLINK@hpg.rc.ufl.edu
    
  3. Clone your repository to your HiPerGator home directory:
    cd ~
    git clone https://github.com/YOUR_USERNAME/cis6930sp26-assignment1.5.git
    cd cis6930sp26-assignment1.5
    
  4. Request a Hugging Face token (if you don’t have one):

Part 1: Environment Setup on HiPerGator (10 min)

Step 1.1: Load Required Modules

The specific module versions are critical. We will demonstrate the exact commands during class.

# Module commands will be provided during class walkthrough
module load ____________
module load ____________
module load ____________

In-class note: The instructor will share the specific module versions that are tested and working on HiPerGator. These versions may differ from the default modules.

Step 1.2: Create Python Environment

# Navigate to project directory
cd ~/cis6930sp26-assignment1.5

# Create virtual environment using uv
# The exact uv installation method for HiPerGator will be demonstrated
____________
____________

Step 1.3: Configure Environment Variables

Create a .env file with your credentials:

HUGGINGFACE_TOKEN=____________

In-class note: Additional configuration details will be shared during the live walkthrough.
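
A minimal sketch of how server.py might check the token after load_dotenv() has populated the environment. Only the variable name HUGGINGFACE_TOKEN comes from the .env file above; the helper name require_hf_token is hypothetical, and the configuration shared in class takes precedence.

```python
import os


def require_hf_token(env=None):
    """Return the Hugging Face token, or raise a clear error if it is missing.

    `require_hf_token` is a hypothetical helper used only for illustration.
    """
    # Default to the process environment; a dict can be passed in for testing.
    env = os.environ if env is None else env
    token = env.get("HUGGINGFACE_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HUGGINGFACE_TOKEN is not set; add it to your .env file "
            "and keep .env out of version control."
        )
    return token
```

Failing early with a readable message is usually easier to debug than a rejected request later in the dataset download.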


Part 2: The Hugging Face Dataset (5 min)

We will use the dair-ai/emotion dataset from Hugging Face, which contains text samples labeled with emotions (sadness, joy, love, anger, fear, surprise).

This dataset is ideal for practicing MCP because:

Dataset Structure

Column   Type     Description
text     string   The text sample
label    int      Emotion label (0-5)

Labels mapping:
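
The mapping can be written as a small lookup table. This sketch assumes the integer labels follow the order in which the emotions are listed above (0 = sadness through 5 = surprise); confirm this against the dataset card before relying on it. The names EMOTION_NAMES, label_to_emotion, and emotion_to_label are illustrative.

```python
# Assumed label order: index in this list == integer label in the dataset.
EMOTION_NAMES = ["sadness", "joy", "love", "anger", "fear", "surprise"]


def label_to_emotion(label: int) -> str:
    """Convert an integer label (0-5) to its emotion name."""
    if not 0 <= label < len(EMOTION_NAMES):
        raise ValueError(f"label must be in 0-{len(EMOTION_NAMES) - 1}, got {label}")
    return EMOTION_NAMES[label]


def emotion_to_label(emotion: str) -> int:
    """Convert an emotion name to its integer label (raises ValueError if unknown)."""
    return EMOTION_NAMES.index(emotion.strip().lower())
```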


Part 3: Build the MCP Server (15 min)

Step 3.1: Review the Server Template

Open server.py and examine the structure:

from mcp.server.fastmcp import FastMCP
from datasets import load_dataset
from loguru import logger
import os

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

# Initialize MCP server
mcp = FastMCP("EmotionDataProcessor")

# Dataset is loaded lazily on the first call to get_dataset() and cached here
_dataset = None

def get_dataset():
    """Lazy-load the emotion dataset."""
    global _dataset
    if _dataset is None:
        logger.info("Loading emotion dataset from Hugging Face...")
        _dataset = load_dataset("dair-ai/emotion", split="train")
        logger.info(f"Loaded {len(_dataset)} samples")
    return _dataset


@mcp.tool()
def get_sample(n: int = 5) -> str:
    """Get n random samples from the emotion dataset.

    Args:
        n: Number of samples to retrieve (default: 5, max: 20)

    Returns:
        JSON string with samples including text and emotion label
    """
    # Implementation during class
    pass


@mcp.tool()
def count_by_emotion(emotion: str) -> str:
    """Count samples for a specific emotion.

    Args:
        emotion: One of 'sadness', 'joy', 'love', 'anger', 'fear', 'surprise'

    Returns:
        JSON string with count and percentage
    """
    # Implementation during class
    pass


@mcp.tool()
def search_text(query: str, limit: int = 10) -> str:
    """Search for samples containing specific text.

    Args:
        query: Text to search for (case-insensitive)
        limit: Maximum results to return (default: 10)

    Returns:
        JSON string with matching samples
    """
    # Implementation during class
    pass


@mcp.tool()
def analyze_emotion_distribution() -> str:
    """Get the distribution of emotions in the dataset.

    Returns:
        JSON string with counts and percentages for each emotion
    """
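    # Implementation during class
    pass


if __name__ == "__main__":
    # Start the server when the file is run directly (python server.py)
    mcp.run()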

Step 3.2: Implement the Tools (Live Coding)

During class, we will implement each tool together. The instructor will:

Your task: Follow along and implement the tools in your server.py file.
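
One possible shape for the tool logic, written as plain functions over a list of {"text", "label"} rows so it can be tried without loading the dataset. This is a sketch under stated assumptions, not the in-class solution: the function names, the tiny ROWS list, and the label order are all illustrative, and in server.py the rows would come from get_dataset().

```python
import json
import random
from collections import Counter

# Assumed label order (index == integer label); verify against the dataset card.
EMOTION_NAMES = ["sadness", "joy", "love", "anger", "fear", "surprise"]

# Hypothetical stand-in rows, not real dataset content.
ROWS = [
    {"text": "i feel so happy today", "label": 1},
    {"text": "i am feeling rather low", "label": 0},
    {"text": "i feel furious about the delay", "label": 3},
    {"text": "i am happy and grateful", "label": 1},
]


def sample_rows(rows, n=5, max_n=20, seed=None):
    """Return up to n random rows as JSON, clamping n to [1, max_n]."""
    n = min(max(1, n), max_n, len(rows))
    picked = random.Random(seed).sample(rows, n)
    return json.dumps(
        [{"text": r["text"], "emotion": EMOTION_NAMES[r["label"]]} for r in picked]
    )


def count_emotion(rows, emotion):
    """Return the count and percentage of rows with the given emotion as JSON."""
    emotion = emotion.strip().lower()
    if emotion not in EMOTION_NAMES:
        return json.dumps({"error": f"unknown emotion: {emotion}"})
    target = EMOTION_NAMES.index(emotion)
    count = sum(1 for r in rows if r["label"] == target)
    pct = round(100 * count / len(rows), 2) if rows else 0.0
    return json.dumps({"emotion": emotion, "count": count, "percentage": pct})


def search_rows(rows, query, limit=10):
    """Return up to `limit` rows whose text contains `query`, ignoring case."""
    needle = query.lower()
    hits = [r for r in rows if needle in r["text"].lower()]
    return json.dumps(hits[: max(0, limit)])


def emotion_distribution(rows):
    """Return count and percentage for every emotion as JSON."""
    counts = Counter(r["label"] for r in rows)
    total = len(rows)
    return json.dumps(
        {
            name: {
                "count": counts.get(i, 0),
                "percentage": round(100 * counts.get(i, 0) / total, 2) if total else 0.0,
            }
            for i, name in enumerate(EMOTION_NAMES)
        }
    )
```

Keeping the logic in plain functions means each @mcp.tool() wrapper can stay a one-line call, and the functions can be checked locally before requesting a compute node.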


Part 4: Run the MCP Server on HiPerGator (10 min)

Step 4.1: Request an Interactive Session

We will use SLURM to request compute resources:

# SLURM command with specific parameters for our class allocation
srun --account=____________ \
     --qos=____________ \
     --ntasks=1 \
     --cpus-per-task=2 \
     --mem=4gb \
     --time=00:30:00 \
     --pty bash -i

In-class note: The account and QOS values specific to our class allocation will be provided during the walkthrough.

Step 4.2: Start the MCP Server

# Activate environment and start server
source .venv/bin/activate
uv run python server.py

You should see output like:

2026-02-11 08:45:23.456 | INFO     | Loading emotion dataset from Hugging Face...
2026-02-11 08:45:28.123 | INFO     | Loaded 16000 samples
2026-02-11 08:45:28.124 | INFO     | MCP server 'EmotionDataProcessor' starting...

Step 4.3: Test with MCP Inspector

In a new terminal (keeping the server running), connect to your running job:

# SSH to HiPerGator
ssh YOUR_GATORLINK@hpg.rc.ufl.edu

# Find your job ID and connect to it
squeue -u $USER
srun --pty --overlap --jobid ____________ bash

# Navigate to project and run inspector
cd ~/cis6930sp26-assignment1.5
source .venv/bin/activate
uv run mcp dev server.py

Part 5: Capture Your Outputs (5 min)

For submission, you need to capture the following outputs from your MCP server:

Required Outputs

  1. Tool List Output: Run the list_tools command in MCP Inspector and capture the output showing all 4 tools.

  2. Sample Data Output: Call get_sample(n=3) and capture the returned samples.

  3. Emotion Distribution: Call analyze_emotion_distribution() and capture the distribution statistics.

  4. Custom Search: Call search_text(query="happy") and capture the results.

Capture Commands

# Save outputs to files for submission
# Commands will be demonstrated during class

Submission

Submit the following to Canvas before Friday, February 13, 2026 at 8:30 AM:

  1. GitHub Repository URL - Your cis6930sp26-assignment1.5 repository containing:
    • server.py with all 4 tools implemented
    • pyproject.toml with dependencies
    • .env.example (do NOT commit your actual .env file)
    • outputs.txt with your tool call outputs
  2. outputs.txt must contain:
    • Your GatorLink username
    • The compute node you were assigned (e.g., c0701a-s17)
    • Output from all 4 tool calls (copy-paste from terminal)
  3. Reflection (2-3 sentences) in Canvas submission:
    • What was one challenge you encountered?
    • How does this prepare you for Assignment 1?
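
A sketch of how the header lines of outputs.txt could be assembled from inside the SLURM job. It assumes the USER and SLURM_JOB_ID environment variables are set in the interactive session and treats the short hostname as the compute node name; the function name outputs_header and the extra job ID line are illustrative, and the capture commands demonstrated in class take precedence.

```python
import os
import socket


def outputs_header(username=None, node=None, job_id=None):
    """Build the identifying lines for the top of outputs.txt.

    Any argument left as None is looked up from the current session.
    """
    username = username or os.environ.get("USER", "unknown")
    # Keep only the short hostname, e.g. "c0701a-s17" rather than a full domain name.
    node = node or socket.gethostname().split(".")[0]
    job_id = job_id or os.environ.get("SLURM_JOB_ID", "unknown")
    return (
        f"GatorLink username: {username}\n"
        f"Compute node: {node}\n"
        f"SLURM job ID: {job_id}\n"
    )
```

Calling outputs_header() with no arguments on the compute node and writing the result to outputs.txt gives a starting point; the four tool outputs are then pasted below it.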

Commit and Push Before Submitting

git add server.py pyproject.toml .env.example outputs.txt
git commit -m "feat: complete MCP in-class activity"
git push origin main

Grading Rubric

Criterion                                  Points
MCP server runs without errors             3
All 4 tools implemented and functional     5
Outputs captured and submitted             2
Reflection demonstrates understanding      2
Total                                      12

Troubleshooting

Common Issues (Discussed During Class)

Issue                                         Solution
ModuleNotFoundError: No module named 'mcp'    ____
Connection refused on MCP Inspector           ____
CUDA out of memory                            ____
HuggingFace rate limit                        ____

In-class note: Solutions to these common issues will be demonstrated live. The specific fixes depend on HiPerGator’s current configuration.
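
A quick diagnostic can help narrow down the first issue in the table before the live fix is shown. This sketch only reports which packages the active interpreter can import; the package list is inferred from the imports in server.py, and missing_packages is a hypothetical helper, not the instructor's solution.

```python
import importlib.util
import sys

# Top-level import names used by server.py (python-dotenv is imported as "dotenv").
REQUIRED = ["mcp", "datasets", "loguru", "dotenv"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The interpreter path shows whether the .venv is the one actually in use.
    print(f"Interpreter: {sys.executable}")
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

If the interpreter path points outside the project's .venv, the environment was probably not activated in the current shell.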


After Class

If you completed this activity successfully, you are ready for Assignment 1. Consider:

  1. Extending your server with additional tools for your project dataset
  2. Reviewing the MCP documentation for advanced features
  3. Testing your server with an LLM client (NavigatorAI)

Resources


Academic Integrity

This is an individual in-class activity. You may:

You may not:


This activity is designed as a live walkthrough. Students who attend class will receive step-by-step guidance, while those who miss class will need to figure out the HiPerGator-specific configuration details independently.


Last updated: February 2026