CIS 6930 Spring 26

Data Engineering at the University of Florida

Course Schedule - Spring 2026

Key Dates


Assignments

Assignment             Due Date          Points  Description
HiPerGator Training    Jan 23, 8:30 AM   5       Complete UF RC training
Assignment 0           Jan 28, 11:59 PM  50      Python, GitHub, LLM basics
Assignment 1.5         Feb 11, In-Class  12      MCP on HiPerGator (in-class)
Assignment 1           Feb 18, 11:59 PM  100     MCP data pipeline
Quiz: ML Fundamentals  Feb 18, 8:30 AM   10      sklearn API, metrics
Assignment 2           Mar 10, 11:59 PM  60      RAG system implementation

Project Milestones

Repository: cis6930sp26-project (individual)

Milestone         Due Date          Points  Description
Project Proposal  Feb 25, 11:59 PM  100     Research question + initial design
Design Review     Mar 2, 11:59 PM   100     Detailed architecture
Code Checkpoint   Mar 23, 11:59 PM  150     Working prototype
Draft Paper       Mar 30, 11:59 PM  100     Complete draft
Final Paper       Apr 13, 11:59 PM  400     Full evaluation
Presentation      Week of Apr 20    150     10-min presentation

More assignments will be released as the semester progresses.


Weekly Schedule

Week 1: Course Introduction (Jan 12, 14, 16)

Day  Topic                    Activity
Mon  Course Introduction      Syllabus, expectations
Wed  Git/GitHub Crash Course  Live demo
Fri  Topics Overview          Data engineering + LLMs

Assigned: Assignment 0


Week 2: MCP & Data Extraction (Jan 21, 23)

Monday Jan 19 - MLK Day, no class

Day  Topic             Activity
Wed  MCP Fundamentals  Architecture, primitives
Fri  HiPerGator Setup  Matt Gitzendanner guest talk

Due: Assignment 0 (Jan 28), HiPerGator Training (Jan 23)

Assigned: Assignment 0.5 (Skipped)

Readings: MCP Specification, MCP Quickstart Guide (see lecture readings)



Week 3: Prompt Engineering (Jan 26, 28, 30)

Day  Topic                                Activity
Mon  Prompting Fundamentals               Zero-shot, few-shot, prompt structure
Wed  Chain-of-Thought                     CoT theory, Zero-shot CoT, Tree/Graph of Thoughts
Fri  Reading Papers + Structured Outputs  3-pass method, JSON mode

Due: Assignment 0 (Jan 28)

Assigned: Assignment 1, Assignment 2

Readings: Chain-of-Thought Prompting, How to Read a Paper (see lecture readings)

Tools: Navigator CLI - Command-line tool for querying NavigatorAI


Week 4: Data Integration (Feb 2, 4, 6)

Day  Topic                   Activity
Mon  Prompt Engineering Lab  Structured outputs
Wed  Data Integration        Schema mapping
Fri  Entity Resolution       Traditional methods

Assigned: Assignment 1 (due Feb 18)

Readings: LinkTransformer (Arora & Dell, ACL 2024) (see lecture readings)


Week 5: Data at Scale & Guest Speakers (Feb 9, 11, 13)

Day  Topic                              Activity
Mon  Guest Speaker: Dr. Shiree Hughes   Big Data at General Motors
Wed  Assignment 1.5: MCP on HiPerGator  In-class hands-on activity
Fri  Guest Speaker: Mikhail Sinanan     Data Engineering at Spotify

Due: Assignment 1.5 (Feb 11, In-Class)

Readings: Apache Kafka, Spark, Hadoop introductions; Spotify Data Platform (see lecture readings)

Lecture: Slides from Dr. Hughes are available here.


Week 6: ML Fundamentals (Feb 16, 18, 20)

Day  Topic                      Activity
Mon  ML Fundamentals            Classification, regression, clustering; sklearn API
Wed  Evaluation Metrics         ROC/AUC, cross-validation, hyperparameter tuning
Fri  Project Proposal Workshop  Example proposals, research question refinement

Due: Assignment 1 (Feb 18), Quiz: ML Fundamentals (Feb 18, 8:30 AM), Peer Reviews (Feb 21)

Assigned: Project Proposal (due Feb 25)



Week 7: RAG Systems (Feb 23, 25, 27)

Day  Topic                Activity
Mon  RAG Architecture     Components, retrieval methods, augmentation strategies
Wed  Vector Databases     Chroma, FAISS hands-on demo
Fri  Chunking Strategies  Design Review discussion

Due: Project Proposal (Feb 25), Design Review (Mar 2)

Assigned: Assignment 2 (RAG System, 60 pts, due Mar 10)

Readings: Lewis et al. (2020) RAG paper, COLING 2025 RAG Best Practices (see lecture readings)

Supplementary: Embeddings Primer - Background on text embeddings for students who need a refresher


More weeks will be revealed as the semester progresses.

