CIS 6930 Spring 26

Data Engineering at the University of Florida

Course Schedule - Spring 2026

Key Dates


Assignments

Assignment | Due Date | Points | Description
HiPerGator Training | Jan 23, 8:30 AM | 5 | Complete UF RC training
Assignment 0 | Jan 28, 11:59 PM | 50 | Python, GitHub, LLM basics
Assignment 1.5 | Feb 11, In-Class | 12 | MCP on HiPerGator (in-class)
Assignment 1 | Feb 18, 11:59 PM | 100 | MCP data pipeline
Quiz: ML Fundamentals | Feb 18, 8:30 AM | 10 | sklearn API, metrics
Assignment 2 | Mar 9, 11:59 PM | 60 | RAG system implementation

Project Milestones

Repository: cis6930sp26-project (individual)

Milestone | Due Date | Points | Description
Project Proposal | Feb 25, 11:59 PM | 100 | Research question + initial design
Code Checkpoint | Mar 23, 11:59 PM | 150 | Working prototype
Draft Paper | Mar 30, 11:59 PM | 100 | Complete draft
Final Paper | Apr 13, 11:59 PM | 400 | Full evaluation
Presentation | Week of Apr 20 | 150 | 10-min presentation

More assignments will be released as the semester progresses.


Weekly Schedule

Week 1: Course Introduction (Jan 12, 14, 16)

Day | Topic | Activity
Mon | Course Introduction | Syllabus, expectations
Wed | Git/GitHub Crash Course | Live demo
Fri | Topics Overview | Data engineering + LLMs

Assigned: Assignment 0


Week 2: MCP & Data Extraction (Jan 21, 23)

Monday Jan 19 - MLK Day, no class

Day | Topic | Activity
Wed | MCP Fundamentals | Architecture, primitives
Fri | HiPerGator Setup | Matt Gitzendanner guest talk

Due: Assignment 0 (Jan 28), HiPerGator Training (Jan 23)

Assigned: Assignment 0.5 (Skipped)

Readings: MCP Specification, MCP Quickstart Guide (see lecture readings)


Week 3: Prompt Engineering (Jan 26, 28, 30)

Day | Topic | Activity
Mon | Prompting Fundamentals | Zero-shot, few-shot, prompt structure
Wed | Chain-of-Thought | CoT theory, Zero-shot CoT, Tree/Graph of Thoughts
Fri | Reading Papers + Structured Outputs | 3-pass method, JSON mode

Due: Assignment 0 (Jan 28)

Assigned: Assignment 1, Assignment 2

Readings: Chain-of-Thought Prompting, How to Read a Paper (see lecture readings)

Tools: Navigator CLI - Command-line tool for querying NavigatorAI


Week 4: Data Integration (Feb 2, 4, 6)

Day | Topic | Activity
Mon | Prompt Engineering Lab | Structured outputs
Wed | Data Integration | Schema mapping
Fri | Entity Resolution | Traditional methods

Assigned: Assignment 1 (due Feb 18)

Readings: LinkTransformer (Arora & Dell, ACL 2024) (see lecture readings)


Week 5: Data at Scale & Guest Speakers (Feb 9, 11, 13)

Day | Topic | Activity
Mon | Guest Speaker: Dr. Shiree Hughes | Big Data at General Motors
Wed | Assignment 1.5: MCP on HiPerGator | In-class hands-on activity
Fri | Guest Speaker: Mikhail Sinanan | Data Engineering at Spotify

Due: Assignment 1.5 (Feb 11, In-Class)

Readings: Apache Kafka, Spark, Hadoop introductions; Spotify Data Platform (see lecture readings)

Lecture: Slides from Dr. Hughes are available here.


Week 6: ML Fundamentals (Feb 16, 18, 20)

Day | Topic | Activity
Mon | ML Fundamentals | Classification, regression, clustering; sklearn API
Wed | Evaluation Metrics | ROC/AUC, cross-validation, hyperparameter tuning
Fri | Project Proposal Workshop | Example proposals, research question refinement

Due: Assignment 1 (Feb 18), Quiz: ML Fundamentals (Feb 18, 8:30 AM), Peer Reviews (Feb 21)

Assigned: Project Proposal (due Feb 25)


Week 7: RAG Systems (Feb 23, 25, 27)

Day | Topic | Activity
Mon | RAG Architecture | Components, retrieval methods, augmentation strategies
Wed | Vector Databases | Chroma, FAISS hands-on demo
Fri | Chunking Strategies | Discussion

Due: Project Proposal (Feb 25)

Assigned: Assignment 2 (RAG System, 60 pts, due Mar 9)

Readings: Lewis et al. (2020) RAG paper, COLING 2025 RAG Best Practices (see lecture readings)

Supplementary: Embeddings Primer - Background on text embeddings for students who need a refresher


Week 8: LLM Evaluation (Mar 2, 4, 6)

Day | Topic | Activity
Mon | LLM Evaluation Fundamentals | Metrics, benchmarks, and non-determinism
Wed | Building Benchmarks | RAGAS + evaluation demo
Fri | Project Workshop | Pitch circle, pair debugging, office hours

Upcoming: Assignment 2 (Mar 9, 11:59 PM)

Readings: RAGAS (EACL 2024), In Benchmarks We Trust… Or Not? (see lecture readings)


Week 9: Project Work (Mar 9, 11, 13)

Day | Topic | Activity
Mon | RAGAS Demo + Q&A | Assignment 2 support
Wed | Project Work Session | Code checkpoint preparation
Fri | Project Work Session | Individual consultations

Due: Assignment 2 (Mar 9, 11:59 PM)

Upcoming: Code Checkpoint (Mar 23)


⏸️ Spring Break (March 14-21)

No classes. Enjoy your break!

Return: Monday, March 23


Week 10: Ethics & Human-AI (Mar 23, 25, 27)

Day | Topic | Activity
Mon | AI Fairness & Bias | Bias in data systems
Wed | Human-in-the-Loop Systems | Agent design patterns
Fri | Work Session | Code checkpoint feedback

Due: Code Checkpoint (Mar 23, 11:59 PM)

Readings: Bias and Fairness in LLMs Survey, Building Effective Agents (see lecture readings)


Week 11: Paper Writing Sprint (Mar 30, Apr 1, 3)

Day | Topic | Activity
Mon | Paper Writing Workshop | Structure, style, clarity
Wed | Work Session | Paper writing
Fri | Work Session | Paper drafting

Due: Draft Paper (Mar 30, 11:59 PM)

Readings: How to Write a Great Research Paper (see lecture readings)


Week 12: Crowdsourcing (Apr 6, 8, 10)

Day | Topic | Activity
Mon | Crowdsourcing & Annotation | Platforms, quality control
Wed | Annotation Design | Guidelines, interfaces
Fri | Work Session | Paper revisions

Readings: HumEval Workshop, Capturing Perspectives of Annotators (see lecture readings)


Week 13: Peer Review (Apr 13, 15, 17)

Day | Topic | Activity
Mon | Writing Good Reviews | Conference standards
Wed | Work Session | Paper finalization
Fri | Work Session | Peer reviews

Due: Final Paper (Apr 13, 11:59 PM)

Readings: ACL Reviewer Guidelines (see lecture readings)


Week 14: Presentations (Apr 20, 22)

Day | Topic | Activity
Mon | Course Wrap-up | Lessons learned
Wed | Presentations | Final presentations

Due: Presentation (Week of Apr 20)


Week 15: Finals (Apr 25 - May 1)

