Data Engineering at the University of Florida
Due: Monday, April 20, 2026 by 8:30 AM (slides + demo recording in repo)
Points: 150
Format: 5-7 minute group presentation with recorded demo
The final presentation showcases your project in a small-group setting. On Monday, you present to a group of 4-5 classmates working on similar topics. Your group votes on the best presentation. On Wednesday, group winners present to the whole class and the class votes on the overall best. Winners earn extra credit.
| Day | What Happens |
|---|---|
| Mon, Apr 20 | Small-group presentations (5 groups, simultaneous) |
| Mon, Apr 20 | Each group votes on the best presentation |
| Wed, Apr 22 | Group winners present to the whole class |
| Wed, Apr 22 | Class votes on overall best, extra credit awarded |
Groups are organized by project topic so that presenters and audience share enough context to ask good questions and give informed feedback.
| Student | Project |
|---|---|
| Atul Arun | Self-Healing Civic Data Pipelines with MCP-Orchestrated Inspection |
| Nikhitha Nagabhyru | Self-Healing ETL Pipelines via LLM Orchestration and RAG |
| Sai Meghana Barla | Drift-Aware Self-Healing ETL Framework |
| Vivek Chenganassery | Adaptive Context Compression for Large Log Datasets |
| Sai Teja Appani | Benchmarking LLM-Orchestrated vs Traditional ETL |
| Student | Project |
|---|---|
| Harris Barton | LLM-Orchestrated Data Integration from Heterogeneous Book APIs |
| Kevin Tran | LLM-Orchestrated Gainesville Open Data Integration Pipeline |
| Vatsal Shah | Cross-City Building Permit Integration with MCP-Based ETL |
| Vittal Chintamaneni | NavFusion: LLM-Based Route Optimization Using MCP |
| Siyuan Pan | Deterministic ETL Pipeline for NYC 311 and Restaurant Data |
| Student | Project |
|---|---|
| Ian Arnold | LLM vs Regex for Clinical Note Extraction |
| Sri Ashritha Appalchity | LLM vs Trained Classifier for Entity Resolution |
| Sanya Chaturvedi | Comparing Rule-Based and LLM-Orchestrated Pipelines |
| Zachary Zeng | Comparing Multi-LLM Agents: Decomposition vs Augmentation |
| Xiaomeng Xiong | Language Intervention Framework for Evaluating LLMs |
| Student | Project |
|---|---|
| Rukaiya Khan | F1 Race Strategy Intelligence Pipeline |
| Sanjeev Kamath | Travel Risk Assessment Stability Analysis |
| Palavalli Shyam | LLM-Orchestrated Data Pipeline for Job Market Extraction |
| Kanishka Dhaundiyal | LLM-Driven Session Prediction with RAG |
| Zachary Allen | RAG-Augmented Search for PubChem Database |
| Student | Project |
|---|---|
| Adnan Farid | LLM-Orchestrated Data Cleaning with MCP Tools |
| Juan Veliz | LLM-Augmented Security Triage Pipeline over GitHub Code |
| Shane Thomas | Agentic Knowledge Graphs for Lateral Movement Detection |
| Jiangwei Wang | Cost-Aware Hybrid LLM Pipeline for Municipal Permit Data |
| Section | Time | Content |
|---|---|---|
| Introduction | 1 min | Problem, motivation, research question |
| Approach | 1-2 min | System architecture and key design decisions |
| Demo Recording | 2-3 min | Pre-recorded pipeline walkthrough (see below) |
| Results | 1-2 min | Key findings with evidence |
| Takeaway | 30 sec | One sentence the audience should remember |
Introduction (1 minute)
Approach (1-2 minutes)
Demo Recording (2-3 minutes)
Results (1-2 minutes)
Takeaway (30 seconds)
Every presenter must include a screen recording of their pipeline in action, made ahead of time. You do not need to run code live. In the recording, step through the pipeline and show what each stage does.
Place the recording in your project repo:
cis6930sp26-project/
├── presentation/
│   ├── slides.pdf
│   ├── demo.mp4        ← screen recording
│   └── demo_notes.md   ← (optional) notes for your narration
└── ...
If the recording is too large for GitHub, upload it to YouTube (unlisted) or Google Drive and put the link in presentation/demo_notes.md.
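Before the deadline, it may help to check the layout locally. The sketch below is only an illustration, not an official grading script: the function name `check_presentation_dir` is hypothetical, and it assumes that any `http(s)` URL in `demo_notes.md` counts as the fallback link when `demo.mp4` is absent.

```python
from pathlib import Path
import re


def check_presentation_dir(repo_root):
    """Return a list of problems found; an empty list means the layout looks OK.

    Hypothetical helper: the file names come from the layout shown above,
    but the link rule (any http/https URL in demo_notes.md) is an assumption.
    """
    pres = Path(repo_root) / "presentation"
    problems = []

    # Slides are always required.
    if not (pres / "slides.pdf").is_file():
        problems.append("missing presentation/slides.pdf")

    # The demo is either a local video file or a link recorded in the notes.
    has_video = (pres / "demo.mp4").is_file()
    notes = pres / "demo_notes.md"
    has_link = notes.is_file() and re.search(
        r"https?://\S+", notes.read_text(encoding="utf-8")
    ) is not None
    if not (has_video or has_link):
        problems.append(
            "missing presentation/demo.mp4 (or a link in presentation/demo_notes.md)"
        )

    return problems
```

Calling `check_presentation_dir(".")` from the repo root returns an empty list when both deliverables are in place. The check only looks at the working tree, so it does not confirm that the files were committed or pushed.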
| Time | Activity |
|---|---|
| 8:30-8:35 | Instructions, move to group stations |
| 8:35-9:10 | Group presentations (5 students x ~6 min) |
| 9:10-9:15 | Vote for best presentation in your group |
| 9:15-9:20 | Results, announce Wednesday schedule |
Presentation order within each group: alphabetical by last name.
Each group will occupy a section of the room. Present from your laptop screen. Group-mates gather around to watch. A 7-minute hard cap applies; I will circulate as timekeeper.
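To see roughly where you fall in the order and why the 7-minute cap matters, here is a minimal sketch. It assumes the last whitespace-separated token of a name is the last name, which may not match how the roster splits names, so treat the output as an estimate. The helper names are hypothetical; the names, times, and cap are taken from the tables above.

```python
from datetime import datetime


def presentation_order(names):
    """Sort names alphabetically by last name.

    Assumption: the last whitespace-separated token is the last name.
    The full name is used as a tie-breaker.
    """
    return sorted(names, key=lambda n: (n.split()[-1].casefold(), n.casefold()))


def slot_minutes(start, end):
    """Length of a same-day time slot in whole minutes, e.g. '8:35' to '9:10'."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def fits_in_slot(n_presenters, cap_minutes=7, start="8:35", end="9:10"):
    """True if every presenter can use the full cap without overrunning the slot."""
    return n_presenters * cap_minutes <= slot_minutes(start, end)


# Names from the first group table on this page.
group = [
    "Atul Arun",
    "Nikhitha Nagabhyru",
    "Sai Meghana Barla",
    "Vivek Chenganassery",
    "Sai Teja Appani",
]

for position, name in enumerate(presentation_order(group), start=1):
    print(position, name)

# Five presenters at the 7-minute cap use exactly the 35-minute slot.
print(fits_in_slot(5))
```

With five presenters the slot has no slack at the cap, so any overrun or slow handoff pushes into the voting time.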
| Time | Activity |
|---|---|
| 8:30-8:33 | Announce the 5 group winners |
| 8:33-9:08 | Winners present to whole class (7 min each, with brief Q&A) |
| 9:08-9:13 | Class votes for overall best |
| 9:13-9:20 | Results, extra credit, course wrap-up |
| Criterion | Weight | Points | What Evaluators Look For |
|---|---|---|---|
| Content | 30% | 45 | Clear problem, approach, results, and takeaway |
| Clarity | 23% | 35 | Logical flow, audience can follow without prior context |
| Demo Recording | 27% | 40 | Pipeline walkthrough is clear, narrated, shows key steps |
| Delivery | 20% | 30 | Prepared, audible, engages the audience, within time |
| Total | 100% | 150 | |
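The weights in the rubric are rounded: 35 of 150 points is about 23.3%, and 40 of 150 is about 26.7%. The short check below, written as an illustration with a hypothetical `implied_weight` helper, recomputes each percentage from the points column and confirms the totals.

```python
# (stated weight in percent, points) for each criterion, copied from the rubric.
RUBRIC = {
    "Content": (30, 45),
    "Clarity": (23, 35),
    "Demo Recording": (27, 40),
    "Delivery": (20, 30),
}
TOTAL_POINTS = 150


def implied_weight(points, total=TOTAL_POINTS):
    """Exact percentage of the total that a criterion's points represent."""
    return 100 * points / total


for criterion, (stated, points) in RUBRIC.items():
    exact = implied_weight(points)
    # Each stated weight should equal the exact weight rounded to a whole percent.
    print(f"{criterion}: stated {stated}%, exact {exact:.1f}%")

total_points = sum(points for _, points in RUBRIC.values())
total_weight = sum(stated for stated, _ in RUBRIC.values())
print(total_points, total_weight)
```

The points column is the authoritative one for grading arithmetic; the percentages are a rounded summary of it.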
| Score | Meaning |
|---|---|
| 5 | Excellent — conference-quality presentation |
| 4 | Good — professional and engaging |
| 3 | Satisfactory — gets the message across |
| 2 | Below average — hard to follow or missing elements |
| 1 | Poor — does not meet basic requirements |
- presentation/slides.pdf
- presentation/demo.mp4 (or link in demo_notes.md)
- Pushed to the main branch by Monday 8:30 AM