CIS 6930 Spring 26

Data Engineering at the University of Florida

Final Paper

Due: Monday, April 13, 2026 at 11:59 PM
Points: 400
Submission: Push to the cis6930sp26-project repository, in the paper/ directory, with the tag final

Overview

The final paper is the primary deliverable for your course project. It should be a polished research paper that presents your work with rigorous evaluation. This paper represents 40% of your total project grade.

Deliverables

Submit a complete, polished paper (8-10 pages) that:

Addresses feedback from draft peer reviews
Presents complete experimental results
Includes thorough analysis and discussion
Meets publication-quality standards

File Structure

cis6930sp26-project/
├── paper/
│   ├── paper.pdf          # Compiled paper
│   ├── paper.tex          # Source (or paper.md)
│   ├── figures/
│   └── references.bib
└── ...
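
One way to confirm this layout before tagging is a small script such as the sketch below. It is not part of the assignment: the function name missing_paper_files is hypothetical, and treating references.bib and figures/ as required for every paper is an assumption based on the tree above.

```python
from pathlib import Path

# Entries expected under paper/, taken from the tree above.
# The source may be paper.tex or paper.md, so either one is accepted.
REQUIRED_FILES = ["paper.pdf", "references.bib"]
SOURCE_ALTERNATIVES = ["paper.tex", "paper.md"]
REQUIRED_DIRS = ["figures"]


def missing_paper_files(repo_root):
    """Return the expected paths that are absent under repo_root/paper."""
    paper_dir = Path(repo_root) / "paper"
    missing = []
    for name in REQUIRED_FILES:
        if not (paper_dir / name).is_file():
            missing.append(f"paper/{name}")
    if not any((paper_dir / name).is_file() for name in SOURCE_ALTERNATIVES):
        missing.append("paper/paper.tex (or paper/paper.md)")
    for name in REQUIRED_DIRS:
        if not (paper_dir / name).is_dir():
            missing.append(f"paper/{name}/")
    return missing
```

Calling missing_paper_files(".") from the repository root returns an empty list when every expected entry is present.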

Tagging Your Submission

git tag -a final -m "Final paper submission"
git push origin final

Rubric

The final paper is evaluated using conference-style peer review criteria.

Criterion	Weight	Points	Description
Originality	20%	80	Does the paper make a novel contribution?
Technical Quality	25%	100	Is the methodology sound and evaluation rigorous?
Clarity	20%	80	Is the paper well-written and easy to understand?
Significance	20%	80	Does this work address an important problem?
Reproducibility	15%	60	Can the results be reproduced by others?
Total	100%	400

Scoring Scale

Score	Meaning	Conference Equivalent
5	Excellent	Strong Accept
4	Good	Accept
3	Satisfactory	Weak Accept / Borderline
2	Below Average	Weak Reject
1	Poor	Reject
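
The rubric does not say how a 1-5 score converts to points. As a rough self-assessment aid only, the sketch below assumes a linear conversion (a score of 5 earns the full points for a criterion, a score of 4 earns 80% of them, and so on). The function estimate_points and that linear rule are assumptions, not the official grading formula.

```python
# Maximum points per criterion, copied from the rubric table.
MAX_POINTS = {
    "Originality": 80,
    "Technical Quality": 100,
    "Clarity": 80,
    "Significance": 80,
    "Reproducibility": 60,
}


def estimate_points(scores, scale_max=5):
    """Estimate total points from per-criterion scores on the 1-5 scale.

    ASSUMPTION: points scale linearly with the score (score / scale_max).
    The rubric itself does not state the actual conversion.
    """
    total = 0.0
    for criterion, max_points in MAX_POINTS.items():
        score = scores[criterion]
        if not 1 <= score <= scale_max:
            raise ValueError(f"{criterion}: score {score} is outside 1-{scale_max}")
        total += max_points * score / scale_max
    return total
```

Under this assumption, a paper scored 4 on every criterion would earn 320 of the 400 points.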

Detailed Criteria

Originality (20%)

Score	Description
5	Highly original; significant new insights or methods
4	Good novelty; clear contribution beyond prior work
3	Some novelty; incremental contribution
2	Limited novelty; mostly replicates existing work
1	No apparent novelty

What Makes a Strong Contribution:

Novel system design or architecture
New insights about LLM capabilities for data tasks
Rigorous comparison revealing unexpected findings
Practical guidelines for practitioners

Technical Quality (25%)

Score	Description
5	Rigorous methodology; comprehensive evaluation; solid results
4	Sound methodology; good evaluation
3	Reasonable approach; evaluation has some gaps
2	Methodology has flaws; evaluation insufficient
1	Fundamentally flawed approach

What Makes Strong Technical Quality:

Clear experimental design
Appropriate metrics for the task
Meaningful baselines
Statistical validity where applicable
Honest discussion of limitations
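
For the statistical validity item, one common option is to report a confidence interval alongside each point estimate. The sketch below computes a percentile bootstrap interval for F1 on binary labels using only the standard library. The function names, the 1,000 resamples, and the 95% level are illustrative choices, not course requirements.

```python
import random


def f1_score(y_true, y_pred):
    """F1 for binary labels (1 = positive). Returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f1_interval(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for F1.

    Resamples (label, prediction) pairs with replacement, recomputes F1
    on each resample, and returns the alpha/2 and 1 - alpha/2 percentiles.
    """
    rng = random.Random(seed)  # fixed seed so the interval is repeatable
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

Reporting the interval next to the point estimate lets readers judge whether a gap between two systems is larger than the sampling noise.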

Clarity (20%)

Score	Description
5	Exceptionally clear; well-organized; engaging
4	Clear writing; good organization
3	Understandable but could be clearer
2	Difficult to follow; organizational issues
1	Incomprehensible

What Makes Strong Clarity:

Logical flow from section to section
Clear topic sentences
Well-labeled figures and tables
Technical concepts explained appropriately
Consistent terminology

Significance (20%)

Score	Description
5	Addresses critical problem; high potential impact
4	Important problem; good potential impact
3	Moderately important; some practical value
2	Limited significance; narrow scope
1	Trivial problem or no clear value

What Makes Strong Significance:

Problem matters to practitioners
Results are actionable
Findings generalize beyond specific dataset
Work opens new research directions

Reproducibility (15%)

Score	Description
5	Fully reproducible; code/data available; detailed methods
4	Mostly reproducible; minor details missing
3	Partially reproducible; some gaps
2	Difficult to reproduce; key details missing
1	Not reproducible

What Makes Strong Reproducibility:

Working code in repository
Clear setup instructions
Data available or clearly described
Hyperparameters and configurations documented
Random seeds specified
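
A minimal sketch of the last two items: seed the random number generator and save the exact configuration next to the results of each run. The function start_run and the configuration keys mentioned below are hypothetical. If your project uses NumPy or PyTorch, those libraries have their own seeds that would need to be set as well.

```python
import json
import random


def start_run(config, config_path):
    """Seed Python's RNG and write the configuration used for this run.

    `config` is a plain dict that must contain a "seed" key; every other
    key is up to the project (model name, temperature, and so on).
    """
    random.seed(config["seed"])
    with open(config_path, "w", encoding="utf-8") as f:
        # sort_keys keeps the file stable across runs, so diffs stay readable
        json.dump(config, f, indent=2, sort_keys=True)
    return config
```

For example, start_run({"seed": 42, "temperature": 0.0}, "run_config.json") would write a JSON file that can be committed to the repository and cited in the paper.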

Addressing Draft Feedback

Your final paper should address the feedback from peer reviews of your draft.

How to Respond to Reviews

Categorize feedback - Major issues vs. minor comments
Prioritize - Address common concerns across reviewers first
Revise thoroughly - Don’t just patch; improve the whole section
Be responsive - Address every point raised
Document changes - Know what you changed and why

Example Response to Review

Review Comment: “The evaluation only uses one dataset. How do we know the results generalize?”

Response in Paper:

To evaluate generalization, we conduct additional experiments on two supplementary datasets: the NYC 311 complaint data and the Chicago building permit data. Results (Table 3) show consistent performance across all three datasets, with F1 scores ranging from 0.91 to 0.94.

Paper Sections (Final Version)

Abstract

Your abstract should be a standalone summary that:

States the problem in 1-2 sentences
Describes your approach in 1-2 sentences
Highlights key results with specific numbers
Ends with the broader implication

Introduction

A strong introduction:

Opens with a concrete, motivating example
Clearly states the research question
Positions your work relative to existing approaches
Lists 3-4 specific contributions
Outlines the paper structure

Related Work

Organize related work into categories:

Traditional data integration approaches
ML-based data engineering methods
LLM applications to data tasks
Your position relative to each category

Methodology

Include:

System architecture diagram
Component descriptions with interfaces
Implementation details (models, libraries, configurations)
Enough detail to reproduce your system

Evaluation

Present:

Experimental Setup - Datasets, metrics, baselines, hardware
Results - Tables and figures with clear captions
Analysis - What worked, what didn’t, and why
Ablations - Which components matter most
Limitations - Honest assessment of weaknesses

Conclusion

A strong conclusion:

Summarizes the main findings rather than just listing contributions
Reflects on what was learned
Acknowledges limitations
Suggests concrete future work

Example Final Results

Complete Results Table

System	Dataset	Precision	Recall	F1	Tokens	Cost	Time
Baseline	Transit	0.96	0.94	0.95	-	$0.00	0.2s
Baseline	Utilities	0.94	0.91	0.93	-	$0.00	0.3s
Baseline	311	0.93	0.90	0.91	-	$0.00	0.2s
TransitLLM	Transit	0.94	0.92	0.93	1,240	$0.02	3.8s
TransitLLM	Utilities	0.92	0.90	0.91	1,180	$0.02	4.1s
TransitLLM	311	0.91	0.89	0.90	1,320	$0.02	4.5s
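
Before submitting a table like this, it can help to check that each reported F1 agrees with its precision (P) and recall (R) values. The sketch below recomputes F1 as the harmonic mean of the two and flags rows that disagree. The function names and the 0.01 tolerance are illustrative; the tolerance is there because P and R in a table are already rounded to two decimals.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def inconsistent_rows(rows, tolerance=0.01):
    """Return rows whose reported F1 differs from the recomputed value.

    Each row is (system, dataset, precision, recall, reported_f1).
    Differences up to `tolerance` are accepted as rounding effects.
    """
    flagged = []
    for system, dataset, p, r, reported_f1 in rows:
        recomputed = f1_from_precision_recall(p, r)
        if abs(recomputed - reported_f1) > tolerance:
            flagged.append((system, dataset, reported_f1, round(recomputed, 3)))
    return flagged
```

An empty result means every row is consistent within rounding; a flagged row usually points to a transcription error in the table.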

Ablation Study

Configuration	F1	Description
Full system	0.93	All components enabled
No validation	0.88	Remove data quality checks
No schema hints	0.85	Remove schema metadata from prompts
Rule-based mapping	0.79	Replace LLM with string matching
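
An ablation table like this one can be produced by re-running the same evaluation with one component changed at a time. The sketch below shows that loop. Here evaluate is a hypothetical callable that takes a configuration dict and returns an F1 score, and the configuration keys are whatever your own system uses; neither is prescribed by the course.

```python
def run_ablations(evaluate, full_config, ablations):
    """Evaluate the full system, then each variant with one change applied.

    `evaluate` is a hypothetical callable: config dict -> F1 score.
    `ablations` maps a row label to the config overrides for that variant.
    Returns (label, f1, drop_vs_full) tuples, largest drop first.
    """
    full_f1 = evaluate(full_config)
    results = [("Full system", full_f1, 0.0)]
    for label, overrides in ablations.items():
        # Copy the full config and apply only this variant's overrides.
        variant = {**full_config, **overrides}
        f1 = evaluate(variant)
        results.append((label, f1, full_f1 - f1))
    return sorted(results, key=lambda row: row[2], reverse=True)
```

Sorting by the drop relative to the full system puts the components that matter most at the top, which is the question the ablation section is meant to answer.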

Error Analysis Distribution

Error Type	Count	%	Example
Ambiguous fields	9	64%	“datetime” mapped to wrong timestamp
Nested structures	3	21%	Array not flattened correctly
Format mismatch	2	14%	Date format not detected
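
The count and percentage columns can be derived from a list of manually assigned error labels, one label per failed example. A minimal sketch, with a hypothetical function name:

```python
from collections import Counter


def error_distribution(error_labels):
    """Return (error_type, count, percent) rows, most frequent first.

    Percentages are rounded to whole numbers, so they may not sum
    to exactly 100.
    """
    counts = Counter(error_labels)
    total = sum(counts.values())
    return [
        (label, count, round(100 * count / total))
        for label, count in counts.most_common()
    ]
```

With the 14 errors in the table above, the rounded percentages are 64, 21, and 14, which sum to 99 rather than 100. That is a rounding effect, not an arithmetic error.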

Quality Checklist

Content Quality

Research question clearly answered
Contributions are specific and substantiated
Related work positions your contribution
Methodology is reproducible
Evaluation is rigorous with appropriate baselines
Limitations are honestly discussed
Future work is concrete

Writing Quality

Abstract is self-contained and compelling
Introduction hooks the reader
Each section has clear purpose
Paragraphs have topic sentences
Technical terms are defined
Writing is concise and active

Presentation Quality

Technical Quality

Metrics are appropriate for task
Baselines are meaningful
Results include variance where applicable
Error analysis is insightful
Ablations identify key components
Code supports claims

Submission Checklist

Paper addresses draft review feedback
All sections are complete and polished
Results are comprehensive with analysis
Figures and tables are publication-quality
References are complete and consistent
Paper is 8-10 pages
PDF compiles without errors
Code in repository matches paper description
Repository tagged with final

Resources

Project Overview - Full project description
Draft Paper - Previous milestone
Presentation - Final milestone
Paper Rubric - Detailed evaluation criteria
