CIS 6930 Spring 26


Data Engineering at the University of Florida

Final Paper

Due: Monday, April 13, 2026 at 11:59 PM
Points: 400
Submission: Push to the cis6930sp26-project repository in the paper/ directory with the tag final


Overview

The final paper is the primary deliverable for your course project. It should be a polished research paper that presents your work with rigorous evaluation. This paper represents 40% of your total project grade.

Deliverables

Submit a complete, polished paper (8-10 pages) that:

  1. Addresses feedback from draft peer reviews
  2. Presents complete experimental results
  3. Includes thorough analysis and discussion
  4. Meets publication-quality standards

File Structure

cis6930sp26-project/
├── paper/
│   ├── paper.pdf          # Compiled paper
│   ├── paper.tex          # Source (or paper.md)
│   ├── figures/
│   └── references.bib
└── ...

Tagging Your Submission

git tag -a final -m "Final paper submission"
git push origin final
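
If you discover a problem after tagging, you can move the final tag to a newer commit. A minimal sketch — it builds a throwaway repository so it runs standalone; in your real repository you would skip the setup lines, and the remote name origin is an assumption about your setup:

```shell
# Sketch: replace a mistaken "final" tag before the deadline.
# Setup (throwaway repo so the snippet runs standalone):
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git -c user.name=you -c user.email=you@ufl.edu commit -q --allow-empty -m "final paper"
git tag -a final -m "Final paper submission"

# Found a typo after tagging: delete the tag, fix, and re-tag.
git tag -d final >/dev/null
git -c user.name=you -c user.email=you@ufl.edu commit -q --allow-empty -m "fix typo"
git tag -a final -m "Final paper submission (revised)"
git tag -l final        # -> final

# Then update the remote tag:
#   git push --delete origin final && git push origin final
```

Deleting and re-pushing is safer than force-pushing over an existing tag, since collaborators (and graders) may have already fetched the old one.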

Rubric

The final paper is evaluated using conference-style peer review criteria.

Criterion           Weight   Points   Description
Originality         20%      80       Does the paper make a novel contribution?
Technical Quality   25%      100      Is the methodology sound and evaluation rigorous?
Clarity             20%      80       Is the paper well-written and easy to understand?
Significance        20%      80       Does this work address an important problem?
Reproducibility     15%      60       Can the results be reproduced by others?
Total               100%     400
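
Each criterion's point value is simply its weight applied to the 400-point total. A quick consistency check (weights listed in table order):

```shell
# Points per criterion = weight% of the 400-point total.
awk 'BEGIN {
  total = 400
  n = split("20 25 20 20 15", w, " ")   # Originality, Technical Quality, Clarity, Significance, Reproducibility
  for (i = 1; i <= n; i++) { pts = total * w[i] / 100; sum += pts; printf "%d ", pts }
  printf "-> %d\n", sum                 # per-criterion points, then their sum
}'
# -> 80 100 80 80 60 -> 400
```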

Scoring Scale

Score   Meaning         Conference Equivalent
5       Excellent       Strong Accept
4       Good            Accept
3       Satisfactory    Weak Accept / Borderline
2       Below Average   Weak Reject
1       Poor            Reject

Detailed Criteria

Originality (20%)

Score   Description
5       Highly original; significant new insights or methods
4       Good novelty; clear contribution beyond prior work
3       Some novelty; incremental contribution
2       Limited novelty; mostly replicates existing work
1       No apparent novelty

What Makes a Strong Contribution:

Technical Quality (25%)

Score   Description
5       Rigorous methodology; comprehensive evaluation; solid results
4       Sound methodology; good evaluation
3       Reasonable approach; evaluation has some gaps
2       Methodology has flaws; evaluation insufficient
1       Fundamentally flawed approach

What Makes Strong Technical Quality:

Clarity (20%)

Score   Description
5       Exceptionally clear; well-organized; engaging
4       Clear writing; good organization
3       Understandable but could be clearer
2       Difficult to follow; organizational issues
1       Incomprehensible

What Makes Strong Clarity:

Significance (20%)

Score   Description
5       Addresses critical problem; high potential impact
4       Important problem; good potential impact
3       Moderately important; some practical value
2       Limited significance; narrow scope
1       Trivial problem or no clear value

What Makes Strong Significance:

Reproducibility (15%)

Score   Description
5       Fully reproducible; code/data available; detailed methods
4       Mostly reproducible; minor details missing
3       Partially reproducible; some gaps
2       Difficult to reproduce; key details missing
1       Not reproducible

What Makes Strong Reproducibility:


Addressing Draft Feedback

Your final paper should address the feedback from peer reviews of your draft.

How to Respond to Reviews

  1. Categorize feedback - Major issues vs. minor comments
  2. Prioritize - Address common concerns across reviewers first
  3. Revise thoroughly - Don’t just patch; improve the whole section
  4. Be responsive - Address every point raised
  5. Document changes - Know what you changed and why

Example Response to Review

Review Comment: “The evaluation only uses one dataset. How do we know the results generalize?”

Response in Paper:

To evaluate generalization, we conduct additional experiments on two supplementary datasets: the NYC 311 complaint data and the Chicago building permit data. Results (Table 3) show consistent performance across all three datasets, with F1 scores ranging from 0.91 to 0.94.


Paper Sections (Final Version)

Abstract

Your abstract should be a standalone summary that:

Introduction

A strong introduction:

Related Work

Organize related work into categories:

Methodology

Include:

Evaluation

Present:

Conclusion

A strong conclusion:


Example Final Results

Complete Results Table

System       Dataset     P      R      F1     Tokens   Cost    Time
Baseline     Transit     0.96   0.94   0.95   -        $0.00   0.2s
Baseline     Utilities   0.94   0.91   0.93   -        $0.00   0.3s
Baseline     311         0.93   0.90   0.91   -        $0.00   0.2s
TransitLLM   Transit     0.94   0.92   0.93   1,240    $0.02   3.8s
TransitLLM   Utilities   0.92   0.90   0.91   1,180    $0.02   4.1s
TransitLLM   311         0.91   0.89   0.90   1,320    $0.02   4.5s
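
Here P and R are precision and recall, and each F1 value is their harmonic mean. A quick sanity check of one row (the TransitLLM / Transit entry), assuming the standard F1 definition:

```shell
# Recompute F1 as the harmonic mean of precision (P) and recall (R)
# for the TransitLLM / Transit row: P = 0.94, R = 0.92.
awk 'BEGIN { p = 0.94; r = 0.92; printf "%.2f\n", 2 * p * r / (p + r) }'
# -> 0.93, matching the F1 column
```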

Ablation Study

Configuration        F1     Description
Full system          0.93   All components enabled
No validation        0.88   Remove data quality checks
No schema hints      0.85   Remove schema metadata from prompts
Rule-based mapping   0.79   Replace LLM with string matching

Error Analysis Distribution

Error Type          Count   %     Example
Ambiguous fields    9       64%   “datetime” mapped to wrong timestamp
Nested structures   3       21%   Array not flattened correctly
Format mismatch     2       14%   Date format not detected

Quality Checklist

Content Quality

Writing Quality

Presentation Quality

Technical Quality


Submission Checklist


Resources

