CAP 5771 Spring 2025 Project FAQ
Frequently Asked Questions
What is the rubric for Milestone 3?
The tentative rubric for Milestone 3 was made available earlier today. Please note that the rubric is preliminary and may be subject to change, but it provides guidance on expectations.
Milestone 3 FAQ: Grading and Expectations
This FAQ provides guidance on how Milestone 3 (Evaluation, Interpretation, Tool Development, and Presentation) will be assessed. Please review these points carefully. Note that these might change as well.
Q1: What is the total point value for Milestone 3 and how is it distributed?
Milestone 3 is worth a total of 250 points, distributed across the deliverables as follows:
- Discussion Post (Presentation & Comments):
- Tool Demo Video:
- Report (PDF):
- Code (GitHub Repository):
Q2: What is the “Self-Explanatory” requirement and why is it important?
This is a crucial requirement for Milestone 3. Each deliverable component (Discussion Post/Presentation, Demo Video, Report, Code/Repo) must be understandable on its own.
- Why? Graders need to assess each part based on the information presented within that part. They should not have to hunt through other deliverables to understand your work (e.g., watch the video to understand the evaluation results described in the report, or read the report to understand what the code does).
- What does it mean?
- Your report should clearly explain your evaluation, interpretation, limitations, and your project’s tool/system without needing the grader to run the code or watch the video for clarification.
- Your demo video should clearly showcase your tool’s functionality and purpose without requiring the viewer to have read the report first.
- Your presentation (in the discussion post) should effectively summarize the project aspects (methodology, findings, tool) without relying heavily on other materials.
- Your code repository (especially the README.md and code comments) should provide enough context for someone to understand the structure, purpose, and basic setup/usage of the code.
- Impact: Failure to make a component self-explanatory will result in point deductions for that specific component.
Q3: What are the key expectations for the Discussion Post (Presentation & Peer Comments)?
- Presentation Content: Your presentation (posted as part of your discussion thread, approx. 4 minutes/slides) should clearly and concisely summarize your project, methodology, key findings, and provide a demonstration or overview of your tool/system. High marks are given for clarity, accuracy, and how well it stands alone (self-explanatory).
- Peer Comments: Meaningful contributions to the discussion are expected. Provide insightful, constructive comments on your peers’ posts.
Q4: What are the key expectations for the Tool Demo Video?
- Tool Showcase: The video (approx. 4 minutes) must clearly and effectively demonstrate your project’s primary output or user interaction method (e.g., dashboard features, chatbot query examples, how the recommendation engine is used/results generated). It should be obvious what the tool/system does and how it works just from watching the video.
- Clarity & Professionalism: The video should be well-paced, with clear audio and visuals, and adhere to the time limit.
- README Integration: The video must be correctly linked or embedded in your project’s GitHub README.md file.
Q5: What are the key expectations for the Report (PDF)?
The report is a significant component and should be comprehensive and stand alone. Key areas of assessment include:
- Evaluation: Clear presentation and discussion of model performance on the test set using appropriate metrics (a brief illustrative sketch follows this list).
- Interpretation & Insights: Meaningful interpretation of model outputs, explanation of predictions, and derivation of actionable insights.
- Bias & Limitations: Thoughtful discussion of potential biases in your model or data, and limitations of your approach.
- Tool/System Description: _(Updated Clarification)_ This section requires a clear, detailed description of the primary output or interactive component of your project.
- If you built an Interactive Dashboard, describe its purpose, the visualizations, how users interact with it, and key implementation choices.
- If you built a Conversational Agent, describe its capabilities, the types of queries it handles, interaction flow, and implementation details.
- If you built a Recommendation Engine, describe its purpose (what it recommends and based on what data), how it generates predictions/recommendations, how a user would interact with it (e.g., input needed, output format, perhaps via an API call example or script interface), and key implementation choices.
- If you implemented another tool type, describe that tool similarly.
- The key is to clearly explain what you built and how someone uses or interacts with it, making it self-explanatory within the report.
- Structure & Clarity: The report must be well-organized, clearly written, professionally formatted (as a PDF), and easy to follow.
- Group Contribution (If applicable): If you worked in a group, the report must include a clear breakdown of each member’s contributions. Be prepared to justify why the project required a team effort.
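As an illustration of the Evaluation point above, here is a minimal sketch of reporting held-out test-set metrics with scikit-learn. The dataset, model, and metric choices are placeholders, not requirements of the rubric; substitute whatever is appropriate for your project.

```python
# Minimal sketch: reporting held-out test-set performance (assumes scikit-learn).
# The dataset and model below are placeholders for your own pipeline.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Report metrics on the untouched test set, not on the training data.
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]
print(classification_report(y_test, y_pred))
print("ROC AUC:", roc_auc_score(y_test, y_prob))
```

In the report itself, present the resulting numbers in a table or figure and discuss what they mean for your project; the code is only one way to produce them.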
Q6: What are the key expectations for the Code (GitHub Repository)?
- Organization & Readability: Code should be well-organized within the repository, follow good coding practices (style conventions, comments/docstrings) for readability, and be understandable (see the short docstring sketch after this list).
- Documentation (README): The README.md file is critical. It should be informative, providing necessary context, setup instructions (if applicable), requirements, and include the link to your demo video.
- Repository Setup: The repository must follow the specified naming convention (https://github.com/<username>/cap5771sp25-project), have the required teaching staff invited as collaborators, and be correctly linked via Gradescope for submission.
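To illustrate the Organization & Readability point above, here is a small, hedged example of the kind of docstring and inline comments a grader can follow without opening other files. The function, file path, and column names are hypothetical, not part of the assignment.

```python
import pandas as pd


def load_clean_ratings(path: str) -> pd.DataFrame:
    """Load the ratings CSV and drop rows with missing user or item IDs.

    Parameters
    ----------
    path : str
        Path to the raw ratings file (e.g., data/ratings.csv).

    Returns
    -------
    pd.DataFrame
        Cleaned ratings with columns: user_id, item_id, rating.
    """
    df = pd.read_csv(path)
    # Rows with missing IDs cannot be matched to users/items, so they are removed.
    return df.dropna(subset=["user_id", "item_id"])
```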
Q7: We worked as a group. What do we need to show?
As mentioned for the report (Q5), if you worked in a group, you must include a section in your report detailing the specific contributions of each team member. You should also be able to articulate why the project scope and complexity warranted a two-person team.
We hope this FAQ clarifies the expectations for Milestone 3. Please review your submitted work against these points if you have questions about your assessment.
Should I include my video in my report?
In Milestone 3, your submission (including the video) should be in your repo. Include the video as a link in the documentation (README).
What do I have to write for Milestone 3?
You should include all the content from the previous milestones, all new models, and a discussion of the tool you built. Consider this an end-of-semester paper summarizing your work.
Can I change my presentation date?
You can make your presentation date earlier. But you cannot switch to a later date.
Can I change my presentation after submitting?
You can edit your presentation before the due date, but you must be explicit about the edit. Add a comment describing what was edited and why.
Do I need to fix my presentation based on the constructive feedback?
The goal of the peer feedback is to help you have the best final presentation possible.
You are welcome to update your tool based on the feedback.
Updating your presentation to make sure your peers give you the best feedback possible is recommended.
Do I need graphics in my presentation?
You will have at least 10 classmates watching your presentation.
Aim to make it entertaining.
You should include figures to support the description of your effort.
What is a dataset license?
Creative Commons licenses (CC0, CC BY, CC BY-SA, CC BY-NC, etc.) and the Open Database License (ODbL) are much more frequently used for datasets. These are tailored more towards factual information, compilations, and content sharing.
My Gradescope submission is causing issues (e.g., crashing, too slow). What should I do?
Gradescope can sometimes struggle with submissions containing a very large number of files. If you experience crashes or excessive loading times when your submission is being graded (like what occurred for some during Milestone 2 grading):
- Review your uploaded files and remove any unnecessary ones. For example, if you have numerous intermediate files, logs, or extensive raw outputs that aren’t essential for grading, consider removing them or significantly reducing their number (a small sketch after this list shows one way to find which directories hold the most files).
- Ensure your core code, documentation, models, and essential results demonstrating your work are still present.
- Troubleshooting steps like changing browsers or clearing cache might help but are unlikely to resolve issues caused by excessive file counts.
- If problems persist after significantly reducing the file count, please contact the TA or instructor.
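If you are unsure where the file count is coming from, a quick local check like the sketch below can help you decide what to prune before resubmitting. The repository path is a placeholder; this is simply one way to count files per top-level directory.

```python
# Quick sketch: count files per top-level directory to spot what bloats a submission.
from collections import Counter
from pathlib import Path

repo = Path(".")  # placeholder: path to your local repository checkout
counts = Counter(
    p.relative_to(repo).parts[0] for p in repo.rglob("*") if p.is_file()
)

for folder, n in counts.most_common():
    print(f"{folder}: {n} files")
```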
What if I disagree with my grade or need a regrade on milestones?
If you are not satisfied with your grade, please know that grading aims to be fair, and Nanjie (Jimmy) Rao has already applied considerable leniency. If you still believe there is a specific error based on the rubric, you must follow the formal regrade request procedure outlined in the syllabus.
Please be aware of the policy implications. As noted by Dr. Grant regarding the formal process: submitting a formal request means your work might be reviewed more strictly by the professors. This carries the risk of potentially losing additional points if other issues are found during the stricter review. Submit a formal request only if you are confident there is a specific error in the grading according to the rubric.
How should I cite/document the source of my datasets?
Properly documenting your data sources is crucial. In your report:
- Provide a Direct Link (URL): Include a working URL for each dataset used. This could point to:
- The Kaggle dataset page.
- The Hugging Face dataset page.
- A government data portal page.
- A direct download link if provided by the source (like NASA or a specific research group).
- The website where the data originates (e.g., datahub.io, Our World In Data).
- An API endpoint documentation page (e.g., Open-Meteo API docs).
- State the Source Clearly: Name the platform or organization providing the data (e.g., “Sourced from Kaggle,” “Obtained from the U.S. Census Bureau,” “Accessed via the Open-Meteo API”).
What are common software/code licenses for projects?
For code developed in your project (if applicable), common open-source licenses include:
- MIT: Very permissive, allows reuse with few restrictions.
- Apache 2.0: Permissive, includes clauses on patent rights.
- GPL (v2, v3): “Copyleft” licenses requiring derivative works using the code to also be GPL.
- LGPL: A weaker copyleft, often used for libraries, allowing linking with non-GPL code under certain conditions.
- BSD: Permissive licenses with variations (2-clause, 3-clause).
These licenses often contain clauses specific to software distribution, modification, and patent rights. You can find more information on choosing and applying a license here:
What are common licenses for datasets?
Datasets often use licenses tailored for data and content sharing, which differ from typical software licenses. Common examples include:
- Creative Commons (CC): A suite of licenses like:
- CC0: Public domain dedication (effectively no restrictions).
- CC BY: Requires attribution.
- CC BY-SA: Requires attribution and derivative datasets must be shared under the same license.
- CC BY-NC: Requires attribution and prohibits commercial use.
- (And combinations like CC BY-NC-SA).
- Open Database License (ODbL): Specifically designed for databases, requiring attribution and share-alike for the database structure and contents.
These licenses focus more on the sharing, use, and adaptation of factual information, compilations, and creative content. Always check the specific license terms for any dataset you use.