CIS 6930 Spring 26

Data Engineering at the University of Florida

Feedback Quality Rubric

This rubric is used to evaluate the quality of peer reviews you provide. Good reviews help your classmates improve and demonstrate your understanding of the material.

Purpose

Your reviews are graded on quality, not just completion. This rubric helps you understand what makes a helpful review and how your review quality score is calculated.


Scoring Scale

| Score | Meaning | Impact |
|-------|---------|--------|
| 5 | Exemplary | You provided exceptional feedback that clearly helps improvement |
| 4 | Good | Your feedback was helpful and constructive |
| 3 | Adequate | Your feedback was acceptable but could be more useful |
| 2 | Insufficient | Your feedback was too brief, vague, or unhelpful |
| 1 | Poor | Your feedback was missing, inappropriate, or harmful |

Criteria

1. Evidence of Engagement (25%)

Did you actually run and test the code?

| Score | Description |
|-------|-------------|
| 5 | Clear evidence of thorough testing: mentions specific commands run, outputs observed, errors encountered |
| 4 | Evidence of running the code and tests; references specific behavior |
| 3 | Appears to have run the code, but details are vague |
| 2 | Unclear if code was actually executed; generic observations |
| 1 | Obviously did not run the code; review could apply to any submission |

Good Example:

“When I ran uv run python -m assignment0 --source weather --location 'Miami', the API returned data successfully, but the LLM summary was just the raw JSON reformatted. The --help output was clear and showed all available options.”

Bad Example:

“The code seems to work.”


2. Specificity (25%)

Does your feedback reference specific parts of the submission?

| Score | Description |
|-------|-------------|
| 5 | References specific files, line numbers, function names, and outputs; provides exact examples |
| 4 | Good specificity with references to particular code sections or behaviors |
| 3 | Some specific references, but also vague generalizations |
| 2 | Mostly general comments; few specific references |
| 1 | Entirely generic; could apply to any submission |

Good Example:

“In api.py:45, the fetch_data() function doesn’t have a timeout parameter, which could cause the program to hang indefinitely if the API is slow. Consider adding timeout=30 to the requests.get() call.”

Bad Example:

“Error handling could be improved.”
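A fix along the lines suggested in the good example might look like this (a sketch only; the function name, URL, and parameters are illustrative, not the reviewed submission's actual code):

```python
import requests

API_URL = "https://api.example.com/weather"  # illustrative placeholder

def fetch_data(location):
    # timeout=30 makes the call fail after 30 seconds instead of
    # hanging indefinitely when the API is slow
    response = requests.get(API_URL, params={"location": location}, timeout=30)
    response.raise_for_status()
    return response.json()
```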


3. Actionable Suggestions (25%)

Does your feedback provide clear paths for improvement?

| Score | Description |
|-------|-------------|
| 5 | Every criticism includes a concrete suggestion for improvement; provides code examples when helpful |
| 4 | Most criticisms include suggestions; suggestions are practical |
| 3 | Some suggestions provided, but not always actionable |
| 2 | Few suggestions; mostly just identifying problems |
| 1 | No constructive suggestions; only complaints |

Good Example:

“The CLI argument --src is unclear. Consider renaming it to --source or --api to better describe what it does. You could update line 23 in cli.py: change parser.add_argument('--src', ...) to parser.add_argument('--source', ...).”

Bad Example:

“The CLI arguments are confusing.”
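The suggested rename can be made concrete with argparse (a sketch; the program name and help strings are illustrative):

```python
import argparse

parser = argparse.ArgumentParser(prog="assignment0")
# '--source' says what the flag selects; '--src' forced readers to guess
parser.add_argument("--source", required=True,
                    help="Which API to query, e.g. 'weather'")
parser.add_argument("--location", help="Location to query, e.g. 'Miami'")

args = parser.parse_args(["--source", "weather", "--location", "Miami"])
print(args.source)
```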


4. Balance and Fairness (15%)

Does your review acknowledge both strengths and areas for improvement?

| Score | Description |
|-------|-------------|
| 5 | Thoughtful balance; acknowledges what works well while constructively noting improvements |
| 4 | Good balance between positive feedback and suggestions |
| 3 | Imbalanced, but covers both strengths and weaknesses |
| 2 | Almost entirely positive or entirely negative |
| 1 | One-sided; either dismissive, or only praise with no substance |

Good Example:

“The LLM integration is well-implemented with proper error handling for API failures. The summary output is insightful and adds value beyond the raw data. One area for improvement: the tests currently call the real API, which makes them slow and dependent on network access. Consider using unittest.mock to mock the API responses.”

Bad Example:

“Everything looks great! Good job!” (no substance)

“This code has many problems.” (no acknowledgment of effort)


5. Professionalism (10%)

Is your review respectful and appropriate?

| Score | Description |
|-------|-------------|
| 5 | Highly professional; encouraging while honest; focuses on the work, not the person |
| 4 | Professional and respectful throughout |
| 3 | Mostly professional, with minor tone issues |
| 2 | Some dismissive or condescending language |
| 1 | Rude, personal, or inappropriate comments |

Good Example:

“The project shows strong understanding of API integration. The error handling could be more robust - consider wrapping the API call in a try-except block to catch network errors gracefully.”

Bad Example:

“I don’t know why you did it this way. This is the wrong approach.”
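The try-except wrapping suggested in the good example might look like this (a sketch; the function name and error message are illustrative):

```python
import requests

def fetch_summary(url):
    # Catch network-level failures (timeouts, DNS errors, HTTP errors)
    # and report them gracefully instead of crashing with a traceback
    try:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException as exc:
        print(f"API request failed: {exc}")
        return None
```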


Review Quality Score Calculation

Your review quality contributes to the “Review Quality” portion of your course grade (10%).

For each review you submit:

| Component | Weight | How It’s Measured |
|-----------|--------|-------------------|
| Rubric criteria above | 70% | Instructor/TA evaluation of your review |
| Timeliness | 15% | Submitted by the deadline |
| Completeness | 15% | All sections of the review template filled out |

Overall Review Quality Score:

Overall = (Rubric Score × 0.70) + (Timeliness × 0.15) + (Completeness × 0.15)

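The weighted combination in the table above can be sketched in Python (function and variable names are illustrative, not the course's actual grading code):

```python
def review_quality_score(rubric, timeliness, completeness):
    # Each component is a fraction in [0, 1], weighted per the table:
    # rubric 70%, timeliness 15%, completeness 15%
    return 0.70 * rubric + 0.15 * timeliness + 0.15 * completeness

# A review scoring 4/5 on the rubric, submitted on time and complete:
score = review_quality_score(4 / 5, 1.0, 1.0)
print(round(score, 2))  # 0.86
```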

Self-Check Before Submitting

Before you submit your review, verify:

- Did you clone, run, and test the code yourself?
- Does your feedback reference specific files, functions, or outputs?
- Does every criticism include a concrete, actionable suggestion?
- Did you acknowledge at least one thing the submission does well?
- Is your tone professional and focused on the work, not the person?

Examples

Exemplary Review (Score: 5)

API Data Collection (5/5): The weather API integration works well. I ran uv run python -m assignment0 --source weather --location "Gainesville" and received properly formatted data. The code in api.py handles HTTP errors with a try-except block and includes a 30-second timeout. Nice touch adding retry logic on line 52.

Suggestion: Consider adding a rate limiter if someone runs many queries in succession.

LLM Processing (4/5): The summary output is useful - it identified the key weather patterns and provided a 3-day outlook. The prompt in llm.py:28 is well-crafted with clear instructions.

Suggestion: The summary could be even better with structured output. Consider asking the LLM to return JSON with specific fields (high_temp, low_temp, conditions) that you can then format nicely.

Testing (3/5): Tests pass locally and in CI. However, test_api.py calls the real weather API, making it slow and potentially flaky.

Suggestion: Use unittest.mock.patch to mock requests.get in your tests. Here’s an example:

from unittest.mock import patch

@patch('assignment0.api.requests.get')
def test_fetch_weather(mock_get):
    # Fake the HTTP response so the test never hits the network
    mock_get.return_value.json.return_value = {"temp": 72}
    result = fetch_weather("Miami")
    assert result["temp"] == 72

Poor Review (Score: 1-2)

“Code works. Tests pass. Could be better documented.”

This review fails because:

- There is no evidence the reviewer actually ran the code
- It references nothing specific in the submission
- “Could be better documented” identifies a problem without suggesting how to fix it

Common Feedback Mistakes

| Mistake | Why It’s a Problem | Better Approach |
|---------|--------------------|-----------------|
| “LGTM” / “Looks good” | No substance; not helpful | Explain what specifically works well |
| “This is wrong” | No explanation or alternative | Explain why and suggest a fix |
| “You should know better” | Personal attack | Focus on the code, not the person |
| Not running the code | Misses actual bugs | Always clone, run, and test |
| Copy-pasting generic feedback | Obvious and unhelpful | Engage with the specific submission |

