CSEDU2026 - LLM-Based Rubric Grading for Programming Assignments: An Empirical Study [Appendix 1]

APPENDICES

A.1 Evaluation Grid

Evaluation rubric used in the assessment process by both the instructor and the LLM-based system. The rubric is organized into four categories (first column), each associated with a percentage weight; the weights sum to 100%. Each category includes one or more indicators (second column). For every indicator, a guiding question is provided (third column), and the submission is scored from 0 to 5 according to the predefined performance levels (fourth column).

Category	Indicator	Question	Levels (0–5)
Correctness (weight: 60%)	Functional Completeness	Does the application correctly implement the basic and advanced tasks described in the assignment?	0: Application does not correctly implement the required tasks. 1: Many basic tasks are missing or incorrect. 2: Some basic tasks are incomplete or partially incorrect. 3: All basic tasks are correctly implemented. 4: All basic tasks and most advanced tasks are correctly implemented. 5: All basic and advanced tasks are correctly implemented.
	Robustness	Does the implementation correctly handle edge cases, errors, and abnormal conditions?	0: No handling of edge cases or errors. 1: Error handling almost absent. 2: Minimal or incomplete error handling. 3: Main cases handled but some checks missing. 4: Most edge cases handled correctly. 5: All edge cases and errors identified and handled correctly.
	Design Autonomy	Does the solution include autonomous design choices or improvements not explicitly required by the assignment?	0: Solution copied or nearly identical to examples or other submissions. 1: Mechanical solution replicating examples without meaningful adaptation. 2: Only the minimal required solution with rigid design. 3: Standard approach aligned with the assignment without extensions. 4: At least one autonomous design choice improving the project. 5: Multiple motivated design choices improving quality, flexibility, or robustness.
	Specification Adherence	Were the submission instructions (format, naming, required files, execution environment) respected?	0: Submission not compliant or not evaluable. 1: Severe issues complicating evaluation. 2: Several instructions not respected. 3: Some formal inaccuracies but the project is evaluable. 4: Minor non-critical formal inaccuracies. 5: All submission instructions respected.
Maintainability (weight: 20%)	Modularity	Is the code divided into modules or functions with well-defined responsibilities?	0: Completely monolithic code. 1: Nearly monolithic code. 2: Limited modularity; functions perform multiple tasks. 3: Sufficient modularity but improvable. 4: Good modularity with rare violations of the single-responsibility principle. 5: Highly modular code with clearly separated responsibilities.
	Maintainability	Can the code be extended or modified without introducing significant errors?	0: Code cannot be modified without rewriting. 1: Modifications extremely difficult. 2: Modifications complex and risky. 3: Modifications possible but require care. 4: Modifications possible with moderate effort. 5: Extensions and modifications are simple and safe.
Readability (weight: 10%)	Naming	Are variable, function, and class names clear and meaningful?	0: Names incomprehensible or absent. 1: Misleading or confusing names. 2: Unclear or inconsistent names in several places. 3: Understandable but sometimes generic names. 4: Generally clear names with rare ambiguities. 5: Names always clear, consistent, and self-explanatory.
	Formatting	Do indentation, spacing, and coding style improve readability?	0: No readable formatting. 1: Severely inadequate formatting. 2: Disorganized formatting. 3: Acceptable but inconsistent formatting. 4: Clear formatting with minor imperfections. 5: Impeccable and consistent formatting.
Documentation (weight: 10%)	Documentation	Do comments help understand the structure and behavior of the code?	0: No comments at all. 1: Misleading or unnecessary comments. 2: Rare or not useful comments. 3: Comments present but limited. 4: Generally useful comments. 5: Clear, useful, and well-distributed comments.
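
To make the role of the category weights concrete, the following is a minimal sketch of how per-indicator levels could be combined into a single score. The category weights and indicator names are taken from the rubric above. Everything else is an assumption for illustration only: the rubric does not state how indicators within a category are aggregated, so the sketch averages them with equal weight, and the function name weighted_score, the (category, indicator) key layout, and the 0-100 output scale are hypothetical.

```python
from statistics import mean

# Category weights as given in the rubric (60% + 20% + 10% + 10%).
WEIGHTS = {
    "Correctness": 0.60,
    "Maintainability": 0.20,
    "Readability": 0.10,
    "Documentation": 0.10,
}

# Indicators per category, as listed in the rubric.
INDICATORS = {
    "Correctness": [
        "Functional Completeness",
        "Robustness",
        "Design Autonomy",
        "Specification Adherence",
    ],
    "Maintainability": ["Modularity", "Maintainability"],
    "Readability": ["Naming", "Formatting"],
    "Documentation": ["Documentation"],
}

MAX_LEVEL = 5  # performance levels range from 0 to 5


def weighted_score(levels):
    """Combine per-indicator levels into a score between 0 and 100.

    `levels` maps (category, indicator) pairs to an integer level 0-5.
    ASSUMPTION: indicators within a category are averaged with equal
    weight; the rubric itself does not specify this aggregation.
    """
    total = 0.0
    for category, weight in WEIGHTS.items():
        values = []
        for indicator in INDICATORS[category]:
            level = levels[(category, indicator)]
            if not 0 <= level <= MAX_LEVEL:
                raise ValueError(
                    f"level for {category}/{indicator} must be in 0..{MAX_LEVEL}"
                )
            values.append(level)
        # Normalise the category mean to 0..1, then apply the category weight.
        total += weight * mean(values) / MAX_LEVEL
    return 100.0 * total


# Hypothetical submission: level 3 on every Correctness indicator,
# 4 on both Maintainability indicators, 5 and 4 on Readability,
# and 2 on Documentation.
example = {("Correctness", name): 3 for name in INDICATORS["Correctness"]}
example[("Maintainability", "Modularity")] = 4
example[("Maintainability", "Maintainability")] = 4
example[("Readability", "Naming")] = 5
example[("Readability", "Formatting")] = 4
example[("Documentation", "Documentation")] = 2

print(round(weighted_score(example), 2))
```

Under these assumptions the example submission scores about 65 out of 100: Correctness contributes 36 points, Maintainability 16, Readability 9, and Documentation 4. Keys are (category, indicator) pairs because two indicator names in the rubric (Maintainability and Documentation) coincide with their category names.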