An LLM-Based Framework for Simulating, Classifying, and Correcting Students’ Programming Knowledge with the SOLO Taxonomy
Novice programmers often struggle to design computational artifacts and fix code errors, which can lead to task abandonment and over-reliance on external support. While research has explored effective metacognitive strategies for scaffolding novice programmers’ learning, applying them requires first understanding and assessing students’ conceptual, procedural, and strategic/conditional programming knowledge at scale. To address this need, we propose a three-model framework that leverages Large Language Models (LLMs) to simulate, classify, and correct student responses to programming questions based on the SOLO Taxonomy, which provides a structured approach for categorizing student understanding into four levels: Pre-structural, Uni-structural, Multi-structural, and Relational. Our results showed that GPT-4o achieved high accuracy in generating and classifying responses for the Relational category and moderate accuracy for the Uni-structural and Pre-structural categories, but struggled with the Multi-structural category. The third model successfully corrected lower-level responses to the Relational level. Although further refinement is needed, these findings suggest that LLMs hold significant potential for supporting computer science education by assessing programming knowledge and guiding students toward deeper cognitive engagement.
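To make the three-model pipeline concrete, the sketch below shows one plausible way to wire it up with the OpenAI Python SDK. The prompts, function names, and example question are illustrative assumptions on my part; the abstract does not specify the authors’ prompt designs, only that GPT-4o performs the simulation, classification, and correction steps.

```python
# Hypothetical sketch of the three-model SOLO pipeline using the
# OpenAI Python SDK. Prompts and function names are assumptions,
# not the authors' implementation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SOLO_LEVELS = ["Pre-structural", "Uni-structural", "Multi-structural", "Relational"]


def ask(system: str, user: str) -> str:
    """Send one system/user prompt pair to GPT-4o and return the reply."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return response.choices[0].message.content.strip()


def simulate_response(question: str, level: str) -> str:
    """Model 1: generate a student-like answer at a target SOLO level."""
    return ask(
        f"You are a novice programming student whose understanding is at the "
        f"{level} level of the SOLO Taxonomy. Answer as such a student would.",
        question,
    )


def classify_response(question: str, answer: str) -> str:
    """Model 2: label an answer with one of the four SOLO levels."""
    return ask(
        "Classify the student's answer into exactly one SOLO level: "
        + ", ".join(SOLO_LEVELS)
        + ". Reply with the level name only.",
        f"Question: {question}\nStudent answer: {answer}",
    )


def correct_response(question: str, answer: str) -> str:
    """Model 3: rewrite a lower-level answer at the Relational level."""
    return ask(
        "Revise the student's answer so it reflects Relational-level "
        "understanding in the SOLO Taxonomy: integrate the relevant ideas "
        "into a coherent, connected explanation.",
        f"Question: {question}\nStudent answer: {answer}",
    )


if __name__ == "__main__":
    # Illustrative end-to-end run: simulate, classify, then correct if needed.
    question = "What does a for loop do in Python?"
    simulated = simulate_response(question, "Multi-structural")
    level = classify_response(question, simulated)
    if level != "Relational":
        simulated = correct_response(question, simulated)
    print(level, simulated, sep="\n")
```

One design point worth noting under these assumptions: keeping the three steps as separate prompts mirrors the paper’s three-model framing and lets each stage be evaluated independently, which is how the reported per-level accuracies (e.g., GPT-4o’s weakness on the Multi-structural category) could be measured.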