
Code reading and comprehension skills are essential for novices learning to program, and explain-in-plain-English (EiPE) tasks are a well-established approach to assessing these skills. However, manually grading EiPE responses is time-consuming, which has limited their use in practice. To address this, we explore an approach in which students explain code samples to a large language model (LLM), which generates code based on their explanations. The generated code is then evaluated against test suites and shown to students along with the test results. We are interested in understanding how this automated formative feedback guides students’ subsequent prompts towards solving EiPE tasks. We analyzed 177 unique attempts on four EiPE exercises from 21 students, examining what kinds of mistakes they made and how they fixed them. We found that when students made mistakes, they identified and corrected them either by combining the LLM-generated code with the test case results, or by switching from describing the purpose of the code to describing the sample code line by line until the LLM-generated code exactly matched the obfuscated sample code. Our findings suggest grounds for both optimism and caution in using LLMs for unmonitored formative feedback. We identified false positives and false negatives, cases where variable naming helped students, and signs of students directly reciting the code. For most students, this approach is an efficient way to demonstrate and assess code comprehension skills. However, we also found evidence of misconceptions being reinforced, suggesting the need for further work to identify and guide such students more effectively.
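
To make the workflow concrete, below is a minimal sketch of the feedback loop: the student's plain-English explanation is passed to an LLM, the generated code is run against an instructor-written test suite, and both the code and the pass/fail results are shown back to the student. The function names, the placeholder LLM call, and the example exercise here are assumptions for illustration only, not the system's actual implementation.

# Minimal sketch of the feedback loop, assuming a hypothetical LLM call
# and an illustrative "sum of even numbers" exercise.

def generate_code_from_explanation(explanation: str) -> str:
    """Placeholder for the LLM call that turns a student's plain-English
    explanation into Python source (e.g., via a chat-completion API)."""
    # A real implementation would send `explanation` to an LLM here.
    return (
        "def target(nums):\n"
        "    return sum(n for n in nums if n % 2 == 0)\n"
    )

def run_test_suite(source: str, tests: list[tuple[list[int], int]]) -> list[bool]:
    """Execute the generated code and check it against instructor test cases."""
    namespace: dict = {}
    exec(source, namespace)          # defines the generated function
    func = namespace["target"]
    return [func(args) == expected for args, expected in tests]

if __name__ == "__main__":
    explanation = "The function returns the sum of the even numbers in a list."
    generated = generate_code_from_explanation(explanation)
    # Hypothetical test suite for this exercise: (input, expected output) pairs.
    results = run_test_suite(generated, [([1, 2, 3, 4], 6), ([5, 7], 0)])
    # The student would see both the generated code and the pass/fail results.
    print(generated)
    print(results)

In this sketch, a failed test case or visibly wrong generated code is the signal that prompts the student to revise their explanation and try again.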