Evaluating GPT for use in K-12 Block Based CS Instruction Using a Transpiler and Prompt Engineering
Though the increased availability of Large Language Models (LLMs) presents significant potential for changing the way students learn to program, the text-based nature of the available tools precludes block-based languages from much of that innovation. In an attempt to remedy this, we identify the strengths and weaknesses of using a transpiler to leverage the existing training of commercially available LLMs with Scratch, a visual block-based programming language.
We evaluate an LLM’s performance on two common classroom tasks in a Scratch curriculum using only prompt engineering. We evaluate the LLM’s ability to: 1) create project solutions that compile and satisfy project requirements, and 2) analyze student projects’ completion of project requirements.
In both cases, we find that prompt engineering alone is insufficient to reliably produce high-quality results. For projects of medium complexity, the LLM-generated solutions consistently failed to follow correct syntax or, in the few instances with correct syntax, to produce correct solutions. When used for auto-grading, we found a correlation between scores assigned by the autograder and those generated by the LLM, but the discrepancies between the "real" scores and the scores assigned by the LLM remained too great for the tool to be reliable in a classroom setting.