Quantitative Evaluation of using Large Language Models and Retrieval-Augmented Generation in Computer Science Education
This program is tentative and subject to change.
Generative artificial intelligence (GenAI) is transforming Computer Science education, and every instructor is reflecting on how AI will affect their courses. Instructors must determine how students may use AI for course activities and which AI systems they will support and encourage students to use. This task is challenging given the proliferation of large language models (LLMs) and related AI systems. The contribution of this work is an experimental evaluation of multiple open-source and commercial LLMs that use retrieval-augmented generation to answer questions in computer science courses, together with a cost-benefit analysis to help instructors decide which systems to adopt. A key factor is the time an instructor has available to maintain the AI systems they support and which activities are most effective at improving system performance. The paper offers recommendations for deploying, using, and enhancing AI in educational settings.
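The abstract refers to retrieval-augmented generation (RAG) without describing the pipeline; the sketch below is a minimal, hypothetical illustration of the general RAG pattern for course question answering (retrieve relevant course material, then prompt an LLM with it). It is not the authors' system: the class name CourseRAG, the bag-of-words retriever, and the prompt template are all illustrative assumptions, and the final LLM call is left as a stub.

```python
# Minimal, hypothetical sketch of a retrieval-augmented generation (RAG)
# pipeline for answering course questions. Not the paper's implementation:
# the retriever is a simple bag-of-words cosine scorer, and the generated
# prompt would be sent to whichever open-source or commercial LLM is used.
from collections import Counter
import math


def _tokenize(text):
    return [w.lower().strip(".,?!:;()") for w in text.split() if w.strip()]


def _cosine(a: Counter, b: Counter) -> float:
    common = set(a) & set(b)
    num = sum(a[t] * b[t] for t in common)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0


class CourseRAG:
    """Retrieve course documents relevant to a question, then build an LLM prompt."""

    def __init__(self, documents):
        self.documents = documents
        self.vectors = [Counter(_tokenize(d)) for d in documents]

    def retrieve(self, question, k=2):
        # Rank course documents by similarity to the question and keep the top k.
        qv = Counter(_tokenize(question))
        scored = sorted(
            zip(self.documents, self.vectors),
            key=lambda dv: _cosine(qv, dv[1]),
            reverse=True,
        )
        return [doc for doc, _ in scored[:k]]

    def build_prompt(self, question, k=2):
        # Ground the model in retrieved course material instead of its own priors.
        context = "\n".join(f"- {doc}" for doc in self.retrieve(question, k))
        return (
            "Answer the student's question using only the course material below.\n"
            f"Course material:\n{context}\n"
            f"Question: {question}\nAnswer:"
        )


if __name__ == "__main__":
    docs = [
        "Lecture 3: A hash table resolves collisions with chaining or open addressing.",
        "Lab 2: Submit assignments through the course autograder before Friday 23:59.",
        "Lecture 7: Quicksort has average-case O(n log n) and worst-case O(n^2) time.",
    ]
    rag = CourseRAG(docs)
    prompt = rag.build_prompt("How does quicksort perform in the worst case?")
    print(prompt)  # In practice, this prompt is sent to the chosen LLM.
```

In a real deployment the bag-of-words retriever would typically be replaced by an embedding index over course notes, slides, and forum posts, and the instructor's maintenance effort discussed in the paper would go into curating and updating that indexed material.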
Sat 1 Mar. Displayed time zone: Eastern Time (US & Canada).
10:45 - 12:00

10:45 (18m) Talk: ASM Visualizer: a Learning Tool for Assembly Programming (Papers). Tia Newhall, Kevin Webb, Isabel Romea, Emma Stavis (Swarthmore College); Suzanne Matthews (United States Military Academy)

11:03 (18m) Talk: ezFS: A Pedagogical Linux File System (Papers)

11:22 (18m) Talk: Grading for Equity in a Hyflex Compiler Design Course (Papers)

11:41 (18m) Talk: Quantitative Evaluation of using Large Language Models and Retrieval-Augmented Generation in Computer Science Education (Papers). Kevin Shukang Wang, Ramon Lawrence (The University of British Columbia)