Improving AI in CS1: Leveraging Human Feedback for Better Learning
In 2023, we developed and deployed AI-based tools in CS50 at Harvard University to provide students with 24/7 interactive assistance, approximating a 1:1 teacher-to-student ratio. These tools offer code explanations, style suggestions, and responses to course-related inquiries, emulating human educators to foster critical thinking. However, maintaining alignment with instructional goals is challenging, especially with frequent updates to the underlying large language models (LLMs). We thus propose a continuous improvement process for LLM-based systems using a collaborative human-in-the-loop approach. We introduce a systematic evaluation framework for assessing and refining the performance of AI-based tutors, combining human-graded and model-graded evaluations. Using few-shot prompting and fine-tuning, we aim to ensure our AI tools adopt pedagogically sound teaching styles. Fine-tuning with a small, high-quality dataset has shown significant improvements in aligning with teaching goals, as confirmed through multi-turn conversation evaluations. Additionally, our framework includes a model-evaluation backend that teaching assistants periodically review, ensuring the AI system remains effective and aligned with instructional objectives. This paper offers insights into our methods and the impact of these AI tools on CS50 and contributes to the discourse on AI in education, showcasing scalable, personalized learning enhancements.