Using ChatGPT to Annotate a Dataset: A Case Study in Intelligent Tutoring Systems

Aleksandar Vujinović; Nikola Luburić; Jelena Slivka; Aleksandar Kovačević

doi:10.36227/techrxiv.23617551.v1

Abstract

Large language models like ChatGPT can learn from examples supplied in the prompt, a capability known as in-context learning (ICL). Studies have shown that, owing to ICL, ChatGPT achieves impressive performance in various natural language processing tasks. However, to the best of our knowledge, this is the first study that assesses ChatGPT’s effectiveness in annotating a dataset for training instructor models in intelligent tutoring systems (ITSs). The task of an ITS’s instructor model is to mimic the human instructor by providing an effective tutoring action for a given student state. Instructor models are typically implemented as hardcoded rules, which limits their ability to personalize instruction. This problem could be mitigated by utilizing machine learning (ML). However, training supervised ML models requires a large dataset of student states annotated with the corresponding tutoring actions. Using human experts to annotate such datasets is expensive and time-consuming, and it requires pedagogical expertise. Thus, this study explores ChatGPT’s potential to act as a pedagogy expert annotator. Using prompt engineering, we created a list of actions a tutor could recommend to a student. We manually filtered this list and instructed ChatGPT to select the appropriate action from the list for a given student state. We then manually analyzed the ChatGPT responses that could be considered incorrect labels. Our results indicate that using ChatGPT as an annotator is an effective alternative to human experts. The contributions of our work are (1) a novel dataset annotation methodology for the ITS context, (2) a publicly available dataset of student states annotated with tutoring advice, and (3) a list of possible pedagogical actions.
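The annotation step described above, selecting one action from a fixed list for a given student state, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the action list, the prompt wording, and the function names `build_prompt` and `parse_label` are all hypothetical, and the call to the language model itself is left out.

```python
from typing import Optional

# Hypothetical action list for illustration; the paper's filtered list is not reproduced here.
ACTIONS = [
    "Review the prerequisite material",
    "Attempt an easier exercise",
    "Proceed to the next unit",
]


def build_prompt(student_state: str, actions: list[str]) -> str:
    """Compose a prompt that asks the model to pick exactly one listed action."""
    numbered = "\n".join(f"{i}. {a}" for i, a in enumerate(actions, start=1))
    return (
        "You are an experienced instructor.\n"
        f"Student state: {student_state}\n"
        "Choose exactly one action from the list below and reply with its number only.\n"
        f"{numbered}"
    )


def parse_label(response: str, actions: list[str]) -> Optional[str]:
    """Map a model reply to an action, or return None so the reply can be reviewed manually."""
    token = response.strip().rstrip(".")
    if token.isdecimal() and 1 <= int(token) <= len(actions):
        return actions[int(token) - 1]
    return None
```

A reply that does not map to a listed action yields `None`, which corresponds to the kind of response the study examines by hand rather than accepting as a label.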