Using ChatGPT to Annotate a Dataset: A Case Study in Intelligent
Tutoring Systems
Abstract
Large language models such as ChatGPT can learn from examples supplied
in the prompt, a capability known as in-context learning (ICL). Studies
have shown that, owing to ICL, ChatGPT achieves impressive performance
on various natural language processing tasks. However, to
the best of our knowledge, this is the first study that assesses
ChatGPT’s effectiveness in annotating a dataset for training instructor
models in intelligent tutoring systems (ITSs). The task of an ITS’s
instructor model is to mimic the human instructor by providing an
effective tutoring action for a given student’s state. The instructor
models are typically implemented as hardcoded rules, limiting their
ability to personalize instruction. This problem could be mitigated by
utilizing machine learning (ML). However, training supervised ML models
requires a large dataset of student states annotated with corresponding
tutoring actions. Having human experts annotate such datasets is
expensive and time-consuming, and it requires pedagogical expertise. Thus,
this study explores ChatGPT’s potential to act as a pedagogy expert
annotator. Using prompt engineering, we created a list of actions a
tutor could recommend to a student. We manually filtered this list and
instructed ChatGPT to select the appropriate action from the list for
the given student’s state. We then manually analyzed the ChatGPT
responses that could be considered incorrect labels. Our results
indicate that using ChatGPT as an annotator is an effective alternative
to relying on human experts.
The contributions of our work are (1) a novel dataset annotation
methodology for the ITS context, (2) a publicly available dataset of
student states annotated with tutoring advice, and (3) a list of
possible pedagogical actions.
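
As a rough illustration of the annotation step summarized above, the
following minimal sketch formats a student state and a fixed action list
into a prompt and checks that the reply names a valid action. The action
list, the student-state fields, and the function names build_prompt and
parse_label are hypothetical and are not taken from the study; the call
to the model itself is deliberately left out.

```python
from typing import Optional

# Hypothetical action list; the study's actual list is not reproduced here.
ACTIONS = [
    "Review the lecture material",
    "Attempt easier practice problems",
    "Proceed to the next topic",
]


def build_prompt(student_state: dict, actions: list[str]) -> str:
    """Format a student state and a numbered action list into one prompt."""
    state_lines = "\n".join(f"- {k}: {v}" for k, v in student_state.items())
    action_lines = "\n".join(f"{i}. {a}" for i, a in enumerate(actions, start=1))
    return (
        "You are an experienced tutor.\n"
        f"Student state:\n{state_lines}\n"
        f"Choose exactly one action by number:\n{action_lines}\n"
        "Answer with the number only."
    )


def parse_label(response: str, actions: list[str]) -> Optional[str]:
    """Map the model's reply to an action, or None if it is not a valid choice."""
    token = response.strip().rstrip(".")
    if token.isdecimal() and 1 <= int(token) <= len(actions):
        return actions[int(token) - 1]
    return None  # not a valid label: set aside for manual review
```

Replies that parse_label rejects would be natural candidates for the
kind of manual inspection the abstract describes for potentially
incorrect labels.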