Question Answering Model in Thai by using Squad Thai Wikipedia dataset
preprint
posted on 2021-12-14, 23:49 authored by Wicharn Rueangkhajorn, Jonathan H. Chan
Nowadays, Question Answering is one of the
challenging applications in the natural language processing
domain. Many English-language Question Answering models are
distributed on model-sharing websites such as the Hugging Face
Hub. By contrast, only a few Thai-language Question Answering
models are distributed on such websites. We therefore fine-tune
multilingual Question Answering models for a specific language,
Thai. The training dataset is a Thai Wikipedia dataset from
iApp Technology. We fine-tune two multilingual models. We also
create another dataset to evaluate the adaptability of the models.
The results are satisfactory: both fine-tuned models perform better
than the base model on the evaluation scores. We have published the
Question Answering models to the Hugging Face Hub so that
others can use them in other applications.
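
The abstract does not say which evaluation scores were used. As an illustration only, the sketch below assumes SQuAD-style exact match and an overlap F1. Because Thai is written without spaces between words, the F1 here is computed over characters rather than words; that choice, and the function names `exact_match` and `char_f1`, are assumptions for this example and not the authors' evaluation code.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> bool:
    """True when the two answers are identical after trimming outer whitespace."""
    return prediction.strip() == reference.strip()


def char_f1(prediction: str, reference: str) -> float:
    """Character-level overlap F1 between a predicted and a reference answer.

    Assumption: characters are used as the overlap unit because Thai text
    has no spaces between words; whitespace characters are ignored.
    """
    pred_chars = [c for c in prediction if not c.isspace()]
    ref_chars = [c for c in reference if not c.isspace()]

    # If either answer is empty, score 1.0 only when both are empty.
    if not pred_chars or not ref_chars:
        return float(pred_chars == ref_chars)

    # Multiset intersection counts each shared character at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_chars) & Counter(ref_chars)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_chars)
    recall = overlap / len(ref_chars)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a fine-tuned model and its base model could be compared by averaging both functions over the same set of question, reference answer, and predicted answer triples.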