Abstract
Next word prediction is an input technology that simplifies typing by
suggesting the next word for the user to select, since typing out a
conversation is time-consuming. A few previous studies have addressed
the Kurdish language, including next word prediction for it. However,
the lack of a Kurdish text corpus presents a challenge. Moreover, the
scarcity of N-grams for the Kurdish language, for instance five-grams,
explains why next word prediction is rarely used for Kurdish.
Furthermore, several Kurdish letters are displayed improperly in
RStudio, which is another problem. This paper provides
a Kurdish corpus, creates five-grams, and presents a unique research
work on next word prediction for Kurdish Sorani and Kurmanji. An N-gram
model is used for next word prediction to reduce the time spent typing
in the Kurdish language. Because little work has been conducted on
Kurdish next word prediction, the N-gram model is used to suggest text
accurately. The application is built in R using RStudio. The model
achieves an accuracy of 96.3%.
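
As an illustration of how N-gram next word prediction works in general, the following is a minimal sketch and not the implementation described in this paper. It is written in Python rather than R, it uses a toy English token list in place of the Kurdish corpus, and the function names (`build_ngram_tables`, `predict_next`) and the back-off strategy (try the longest known context first, then shorter ones) are assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_ngram_tables(tokens, max_n=5):
    """Count, for every context of 0..max_n-1 words, which word follows it.

    Hypothetical helper: keys are context tuples, values are Counters of
    the words observed immediately after that context.
    """
    tables = defaultdict(Counter)
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            *context, next_word = tokens[i:i + n]
            tables[tuple(context)][next_word] += 1
    return tables


def predict_next(tables, history, max_n=5, k=3):
    """Suggest up to k next words, backing off to shorter contexts.

    Assumed strategy: use the longest suffix of `history` seen in
    training; the empty context falls back to plain word frequency.
    """
    longest = min(max_n - 1, len(history))
    for size in range(longest, -1, -1):
        context = tuple(history[len(history) - size:])
        if context in tables:
            return [word for word, _ in tables[context].most_common(k)]
    return []


# Toy corpus standing in for a real tokenized Kurdish corpus.
toy_tokens = "the cat sat on the mat the cat ate the fish".split()
tables = build_ngram_tables(toy_tokens, max_n=5)

# A seen two-word context yields the word that followed it in the corpus.
print(predict_next(tables, ["on", "the"]))
# An unseen context backs off to the most frequent words overall.
print(predict_next(tables, ["dog"], k=2))
```

With five-grams (`max_n=5`), up to four preceding words are used as context, so a larger corpus makes the longer contexts more likely to match.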