Reinforcement Learning: Playing Tic-Tac-Toe
  • Jocelyn Ho,
  • Jeffrey Huang,
  • Benjamin Chang,
  • Allison Liu,
  • Zoe Liu
Jocelyn Ho
Georgia Institute of Technology

Corresponding Author: [email protected]



Machine learning builds computer systems that improve through experience. Its applications span many areas of daily life, from malware filtering to image recognition. Recent research has shifted toward maximizing efficiency in decision-making, creating algorithms that quickly and accurately process patterns to generate insight. This research focuses on reinforcement learning, a paradigm of machine learning that makes decisions by maximizing reward. Specifically, we use Q-learning, a model-free reinforcement learning algorithm, to assign scores to different decisions given the unique states of the problem. Widyantoro et al. (2009) studied the effect of Q-learning on learning to play Tic-Tac-Toe. However, that study yielded a win/tie rate of less than 50 percent, which we believe does not fully exploit the benefits of Q-learning. In the same environment, this research aims to close the gaps in the effectiveness of Q-learning while minimizing human input. For training, the epsilon value was initially set to 0.9 to ensure randomness and then decreased at a constant rate as the number of possible states increased. The program played 300,000 games against its previous version, eventually securing a win/tie rate of approximately 90 percent. Future directions include improving the efficiency of Q-learning algorithms and applying the research in practical fields.
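To make the approach described in the abstract concrete, the following is a minimal sketch of tabular Q-learning with epsilon-greedy move selection and a constant-rate epsilon decay. It is an illustration under stated assumptions, not the authors' implementation: only the starting epsilon of 0.9 comes from the abstract. The learning rate, discount factor, epsilon floor, decay step, the 9-character board encoding, the choice to decay per game played, and all function names are hypothetical.

```python
import random
from collections import defaultdict

# Only EPSILON_START is taken from the abstract; every other value is assumed.
ALPHA = 0.5          # learning rate (assumed)
GAMMA = 0.9          # discount factor (assumed)
EPSILON_START = 0.9  # initial exploration rate, as stated in the abstract
EPSILON_MIN = 0.05   # exploration floor (assumed)
EPSILON_STEP = 1e-5  # constant decrement per game played (assumed)

# Q-table: maps (state, action) to an estimated value, defaulting to 0.0.
Q = defaultdict(float)


def legal_actions(state):
    """Return the indices of empty cells; state is a 9-character string."""
    return [i for i, cell in enumerate(state) if cell == " "]


def choose_action(state, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    actions = legal_actions(state)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])


def q_update(state, action, reward, next_state, done):
    """Apply one tabular Q-learning update to Q[(state, action)]."""
    if done:
        future = 0.0  # no future value after a terminal position
    else:
        future = max(
            (Q[(next_state, a)] for a in legal_actions(next_state)),
            default=0.0,
        )
    target = reward + GAMMA * future
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])


def decayed_epsilon(games_played):
    """Decrease epsilon linearly with games played, down to EPSILON_MIN."""
    return max(EPSILON_MIN, EPSILON_START - EPSILON_STEP * games_played)
```

In a training loop, `decayed_epsilon` would be evaluated once per game and passed to `choose_action`, so early games are mostly random and later games increasingly follow the learned scores.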
08 Mar 2023. Published in Journal of Student Research, volume 11, issue 3. DOI: 10.47611/jsr.v11i3.1739