Reinforcement Learning: Playing Tic-Tac-Toe

posted on 2022-08-08, 05:06 authored by Jocelyn Ho, Jeffrey Huang, Benjamin Chang, Allison Liu, Zoe Liu

Machine learning constructs computer systems that improve through experience. Its applications span many areas of daily life, ranging from malware filtering to image recognition. Recent research has shifted towards maximizing efficiency in decision-making, creating algorithms that quickly and accurately process patterns to generate insight. This research focuses on reinforcement learning, a paradigm of machine learning that makes decisions by maximizing reward. Specifically, we use Q-learning, a model-free reinforcement learning algorithm, to assign scores to different decisions given the unique states of the problem. Widyantoro et al. (2009) studied the use of Q-learning for learning to play Tic-Tac-Toe. However, that study yielded a win/tie rate of less than 50 percent, which we believe does not represent an algorithm that fully exploits the benefits of Q-learning. In the same environment, this research aims to close the gaps in the effectiveness of Q-learning while minimizing human input. The epsilon value was initially set to 0.9 to ensure randomness, then decreased at a constant rate as the number of possible states increased. The program played 300,000 games against its previous version, eventually securing a win/tie rate of approximately 90 percent. Future directions include improving the efficiency of Q-learning algorithms and applying the research in practical fields.
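As a rough illustration of the approach the abstract describes, the following is a minimal sketch of tabular Q-learning with epsilon-greedy action selection and a linearly decaying epsilon. It is not the authors' implementation. Only the starting epsilon of 0.9 and the constant-rate decrease come from the abstract; the learning rate, discount factor, decrement size, epsilon floor, board encoding (a 9-character string), and all function names are assumptions made for this example.

```python
import random
from collections import defaultdict

ALPHA = 0.1       # learning rate (assumed; not given in the abstract)
GAMMA = 0.9       # discount factor (assumed)
EPS_START = 0.9   # initial exploration rate, as stated in the abstract
EPS_STEP = 1e-5   # constant decrement per game (assumed)
EPS_MIN = 0.0     # lower bound on epsilon (assumed)

# Q-table mapping (state, action) -> estimated value; unseen pairs start at 0.
Q = defaultdict(float)


def legal_actions(board):
    """Return the indices of empty cells on a 9-character board string."""
    return [i for i, cell in enumerate(board) if cell == " "]


def choose_action(board, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else take the best-scored move."""
    actions = legal_actions(board)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(board, a)])


def q_update(state, action, reward, next_state, done):
    """Apply one Q-learning update for the transition (state, action) -> next_state.

    In a two-player game, next_state would typically be the board the agent
    sees on its next turn, after the opponent has replied.
    """
    if done:
        best_next = 0.0
    else:
        best_next = max(
            (Q[(next_state, a)] for a in legal_actions(next_state)), default=0.0
        )
    target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])


def decayed_epsilon(game_index):
    """Decrease epsilon by a constant amount per game, starting from EPS_START."""
    return max(EPS_MIN, EPS_START - EPS_STEP * game_index)
```

In a training loop, `decayed_epsilon(game_index)` would be evaluated once per game and passed to `choose_action`, so that early games are mostly exploratory and later games increasingly follow the learned scores.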


Submitting Author's Institution

Georgia Institute of Technology

Submitting Author's Country

  • United States of America
