In this post I would like to show you how AI can be used to teach a computer to play Tic-Tac-Toe… not the way it should be done

The game we’re dealing with is not very complicated to say the least and even if it was more challenging there are specialised methods just waiting to be used in such situations. So rolling out a neural approach here is like cracking a nut with a sledgehammer… that being said let’s get down to business!

First of all an artificial neural network is a biologically inspired computational model basically trying to mimic the way a real live brain works. When somebody approaches you in a dark corner and asks “Pssst! Wan’na see a neural network?” then he most probably will show you something like this:

.

Such a network is actually a multilayer perceptron with the three layers of neurons (circles) connected by weighted links (arrows). It works by receiving some inputs delivered to the input layer, processing them using the neurons and returning output through the output layer.

Initially such a network is pretty useless until its trained. Training basically means trying to figure out how the weight values should be set. So how do we teach one to play tic-tac-toe?

Let’s start from the beginning and define an empty board of a given size and write a function (in R) which evaluates whether somebody has won:

board.size <- 3 # The game board is a matrix with: 1 - tic, 0 - empty, -1 - tac GenerateEmptyBoard <- function(){ matrix(0, ncol=board.size, nrow=board.size) } DisplayBoard <- function(board){ b <- factor(board, levels=c(-1,0,1),labels=c("X","*","O")) dim(b) <- dim(board) b } FlipMatrix <- function(m){ apply(m, 2, rev) } # Checking whether somebody has won EvaluateBoard <- function(board){ sums <- c(colSums(board), rowSums(board), sum(diag(board)), sum(diag(FlipMatrix(board))) ) if(max(sums) == board.size){ return(1) } if(min(sums) == -board.size){ return(-1) } 0 }

Now let’s create functions for our neural network to let it play the game. A reasonable approach is to assume that the network will be taught to evaluate the board situation. Is such case when the computer will be making its move it will use the neural network to evaluate each possible move and choose the best one.

neurons <- 1 # Creating an empty neural network which we represent as a matrix of weights InitNN <- function(){ n.weights <- neurons*board.size^2 + 2*neurons matrix(runif(n.weights),nrow=neurons) } # Calculating the network RunNN <- function(nn, board){ w.out <- nn[,1] w0.in <- nn[,2] w.in <- nn[,3:ncol(nn)] t(w.out) %*% tanh(w.in %*% as.vector(board) + w0.in) } # Evaluating every move possible using the network RunAI <- function(ai, board, side){ res <- sapply(1:length(board), function(i){ b <- board b[[i]] <- side RunNN(ai,b) }) # We don't want the AI to cheat res[board!=0] = -Inf # We choose the best one which.max(res) } Move <- function(ai, board, side){ move <- RunAI(ai, board, side) board[[move]] <- side board } IsMovePossible <- function(board){ length(which(board==0)) > 0 }

So now we have all we need to make a neural network play tic-tac-toe, but it won’t be very good at it – until we teach it. One way to do it it to sit in front of the computer for a couple of weeks, play a game after game against against our pet network until it is worthy… or: behold the Random Player!

NNvsRandomPlayer <- function(tic.ai){ side <- sample(c(-1,1),1) board <- GenerateEmptyBoard() eval <- 0 while(eval == 0 && IsMovePossible(board)){ if(side==1){ board <- Move(tic.ai, board, side) } else{ # Make a valid move completely at random and see what happens move <- sample(which(board==0),1) board[[move]] <- side } eval <- EvaluateBoard(board) side <- side * -1 } eval }

So now we have a worthy opponent let’s see how our network does in a series of 10,000 games:

Hmm… not very good, but let’s train it:

TrainAI <- function(){ # Function to evaluate how good is our network Eval <- function(w){ tic.ai <- matrix(w, nrow=neurons) # Playing against a random player is nondeterministic, so we need to stabilise the results ev <- median(sapply(1:20,function(j)mean(sapply(1:20, function(i)NNvsRandomPlayer(tic.ai))))) ev <- -1*(ev) } len <- length(InitAI()) # This is a global optimisation method, so using we need an appropriate method - a differential evolution # algorithm seems sufficient res <- DEoptim::DEoptim(Eval, rep(-0.1,len), rep(0.1,len), DEoptim::DEoptim.control(trace=1, parallelType=0, NP=10, VTR=-1.0)) matrix(res$optim$bestmem, nrow=neurons) }

After the training our impressive single-hidden-neuron-network becomes better (again 10,000 games):

It works! On the other hand you could probably implement a better “AI engine” for tic-tac-toe using a couple of `if`

‘s but that’s a whole different story…

You need roughly 2 ifs for a better tic-tac-toe AI.

I’m one of the sad saps that sat down and worked out every possible combination (excluding rotations)

When I go first, and remember my findings correctly, I can either win or draw, there is no lose.

Right you are. However the described approach in principle could be used to train the networks to play more challanging games, ones which

would require many more ‘if’s’.

If I find the time I’ll make a more advanced example.

+1 also ANN are good at finding complex patterns that are not easily described (NLF,etc). Good write up

I was just wondering, what would the result be like, if we train the ANN AI using the “better” AI engine, and then pit it against 1)the random player, 2)the “better” AI engine. I would suppose the win rate in (1) would be higher as that presented in this post, while there would be more ties in (2)

Probably it would be higher. One also could try the whole approach iteratively – in the first iteration train a neural network against the random player, and in subseqent iterations train against the networks from the previous ones.