## Python Connect4 AI with MTD(f) algorithm My favorite three games to implement when learning a new programming are: TicTacToe, Connect Four, and Minesweeper. Last month, I wrote a post about implementing TicTacToe in Python. This time I implemented “Connect Four” (also known as “Four in a Row” and “Four in a Line”). When implementing some AI for TicTacToe I chose the basic “minimax” algorithm without any optimization. This time I went a little further and improved the minimax algorithm by adding a “horizon”, “alpha-beta pruning”, a “null-window”, “iterative deepening”, and a “transposition table” turning the minimax into a “Memory-enhanced Test Driver” also known as: MTD(f).

MTD(f) = minimax + horizon + alpha-beta pruning + null-window + iterative deepening + transposition table

MTD(f), is a higly optimized minimax algorithm. It was developed in 1994 by Aske Plaat, Jonathan Schaeffer, Wim Pijls, and Arie de Bruin.  The name MTD(f) is an abbreviation for MTD(n,f) (Memory-enhanced Test Driver with node n and value f). It is a variant of minimax and an alternative to the alpha-beta pruning algorithm.

## Minimax

The minimax theorem states (source: Wikipedia)

For every two-person, zero-sum game with finitely many strategies, there exists a value V and a mixed strategy for each player, such that:

(a) Given player 2’s strategy, the best payoff possible for player 1 is V, and
(b) Given player 1’s strategy, the best payoff possible for player 2 is −V. A game with just two possible moves on each turn will have a search tree as shown in the above picture. The circles represent possible moves by the current player (max), while the squares represent the opponents moves (min).

## Horizon

One could limit the tree to a look-ahead of four moves to limit the time the AI is “thinking”. This is done in the tree above. This is called the “horizon effect”, since the algorithm will not look any further. On the 4th level a “heuristic” function will be used to score the nodes. The better the heuristic function, the better the AI will play. Values for won and lost games are typically +infinity and -infinity, but in MTD(f) they should probably be a valid integer, like +1000 and -1000.

## Alpha-beta pruning

Using alpha-beta pruning one can reduce that amount of nodes that have to be visited by the minimax algorithm by keep track of the best and worst values. If it finds a move to be worse than a previously visited move it will not evaluate that move any further. Effectively the algorithm “prunes” away the branches of the search tree that do not influence the final decision. Below is an example of a search tree in which the moves are evaluated from left to right. The red marks are where the branches are pruned away, so that the gray branches are not evaluated. ## Null window

The concept of a “null window” is also known as a “zero”, “scout” or “narrow” window. This is a way to reduce the search space by performing a series of boolean tests, thereby searching for a move that produces a worse or better score. With integers a null-window is defined as alpha == beta – 1. Note that a null window search does not return an exact score, but either an upper bound with score < beta (score <= alpha) or a lower bound with score >= beta (score > alpha). It is kind of hard to explain and I believe that if you want to truly understand null window you should read the code.

## Iterative deepening

Iterative deepening means that we limit the search time to some fixed amount and then we evaluate the nodes in the search tree. We start evaluating with a horizon or look ahead of one move, then we look ahead for two moves, then three, etc. Whenever we hit the time limit we continue the search with the current look ahead value, but stop afterwards. This makes the AI “thinking” time more or less constant and the look ahead variable, but dependent on the (remaining) size of the tree. This way we avoid that the AI is very slow for the first moves (when the search tree is still large) and very fast at the end (when there are not many possible moves left).

## Transposition table

A transposition table stores the outcome (upper and lower bound scores) of the evaluation of a board state in a hash table. This allows to reduce the search space if a single board state can be reached with different sequences of moves. The transposition table also stores the mirrored board state, since these are supposed to be equally scored. This may greatly reduce the size of the search tree in the early stage of the game.

## Connect Four Python source ```          # coding=UTF8
from Tkinter import Tk, Button, Frame, Canvas
from tkFont import Font
from copy import deepcopy
from time import time

class Board:

nodes = {}

def __init__(self,other=None):
self.player = 'X'
self.opponent = 'O'
self.empty = '.'
self.width = 7
self.height = 6
self.fields = {}
for y in range(self.height):
for x in range(self.width):
self.fields[x,y] = self.empty
# copy constructor
if other:
self.__dict__ = deepcopy(other.__dict__)

def move(self,x):
board = Board(self)
for y in range(board.height):
if board.fields[x,y] == board.empty:
board.fields[x,y] = board.player
break
board.player,board.opponent = board.opponent,board.player
return board

def __heuristic(self):
return self.__heuristic_score(self.player)-self.__heuristic_score(self.opponent)

def __heuristic_score(self, player):
lines = self.__winlines(player)
winpositions = self.__winpositions(lines,player)
score = 0
for x in range(self.width):
for y in range(self.height-1,0,-1):
win = winpositions.get("{0},{1}".format(x,y),False)
below = winpositions.get("{0},{1}".format(x,y-1),False)
if win and below:
score+=self.height-y*100
for line in lines:
pieces = 0
height = []
for x,y in line:
if self.fields[x,y] == player:
pieces = pieces + 1
elif self.fields[x,y] == self.empty:
height.append(y)
heightscore = self.height - int(sum(height) / float(len(height)))
score=score+pieces*heightscore
return score

def __winpositions(self, lines, player):
lines = self.__winlines(player)
winpositions = {}
for line in lines:
pieces = 0
empty = None
for x,y in line:
if self.fields[x,y] == player:
pieces = pieces + 1
elif self.fields[x,y] == self.empty:
if not empty == None:
break
empty = (x,y)
if pieces==3:
winpositions["{0},{1}".format(x,y)]=True
return winpositions

def __winlines(self, player):
lines = []
# horizontal
for y in range(self.height):
winning = []
for x in range(self.width):
if self.fields[x,y] == player or self.fields[x,y] == self.empty:
winning.append((x,y))
if len(winning) >= 4:
lines.append(winning[-4:])
else:
winning = []
# vertical
for x in range(self.width):
winning = []
for y in range(self.height):
if self.fields[x,y] == player or self.fields[x,y] == self.empty:
winning.append((x,y))
if len(winning) >= 4:
lines.append(winning[-4:])
else:
winning = []
# diagonal
winning = []
for cx in range(self.width-1):
sx,sy = max(cx-2,0),abs(min(cx-2,0))
winning = []
for cy in range(self.height):
x,y = sx+cy,sy+cy
if x<0 or y<0 or x>=self.width or y>=self.height:
continue
if self.fields[x,y] == player or self.fields[x,y] == self.empty:
winning.append((x,y))
if len(winning) >= 4:
lines.append(winning[-4:])
else:
winning = []
# other diagonal
winning = []
for cx in range(self.width-1):
sx,sy = self.width-1-max(cx-2,0),abs(min(cx-2,0))
winning = []
for cy in range(self.height):
x,y = sx-cy,sy+cy
if x<0 or y<0 or x>=self.width or y>=self.height:
continue
if self.fields[x,y] == player or self.fields[x,y] == self.empty:
winning.append((x,y))
if len(winning) >= 4:
lines.append(winning[-4:])
else:
winning = []
# return
return lines

def __iterative_deepening(self,think):
g = (3,None)
start = time()
for d in range(1,10):
g = self.__mtdf(g, d)
if time()-start>think:
break
return g;

def __mtdf(self, g, d):
upperBound = +1000
lowerBound = -1000
best = g
while lowerBound < upperBound:
if g == lowerBound:
beta = g+1
else:
beta = g
g = self.__minimax(True, d, beta-1, beta)
if g!=None:
best = g
if g < beta:
upperBound = g
else:
lowerBound = g
return best

def __minimax(self, player, depth, alpha, beta):
lower = Board.nodes.get(str(self)+str(depth)+'lower',None)
upper = Board.nodes.get(str(self)+str(depth)+'upper',None)
if lower != None:
if lower >= beta:
return (lower,None)
alpha = max(alpha,lower)
if upper != None:
if upper <= alpha:
return (upper,None)
beta = max(beta,upper)
if self.won():
if player:
return (-999,None)
else:
return (+999,None)
elif self.tied():
return (0,None)
elif depth==0:
return (self.__heuristic(),None)
elif player:
best = (alpha,None)
for x in range(self.width):
if self.fields[x,self.height-1]==self.empty:
value = self.move(x).__minimax(not player,depth-1,best,beta)
if value>best:
best = value,x
if value>beta:
break
else:
best = (beta,None)
for x in range(self.width):
if self.fields[x,self.height-1]==self.empty:
value = self.move(x).__minimax(not player,depth-1,alpha,best)
if value<best:
best = value,x
if alpha>value:
break
if best <= alpha:
Board.nodes[str(self)+str(depth)+"upper"] = best
Board.nodes[self.__mirror()+str(depth)+"upper"] = best
elif best >= beta:
Board.nodes[str(self)+str(depth)+"lower"] = best
Board.nodes[self.__mirror()+str(depth)+"lower"] = best
return best

def best(self):
return self.__iterative_deepening(2)

def tied(self):
for (x,y) in self.fields:
if self.fields[x,y]==self.empty:
return False
return True

def won(self):
# horizontal
for y in range(self.height):
winning = []
for x in range(self.width):
if self.fields[x,y] == self.opponent:
winning.append((x,y))
if len(winning) == 4:
return winning
else:
winning = []
# vertical
for x in range(self.width):
winning = []
for y in range(self.height):
if self.fields[x,y] == self.opponent:
winning.append((x,y))
if len(winning) == 4:
return winning
else:
winning = []
# diagonal
winning = []
for cx in range(self.width-1):
sx,sy = max(cx-2,0),abs(min(cx-2,0))
winning = []
for cy in range(self.height):
x,y = sx+cy,sy+cy
if x<0 or y<0 or x>=self.width or y>=self.height:
continue
if self.fields[x,y] == self.opponent:
winning.append((x,y))
if len(winning) == 4:
return winning
else:
winning = []
# other diagonal
winning = []
for cx in range(self.width-1):
sx,sy = self.width-1-max(cx-2,0),abs(min(cx-2,0))
winning = []
for cy in range(self.height):
x,y = sx-cy,sy+cy
if x<0 or y<0 or x>=self.width or y>=self.height:
continue
if self.fields[x,y] == self.opponent:
winning.append((x,y))
if len(winning) == 4:
return winning
else:
winning = []
# default
return None

def __mirror(self):
string = ''
for y in range(self.height):
for x in range(self.width):
string+=' '+self.fields[self.width-1-x,self.height-1-y]
string+="\n"
return string

def __str__(self):
string = ''
for y in range(self.height):
for x in range(self.width):
string+=' '+self.fields[x,self.height-1-y]
string+="\n"
return string

class GUI:

def __init__(self):
self.app = Tk()
self.app.title('Connect4')
self.app.resizable(width=False, height=False)
self.board = Board()
self.buttons = {}
self.frame = Frame(self.app, borderwidth=1, relief="raised")
self.tiles = {}
for x in range(self.board.width):
handler = lambda x=x: self.move(x)
button = Button(self.app, command=handler, font=Font(family="Helvetica", size=14), text=x+1)
button.grid(row=0, column=x, sticky="WE")
self.buttons[x] = button
self.frame.grid(row=1, column=0, columnspan=self.board.width)
for x,y in self.board.fields:
tile = Canvas(self.frame, width=60, height=50, bg="navy", highlightthickness=0)
tile.grid(row=self.board.height-1-y, column=x)
self.tiles[x,y] = tile
handler = lambda: self.reset()
self.restart = Button(self.app, command=handler, text='reset')
self.restart.grid(row=2, column=0, columnspan=self.board.width, sticky="WE")
self.update()

def reset(self):
self.board = Board()
self.update()

def move(self,x):
self.app.config(cursor="watch")
self.app.update()
self.board = self.board.move(x)
self.update()
move = self.board.best()
if move!=None:
self.board = self.board.move(move)
self.update()
self.app.config(cursor="")

def update(self):
for (x,y) in self.board.fields:
text = self.board.fields[x,y]
if (text=='.'):
self.tiles[x,y].create_oval(10, 5, 50, 45, fill="black", outline="blue", width=1)
if (text=='X'):
self.tiles[x,y].create_oval(10, 5, 50, 45, fill="yellow", outline="blue", width=1)
if (text=='O'):
self.tiles[x,y].create_oval(10, 5, 50, 45, fill="red", outline="blue", width=1)
for x in range(self.board.width):
if self.board.fields[x,self.board.height-1]==self.board.empty:
self.buttons[x]['state'] = 'normal'
else:
self.buttons[x]['state'] = 'disabled'
winning = self.board.won()
if winning:
for x,y in winning:
self.tiles[x,y].create_oval(25, 20, 35, 30, fill="black")
for x in range(self.board.width):
self.buttons[x]['state'] = 'disabled'

def mainloop(self):
self.app.mainloop()

if __name__ == '__main__':
GUI().mainloop()
```

http://en.wikipedia.org/wiki/MTD-f

http://chessprogramming.wikispaces.com/MTD%28f%29

http://people.csail.mit.edu/plaat/mtdf.html

Share

## Python TicTacToe with Tk and minimax AI Whenever I’m learning a new language I always implement some little games. My favorite 3 games to are:

• TicTacToe
• Connect4
• Minesweeper

These are easy to implement since they are easy point-and-click games that do not require complex graphics or animation. They will require some multi-dimensional arrays fiddling and basic logic to implement. The GUI needed is pretty straight-forward and can be build using a window and a grid of buttons. Any window toolkit will provide these. If you like the idea of programming a game to learn a language, but you don’t like these specific games, check out “49 ideas for game clones to code“. Note that learning Python by programming small computer games is also suggested on “inventwithpython.com” (free e-book).

If you are not attracted to game programming you might want to check out “Python koans“.

Koan: a paradoxical anecdote or riddle without a solution, used in Zen Buddhism to demonstrate the inadequacy of logical reasoning and provoke enlightenment.

The Python Koans is a set of exercises that will make you learn Python. It is inspired by the “Ruby Koans“, which are also very good. You will run the tests and see them fail (red) and by changing the code you can make the tests pass (green). This is just as the Test Driven Development (TDD) method prescribes. So the Koans make you learn programming and TDD at the same time. Awesome!

### TicTacToe in Python with “minimax” AI

This is my code for TicTacToe. It uses Tk since this is included in the Windows and MacOSX distributions of Python. In Linux “python-tk” is easy to install using the package manager. The game implements the “minimax” algorithm for Artificial Intelligence (AI). It has no heuristic (depth parameter and horizon effect) so it plays at infinite strength. With this game that is possible, because of it limited set of possible states.

```          # coding=UTF8
from Tkinter import Tk, Button
from tkFont import Font
from copy import deepcopy

class Board:

def __init__(self,other=None):
self.player = 'X'
self.opponent = 'O'
self.empty = '.'
self.size = 3
self.fields = {}
for y in range(self.size):
for x in range(self.size):
self.fields[x,y] = self.empty
# copy constructor
if other:
self.__dict__ = deepcopy(other.__dict__)

def move(self,x,y):
board = Board(self)
board.fields[x,y] = board.player
(board.player,board.opponent) = (board.opponent,board.player)
return board

def __minimax(self, player):
if self.won():
if player:
return (-1,None)
else:
return (+1,None)
elif self.tied():
return (0,None)
elif player:
best = (-2,None)
for x,y in self.fields:
if self.fields[x,y]==self.empty:
value = self.move(x,y).__minimax(not player)
if value>best:
best = (value,(x,y))
return best
else:
best = (+2,None)
for x,y in self.fields:
if self.fields[x,y]==self.empty:
value = self.move(x,y).__minimax(not player)
if value<best:
best = (value,(x,y))
return best

def best(self):
return self.__minimax(True)

def tied(self):
for (x,y) in self.fields:
if self.fields[x,y]==self.empty:
return False
return True

def won(self):
# horizontal
for y in range(self.size):
winning = []
for x in range(self.size):
if self.fields[x,y] == self.opponent:
winning.append((x,y))
if len(winning) == self.size:
return winning
# vertical
for x in range(self.size):
winning = []
for y in range(self.size):
if self.fields[x,y] == self.opponent:
winning.append((x,y))
if len(winning) == self.size:
return winning
# diagonal
winning = []
for y in range(self.size):
x = y
if self.fields[x,y] == self.opponent:
winning.append((x,y))
if len(winning) == self.size:
return winning
# other diagonal
winning = []
for y in range(self.size):
x = self.size-1-y
if self.fields[x,y] == self.opponent:
winning.append((x,y))
if len(winning) == self.size:
return winning
# default
return None

def __str__(self):
string = ''
for y in range(self.size):
for x in range(self.size):
string+=self.fields[x,y]
string+="\n"
return string

class GUI:

def __init__(self):
self.app = Tk()
self.app.title('TicTacToe')
self.app.resizable(width=False, height=False)
self.board = Board()
self.font = Font(family="Helvetica", size=32)
self.buttons = {}
for x,y in self.board.fields:
handler = lambda x=x,y=y: self.move(x,y)
button = Button(self.app, command=handler, font=self.font, width=2, height=1)
button.grid(row=y, column=x)
self.buttons[x,y] = button
handler = lambda: self.reset()
button = Button(self.app, text='reset', command=handler)
button.grid(row=self.board.size+1, column=0, columnspan=self.board.size, sticky="WE")
self.update()

def reset(self):
self.board = Board()
self.update()

def move(self,x,y):
self.app.config(cursor="watch")
self.app.update()
self.board = self.board.move(x,y)
self.update()
move = self.board.best()
if move:
self.board = self.board.move(*move)
self.update()
self.app.config(cursor="")

def update(self):
for (x,y) in self.board.fields:
text = self.board.fields[x,y]
self.buttons[x,y]['text'] = text
self.buttons[x,y]['disabledforeground'] = 'black'
if text==self.board.empty:
self.buttons[x,y]['state'] = 'normal'
else:
self.buttons[x,y]['state'] = 'disabled'
winning = self.board.won()
if winning:
for x,y in winning:
self.buttons[x,y]['disabledforeground'] = 'red'
for x,y in self.buttons:
self.buttons[x,y]['state'] = 'disabled'
for (x,y) in self.board.fields:
self.buttons[x,y].update()

def mainloop(self):
self.app.mainloop()

if __name__ == '__main__':
GUI().mainloop()
```

Copy the above code into a file and save that as “tictactoe.py”. Then execute it using “python tictactoe.py”. Enjoy!

### Python3 compatibility and dependencies

The code is fully Python3 compatible, you only need to change the names of the external modules, so that:

```from Tkinter import Tk, Button
from tkFont import Font
```

Becomes:

```from tkinter import Tk, Button
from tkinter.font import Font
```

The packages you need to install on Ubuntu Linux for Python3 may be called “python3” and “python3-tk”. In order to run the code in Python3 you need to use the “python3” command instead of the normal “python” one.

Share