Bachelor Thesis A. I. in board games Aron Lindberg, s042422 Technical University of Denmark Informatics and Mathematical Modelling Thesis tutored by Thomas Bolander August 19, 2007 i Abstract In this thesis, a board game named Kolibrat is implemented in the Objective-C programming language and Apple Computers Cocoa API. Besides from documenting this work, the thesis focus on the development of arti? cial players. ii Resum? e I denne afhandling bliver et br? tspil Kolibrat implementeret i programmerings sproget Objective-C og Apple Computers Cocoa API. Udover at dokumenterer dette arbejde fokuserer denne afhandling p? rbejdet med a udviklingen af kunstigt intelligente spillere. iii Contents 1 Preface 1. 1
Chapter 1 Preface The ? rst thing to do when one is about to write a thesis about developing arti? cial intelligence for a board game, it to choose the actual board game. The choice is not critical, but the game should have certain properties. The properties that I have looked for in the games, that I have considered to use in this thesis, is basically two things. First the game must have no clear strategy for winning, meaning that there should be no set of obvious moves that ensure one of the players victory no matter what the other player does. The second property is that I want a game with simple rules and is easy to learn.
Still the game must be di? cult to predict and master. Hopefully the game also satisfy other properties like being entertaining, but these properties is less important than the ? rst two. After discussing the choice of possible games, my tutor Thommas Bolander and I have decided to use the rather unknown board game Kolibart as it seems to satisfy all the properties given above. In addition it can be played on multiple board sizes. With the game chosen the only thing left to decide is the means of implementation. The most logic choice here would be Java, as it has been the programming language of choice in most classes.
But since I believe in variety, and that learning something new is always good, I have made the choice to implement Kolibrat in the less known language Objective-C. I choose this language because it seems to be a powerful and structured language. The language is also the language of choice for programming 1 2 CHAPTER 1. PREFACE on an Apple Macintosh as Apple supplies a powerful IDE and API that uses Objective-C as its foundation. 1. 1 Preconditions In order to get most out of this thesis the reader is expected to be familiar with the rules of Kolibrat, if not they can be found on page 50 in appendix A.
Understanding Objective-C would be a great help in order to understand the source code, but since Objective-C is a superset of ANSI C most of the source code is readable to people with a good understanding of C. Likewise the reader is also expected to know the UML notation. Objective-C is however a unique programming language, and it uses some terms that is unique to the language. To avoid misunderstandings words that have a special meaning in Objective-C or could be misleading if shortly explained below. Protocol Is the same to Objective-C as interfaces is to Java. It speci? es a list of methods that the class must implement.
Delegate Certain Objective-C classes have delegate methods. If a delegate for some object has been set, all method calls to its delegate methods are forwarded to the delegate object, but only if the delegate object implements a method by that name. Noti? cation server Cocoa implements a noti? cation system that allows all objects to send and receive event messages, even if they have no idea about who the sender or receiver might be. This is used for program wide signaling of events. Sheet Is a special window that is attached to another window, and glides in above that window.
This is often used to display save dialogues, warnings, or simple options, as it locks the window below from the user. NIB ? les A ? le format used to store the GUI interface in Cocoa. It consist of an XML list and interface objects. (NeXT Interface Builder) 1. 2. AIMS AND LIMITATIONS 3 1. 2 Aims and limitations Since time is always a limit, not everything of interest can be tried out and tested to its full extend. Therefor some topics, interesting though they are, will not be touched in this thesis. To limit the thesis a set of objectives and limitations has been made, they are speci? ed below.
Implementing a working game Implementing the game has the ? rst priority, as a working game is necessary in order to run and test arti? cial players. The game itself should be simple, but look and feel elegant, while giving the user a way to set the options he needs. The game should also be constructed with ? exibility in mind, meaning that it should not rely on constants, but variables that can change at runtime. This ensures that almost all aspects of the game can be easily altered either by the user or the programmer. Implementing AI Implementation of the arti? cial intelligence in Kolibrat is the main area of focus in this thesis.
As with the implementation of Kolibrat I want the arti? cial players to be ? exible. Hopefully the game can be implemented in a way that allows it to load di? erent AI’s as plug-ins at runtime. This will allow the user to add or remove AI’s as he pleases. Writing the arti? cial players as plug-ins also has the advantage, that other programmers can develop their own AI’s without having the source code for Kolibrat. The plan is to spend as much time on the development of arti? cial intelligence as possible. As a minimum requirement, at least one AI using the MiniMax algorithm, must be implemented and optimized as much as possible. . 2. 1 Limitations On the other hand technologies like Sound and 3D Graphics will not be touched in this thesis. Even though they are interesting add-ons, they are not essential for the game experience. In a small game like Kolibrat, one could even argue that this is an advantage since the game can be played while the user performs other activities. 4 CHAPTER 1. PREFACE Adding Network game-play would be interesting, but is not essential and therefor not a topic that will be touched further, in this thesis. The same goes for the ability to undo moves and saving or loading games. 1. 3 Structure of thesis
The main content of this thesis is structured into two chapters. Chapter 2 deals with the development, design and implementation of Kolibrat. Afterwards chapter 3 deals with a mathematic analyze of the game Kolibrat, along with the development and implementation choices made while developing the arti? cial players. Section 2. 1 goes through the details of the basic design choices made while developing Kolibrat. Section 2. 2 deals with the development of the games graphical user interface. Section 2. 3 deals with the more detailed design of Kolibrats internal objects and goes into details about the implementation of Kolibrat.
Finally section 2. 4 concludes the ? rst chapter by going into details about the White-Box testing done on Kolibrat. Chapter 3 starts o? with a longer mathematical analyze of Kolibart in section 3. 1. Section 3. 2 gives an introduction to the ? eld of arti? cial intelligence in two player games, it also gives a brief overview of the available algorithms used in these situations. Section 3. 3 deals with the implementation of the MiniMax algorithm, and is followed by section 3. 4 that deals with optimizing the heuristic function used by the MiniMax algorithm. Finally section 3. 6 compares the di? rent arti? cial players that has been developed for Kolibrat. The thesis is ? nished with a chapter of conclusions, giving a summary of the achievements in the thesis. Appendix A contains the Kolibrat rule-book and appendix B gives details on the the White Box tests performed on Kolibrat. The following appendixes lists the source code for the entire Kolibrat game, and its arti? cial player. Chapter 2 Game Development This chapter deals with all aspects of the development of Kolibrat. First a conceptual design of Kolibrat is devised. After that there is a section on user interface development.
These sections are followed by a section on game design and then by one detailing the game implementation. 2. 1 Concept design From the very beginning Kolibrat was developed with ? exibility in mind, thus making it easy to extend and change later on, it also follows the model, view, control design pattern. This allow the game to run without a GUI, or to run with di? erent GUI’s without the need to change anything in Kolibrats data structure. The initial concept of the game design is shown on ? gure 2. 1. View GUI Controll Game Controller Game Engine Players Model Figure 2. 1: Concept Design 6 CHAPTER 2. GAME DEVELOPMENT 2. 1. 1 Graphical user interface The GUI must allow the player to make certain choices at the beginning of a game, the most important being the board size, the players type and player names. In a game the GUI must show the score and the game board, while at the same time give the user a chance to quit or restart the game. Another important part of the GUI is to redirect all mouse clicks to other parts of the program, like the player objects. 2. 1. 2 Game Controller The game controller must be able to control the game window and the new game options.
In addition to that it should send the options chosen in the beginning of a game to the Game Engine. It must also tell the GUI to display game over information when one of the players wins. 2. 1. 3 Game Engine The Game Engine will handle the game logic, like the game rules. It must also send game information to the players, giving them a change to move, and send these moves on to the GUI. 2. 1. 4 Players The player classes must respond to move requests made by the Game Engine and return moves to the Game Engine. In order to allow di? erent types of player classes to work with the Game Engine, the game engine should be able to ommunicate with player objects that does not necessary inherit from or subclass each other. 2. 2 User Interface The user interface is in many ways the most important part of a program since it is the link between the user and the program. The goal with the development of Kolibrats interface has been to create a simple, complete and elegant GUI. One that gives the user the options he needs and nothing else. The best way to make the GUI simple, would be to make the game a single windowed application, that shows the game board and use a sheet to change game settings.
This approach has a number of advantages 2. 2. USER INTERFACE 7 compared to having more than one window. First of all window control becomes simpler and the ? nal design takes up less space. Besides from that is makes sense to use a sheet because it ensures that no information on game settings is shown when it is not needed, and the sheet locks the game window ensuring that the player can not move pieces around while setting up a new game. 2. 2. 1 Flow control In order to ensure a logical ? ow of events through the game a ? ow diagram has been made, and can be seen on ? gure 2. 2.
The only option available to the user at launch is the new game option, which sends him on to the dialogue for choosing game settings for a new game. When the user is done, the game will start with a red as the ? rst player to move. The states now alternate between the states where either black or red has the turn until the game is over. When the game is over the user will be able to start a new game, restart the game or quit, but this is not shown in ? gure 2. 2 in order to simplify it. As seen in ? gure 2. 2 the user also have the choice to begin a new game, restart or quit, at any time he wants. New Game Dialog Choose settings
Red player to move Restart Game Over Black player to move Quit Figure 2. 2: UML Flow Control Diagram. 8 CHAPTER 2. GAME DEVELOPMENT 2. 2. 2 GUI implementation The user interface has been made in the program Interface Builder and is stored in an NIB ? le. Thus no actual code has been written to produce the GUI. The GUI is linked to the source code by using identical variable and class names in both the GUI and the source code. The GUI itself is designed to follow Apples Human Interface Guidelines, to produce a game that looks familiar to a mac user. A picture of the user interface is shown in fugue 2. 3.
Figure 2. 3: Kolibrat settings and game board. The new game window allow the player to choose the size of the game board, the maximum amount of pieces each player can have on the board and the number of goals a player must gain to win. In this layout the users can only choose from predetermined board sizes. This is not a technical limit, the game can at runtime begin a game on any board size, but in order to save the user from choosing stupid board sizes like 1×1 or 50×99 the choice of boards has been limited to some predetermined board sizes. The game window can also be resized by the user. The maximum size . 3. GAME IMPLEMENTATION 9 of the window is when each square on the game board has a dimension of 128×128 pixels. From that the window can be scaled down until each square has a dimension of 64×64 pixels. The algorithms that handles the resizing, takes care to ensure that the window keeps its proportions while resizing. An example of this can be seen on ? gure 2. 4. Figure 2. 4: Big and small game window. 2. 3 Game Implementation In this section, the design from ? gure 2. 1 is re? ned into a more exact model representing actual objects in the ? nal game. The result is shown in diagram 1. The most notable di? rences in the new layout is that the control prat of the game has been spilt into two controller objects. NewGameController handles the window used to start new games. The GameController is used to control the main game window. The GameController also controls a customised NSView that draws the actual game board in the GameWindow. Another change is the addition of the GameLogic object. This functionality has been moved from GameEngine into its own object. This will allow 10 CHAPTER 2. GAME DEVELOPMENT NewGameOptionsWindow – The Window containing options for new games. 1 NSView GameBoard – Draws the game board in the GameWindow.
GameWindow – The main window, shows the game board and the score. 1 1 NewGameController – Controls the NewGameOptions window. – Loads all player objects. – Responds to user actions 1 from the menu (New Game and Restart Game). – Connects the players, GameEngine and GameController when a new game is started. – Delegate of MainMenu. – Receives mouse clicks on the game board 1 GameController 1 – Updates the GameBoard and the GameWindow. – Delegate of NSApp and GameWindow. 0.. 1 1 1 GameEngine Player Objects GameLogic – Handles all rules in the game. – Can ? nd legal moves on the game board. 1 1 Receives moves from players, and validates that they came from the player with the turn. – Sends update info to the GameController. – Tells the players when it is their turn, and gives them methods to see the game board. 1 2 – Can be many different types of objects. – Sends moves to the GameEngine. – Might use the GameLogic to ? nd legal moves. Diagram 1: Concept UML Diagram other objects like arti? cial players to use the same object, for ? nding moves and move around on the game board, as the GameEngine does. A little less noticeable is it that GameBoard is in charge of all mouse clicks that is received on the game board.
This has the advantage that since GameBoard draws the game board, it can easily transform mouse coordinates into game board coordinates. The game board coordinates can then be passed on the the rest of Kolibrat by using the Noti? cation system. The games menu bar is controlled by the NewGameController. This is the only logic choice since the menu bars main features are restarting or starting a new game. It is also the class responsible for loading the plug-in player objects. Also a player object is now de? ned as a object 2. 3. GAME IMPLEMENTATION 11 that implements and responds to a PlayerProtocol.
The exact interface in this protocol can be seen on page 104. A GUIProtocol has also been de? ned and speci? es the methods a GUI must implement, if one wants to make a new GUI. This could be a terminal interface, or a GUI for the web. 2. 3. 1 Class details This section goes into details about the individual classes in Kolibrat, and explain the workings of some selected methods. The individual methods are not discussed sine most methods is self-explanatory and the source code is well documented and have a small description of almost all methods in the game.
GameLogic The GameLogic class is the object that implements all the rules in Kolibrat. When initialized the object receives information on the maximal number of pieces that players can have on the board, the amount of goals needed to win and the board size. With this information the class can return legal moves and make moves on a GameState. Some of the methods have been implemented in C to improve performance, since that became an issue with the development of arti? cial players. In ? gure 2. 5 the methods whose return type is not enclosed in brackets is implemented in C. GameLogic int maxGoals – int maxPicesOnBoard – BoardSize boardSize + + + + + + + + + + + + + + + (void)changePlayerOn:(GameState *)gs (id)initWithMaxPices:(int)max goalsToWin:(int)goals boardSize:(BoardSize)board (GameState)CreateNewGameState (void)resetGameState:(GameState *)gs (BOOL)currentPlayerCanInsertPieceOnState:(GameState *)gs (NSSet *)legalMovesForPiceInField:(BoardField)field withState:(GameState *)gs (NSSet *)allLegalMoves:(GameState *)gs (BOOL)makeMove:(BoardMove)playerMove withState:(GameState *)gs (BOOL)playerMovingCanInsertPieceOnState:(GameState *)gs SimpleList allLegalMoves(GameState *gs) void legalMovesForPiceInField(BoardField *field, GameState *gs, SimpleList *superList, SimpleList *goodList, SimpleList *badList) BOOL makeMoveOnState(BoardMove *playerMove, GameState *gs) void freeGameState( GameState *state ) GameState copyGameState( GameState *state ) BOOL blackPlayerAdheadOf(int x, int y, GameState *gs) BOOL redPlayerAdheadOf(int x, int y, GameState *gs) Figure 2. : The GameLogic class 12 GameEngine CHAPTER 2. GAME DEVELOPMENT The GameEngine class stores the games state. Is parses game information on to the GUI, if one is connected. It also handles all communication between the players and the rest of Kolibrat. In order to avoid having to maintain two set of game rules the engine uses GameLogic to validate the players moves, by searching for the players move in the set of legal moves returned by GameLogic. The class also ensure that there is at least 0. 5 second between two moves to make games between two fast arti? cial players watchable. The methods in GameEngine can be seen on ? gure 2. 6. GameEngine
GameLogic *gl id redPlayer id blackPlayer id engineGUI int maxGoals BOOL connectedToGUI BoardSize boardSize GameState realGameState GameState* gameStatePointer NSNotificationQueue *queue BOOL delayNexPlayer; BOOL doCallNextPlayerWhenResume; NSDate *timeToNextTurn; NSDate * timeToStoptheGame; – (void)dealloc + (id)initWithPlayersRed:(id)red andBlack:(id)black goalsToWin:(int)goals GameBoardDim:(BoardSize)board MaxPices:(int)max connectToGUI:(id)gui + (void)nextPlayer:(NSNotification*)notification + (void)resetGame + (BOOL)playerMove:(BoardMove) playerMove fromPlayer:(id)player + (GameState)gameState + (void)SelectedPiece:(BoardField)bf fromPlayer:(id)player + (void)delayNextPlayer:(BOOL)response Figure 2. 6: The GameEngine class The method SelectedPiece: can be called by the players on pieces they want highlighted in the GUI.
And the delayNextPlayer: method is called by NewGameController to stop a game between two arti? cial players when the user brings up the sheet with NewGameOptions. Human Player This class implements the human player. It works by receiving mouse click noti? cations from the GameBoard. When appropriate the human player 2. 3. GAME IMPLEMENTATION 13 object returns possible moves to the GameEngine. Besides form this, the object only implement the methods required by the PlayerProtocol. Player Protocol Human Player GameEngine *engine int playerID BOOL waitingForOtherPlayer BoardField firstClick BoardField secondClick NSString *name Can be other classes (void)dealloc + (id)initAsPlayer:(int)player withName:(NSString *)playerName boardSize:(BoardSize)bs picesOnboard:(int)maxPices + (void)mouseClickNotification:(NSNotification *)notification + (void)setGameEngine:(id)ge + (void)startNewTurn + (void)reset + (NSString *)playerName + (NSString *)playerType Figure 2. 7: The HumanPlayer class GameController This class is the link between the interface and the data model. It receives information from the GameEngine and sends the appropriate information on to the GameWindow. This class is initialized by the awakeFromNib: method that is called by the GameWindow when its NIB ? le is loaded at launch. The classes variables and methods can be seen on ? gure 2. 8. At compile time IBOutlet is converted to a pointer, and IBAction to void. IBOutlet and IBAction us only used to inform the programmer and the compiler that this is a variable or method that is linked to the GUI through a NIB ? le.
The GameController is a delegate of NSApplication and implements the delegate method applicationShouldTerminate… : that terminates the application when the last window is closed. The gameOverWithWinner: method is called by the GameEngine when a game is over, and when the user responds to the game over dialog the gameDidend: method is called to cope with the users response. 14 CHAPTER 2. GAME DEVELOPMENT GUI Protocol NSWindowController GameWindow (NIB File) GameController + + + + + + + + + + + IBOutlet NSTextField *blackScore IBOutlet NSTextField *redScore IBOutlet GameBoard *gb IBOutlet NSWindow *OptionsWindow IBOutlet NSMenuItem *restartMenu GameEngine *ge BOOL canRestartGame BOOL doHighlighting BoardSize boardSize float boardFieldDim NSSize)windowWillResize:(NSWindow *)sender toSize:(NSSize)proposedFrameSize (BOOL)applicationShouldTerminateAfterLastWindowClosed:(NSApplication *)app (void)awakeFromNib (void)gameOverWithWinner:(NSString *)playerName (void)setScore:(GameScore)score (void)highlightField:(BoardField)bf (void)redrawOriginalState (void)updateToState:(GameState)bs (void)highlightPiceAt:(BoardField)bf (void)drawOpaquePiceAt:(BoardField)bf forPlayer:(int)player (void)gameDidEnd:(NSWindow *)sheet returnCode:(int)returnCode contextInfo:(void *)contextInfo + (void)setGameEngine:(GameEngine*)engine + (void)setHighlightState:(BOOL)highlight + (void)setBoardSize:(BoardSize)board Figure 2. 8: The GameController class GameBoard The GameBoard class handles drawing the game board in the main window. It to is loaded by GameWindow at launch.
Most of its methods is related to drawing the board and its pieces. The mouseDown: method is the method that receives mouse clicks, and transforms them into board coordinates. All actual drawing is done in the drawRect: method, as it is automatically called when the system want to redraw the window. The GameWindow uses some quite advanced calculations to resize the game window. The setSquareDim: and setDisplayOffset: is part of this and ensures that the game board have the right size and is placed at the correct distance from the edge of the window. 2. 3. GAME IMPLEMENTATION 15 NSView GameBoard + + + + + + + + + + + float squareDim int **HighlightArray int **PicesArray BoardSize boardSize float offset void)awakeFromNib (NSRect)RectForField:(BoardField)bf (void)setColorForField:(BoardField)bf (id)initWithFrame:(NSRect)frameRect (void)drawRect:(NSRect)rect (void)mouseDown:(NSEvent*)event (void)highlightField:(BoardField)bf (void)redrawOriginalState (void)drawPicesFromBoard:(GameState)gs (void)highlightPiceAt:(BoardField)bf (void)drawOpaquePiceAt:(BoardField)bf forPlayer:(int)player + (void)setBoardSize:(BoardSize)board + (void)setSquareDim:(float)dim + (void)setDisplayOffset:(float)distance Figure 2. 9: The GameBoard class NewGameController The NewGameController class handles all aspects of new game creation. At launch it checks the games plug-in folder for additional players and loads these into an array of players. When the user choose to start a new game the NewGameController displays the NewGameOptionsWindow above the game window as a sheet. In the NewGameOptionsWindow the user can choose the game settings, along with other options.
The class can be seen on ? gure 2. 10. The validateMenuItem: is used to enable or disable items in the menu bar. This method is used to disable the Restart Game menu item until a game has been started. The method is called on the NewGameController because it is the delegate of the NSMenu class. The methods newGame: and restartGame: is the classes that is executed when the user chooses these options in the menu bar. The defaultsButton:, cancelButton: and startGameButton: is the methods that is executed when the user pushes these options in the NewGameOptionsWindow. 16 CHAPTER 2. GAME DEVELOPMENT NewGameOptions Window (NIB File) NewGameController + + + + + + + IBOutlet NSTextField *blackName IBOutlet NSPopUpButton *blackType IBOutlet NSPopUpButton *boardpopUPMenu IBOutlet NSTextField *goals IBOutlet NSTextField *redName IBOutlet NSPopUpButton *redType IBOutlet GameController *gc IBOutlet NSWindow *newGameWindow IBOutlet NSWindow *gameWindow IBOutlet NSButton *highlightInGUI IBOutlet NSTextField *maxPices IBOutlet NSStepper *goalsStepper IBOutlet NSStepper *piecesStepper GameEngine *newEngine NSMutableArray *playersType NSMutableDictionary *playerIdentefiers (void)awakeFromNib (void)loadPlayers (BOOL)validateMenuItem:(NSMenuItem *)item (IBAction)defaultsButton:(id)sender (IBAction)startGameButton:(id)sender (IBAction)cancelButton:(id)sender; (IBAction)newGame:(id)sender (IBAction)restartGame:(id)sender Figure 2. 10: The NewGameController class 2. 3. 2 Kolibrat ? ow diagrams In order to better understand the ? ow the events through Kolibrat, when the game is running, some UML ? ow diagrams has been made to illustrate this. Due to their size the actual ? ow diagrams is shown in the back of the report, behind the appendix. Diagram 2 is shown on page 163 and displays the program ? ow at game launch. Diagram 3 is shown on page 164 and details the ? ow while playing through one turn. Diagram 4 is shown on page 165 and shows the ? w of events when one of the players win a game. 2. 3. 3 C data structures This section describes the data structures used in Kolibrat. It gives a super? cial explanation on how the data structures is implemented and focus on what the structures is used for. To see the actual implementation 2. 3. GAME IMPLEMENTATION 17 and how the structures are made from primitive C variables see the implementation on page 64. GameStatus This data structure is used to store the game status. That is whether or not the game is over, and in that case who the winner is. This data structure is mostly used in the larger data structure GameState that stores all data on a state in the game.
BoardField This structure is used to parse coordinates for pieces around in the game. It is constructed of two short integers that respectively gives the x and y coordinate of the piece in question. BoardMove This structure is made up of two BoardField structures, and denotes the ? eld a piece moves from and the ? eld it moves to. This structure is used widely in the game, GameLogic uses it to store legal moves and the player objects uses this structure to pass the moves they want to make on to the GameEngine. GameScore This structure stores the score for both players, in two short integer variables. This structure is mostly used in the larger gamestate structure.
BoardSize This structure stores the size of the game board. All objects that works with pieces on the game board needs to know the size of the game board in order to avoid array out of bounce errors, they uses this structure to store that knowledge. BoardFieldContent This is a small structure de? ning two boolean variables. One to indicate that this BoardField is occupied by red and one to indicate it is occupied by black. This is used by the GameStatus in a two-dimensional array to store the location of the pieces of the board. GameState This structure is used by the GameEngine to store all information on the game, and by the AI’s when they construct a game tree.
Aside form containing a two-dimensional array of BoardFieldContent structures, a GameScore and GameStatus structure it also stores info on the player moving, and the amount of pieces each player have on the board. 18 CHAPTER 2. GAME DEVELOPMENT The GameState structure has been designed to save as much space as possible. The score is stored in an unsigned short integer for both players, as each unsigned short integer is 16 bits the total size of this is 32 bits. The placement of pieces is stored in a two-dimensional array with the size of the game board of booleans for both players, assuming the game board have 3×4 ? elds this gives 3 · 4 · 2 = 24 bits of data to store the location of both players pieces. The player who have the turn is stored in a boolean variable and only takes up one bit.
The number of pieces that both player has on the board is also stored in an unsigned short integer and thus takes up 2×16 = 32 bits, the same as the game scores. The game status is stored in two Boolean variables one to tell whether the game is over and one to tell the winner. This adds up to a total of 91 bits or a little less that 12 bytes. This value might vary, especially on 64 bit systems where some of the primitive C variables has changed size, compared to their 32 bit equals. 2. 4 Testing This section contains a description of the testing done on the kolibrat souse code. All classes has been severely black box tested, both by crash testing the compiled application, and through heavy use of the debuggers line by line execution provided by the IDE.
In addition to this the GameLogic part of the game has been put through a white box test, to ensure that it contains no errors and that the rules of the games is implemented as intended. All in all 20 test cases has been carried out testing di? erent aspects of the GameLogic. A short description of the tests is shown below. Fore a more in-depth description of the tests see appendix B for a detailed description of the tests. Or take a look at the source code of the test at in appendix C. 2. 1. Test 1 Insert pice for red in (1,0) on an empty board. Test 2 Insert pice for black in (1,3) on an empty board. Test 3 Trying to insert pice for red in (1,3) on an board with 4 red pieces. Test 4 Trying to insert pice for red in (2,2) while the board is empty. 2. 4.
TESTING Test 5 Score point for red. Test 6 Score point for black. Test 7 Try to insert piece in occupied ? eld. Test 8 Gamestatus is changed when red wins. Test 9 Ensure that no players can move when the game is over. Test 10 Ensure that no players can move outside of the board. Test 11-15 Tests of moves on a non empty board for red player. Test 16-20 Tests of moves on a non empty board for black player. 19 2. 4. 1 Test summery These tests combined tests all functions and all lines of code in GameLogic to ensure that it responds as expected in all game situations. This means that besides from testing all lines of code in GameLogic is also attempts to ? d any errors there could be in the implementation of the Kolibrat rule-book found in appendix A. During the tests two errors, that in some situations allowed both players to remove pieces from the game board, where found in test 11 and 16. The problem has been ? xed, so that GameLogic now passes all the tests. Chapter 3 Arti? cial intelligence This chapter focus on the development and implementation of arti? cial intelligent players. The ? rst section makes a longer analyses of the mathematical properties that Kolibrat possess. While the ? owing sections describe AI in general and the implementation and optimizations done on Kolibrats search techniques. 3. 1 Game Analysis To know what to expect from an arti? ial player, it is good to have an idea of what one can expect from Kolibrat AI. Therefor some mathematical studies of Kolibrat has been made prior to the AI development to determine properties like the branching factor and the size of the search space, meaning the number of the unique GameStates. These properties can tell if it is possible to solve the game completely, or how many moves an agent can be expected to search before a move must be returned. 3. 1. 1 Game state space The total number of di? erent states in a Kolibrat game important, since if this number is small enough the game can be solved in an attempt to ? nd a certain victory strategy for one of the players. To do this we ? st of all need a formula that describes the number of ways, a piece can be placed on the game board. If we de? ne that b is the number of ? elds on the game board and that 20 3. 1. GAME ANALYSIS 21 p is highest number of pieces, a player can have on the board. The ? rst piece can be placed in b di? erent places, and the second in b ? 1 ways. In other words if we have x pieces on the board they can be placed on f (x) di? erent ways. f (x) = b! (b ? x)! This formula works ? ne, but have the one ? aw since it considers all pieces to be di? erent. So if red has placed his ? rst piece in (1, 1) and his second piece in (1, 2) this board state is considered di? erent from a state where red placed his ? rst piece in (1,2) and the second in (1,1).
To solve this the result of the formula must be divided by the factorial number of pieces on the board, and this must be done for each player individually. If p1 is the number of pieces that the red player has on the board and p2 is the number of pieces that black has on the board the formula becomes (3. 1). b! p1 ! · p2 ! · (b ? x)! f (p1 , p2 ) = (3. 1) Formula (3. 1) gives the total number of di? erent ways to place pieces on the board. To get the size of the total amount of states the result from formula (3. 1) must be multiplied by two, since for each board position both players could have the turn, and all these states could each have any possible combination of scores.
If the number of goals that is required to win is s then formula (3. 1) must therefore be multiplied by (s + 1)2 ? 1. This gives the formula shown on equation (3. 2). b! · 2 · (s + 1)2 ? 1 p1 ! · p2 ! · (b ? x)! f (p1 , p2 ) = (3. 2) To get the total number of possible states the sum of all possible combinations of pieces is taken. This is done in equation (3. 3). p p p1 =0 p2 =0 b! · 2 · (s + 1)2 ? 1 p1 ! · p2 ! · (b ? x)! (3. 3) With formula (3. 3) it is now possible to calculate the total number of states that a Kolibrat game has, based on the size of the board and the 22 CHAPTER 3. ARTIFICIAL INTELLIGENCE highest number of pieces each player can insert.
In table 3. 1 calculations for the total state space can be seen for some common board sizes. The values is based on games that is won at 4 points. Breadth 2 3 3 3 4 4 5 9 9 Height 2 4 4 4 5 5 6 9 9 Max Pieces 2 4 5 6 5 6 6 15 30 Board positions 63 170019 343467 460815 123479901 509103141 1. 58 · 1011 4. 20 · 1030 2. 66 · 1038 Total States 3149 8500949 17173349 23040749 6173995049 25455157049 7. 91 · 1012 2. 10 · 1032 1. 33 · 1040 Table 3. 1: States based on board size, score and pieces on the board. As seen in the table a standard Kolibrat game with a 3×4 game board and a maximum of 4 pieces on the board for each player only has 8500949 states.
If each state takes up 12 bytes, this gives that all states will take up 102 MB of memory, plus some memory needed for bookkeeping. While this is still a considerable amount of memory it is well within the limits of a modern computer to work with. Given the time a computer could calculate and solve the complete game to ? nd the best possible strategy for winning. The memory needed for the bookkeeping is actually also quite a bit, assuming the computer system solving the game is a 64 bit computer like most modern computers today. A pointer takes up 64 bits of data or 8 bytes, if every state must have a pointer to all states accessible from itself this means quite a bit of extra data.
Assuming that every state has about 4 possible moves, this means that every state must have four 64 bit pointers to other states, with 8500949 this gives an additional 272 MB data to be stored in order to connect the states to each other. This adds up to a total of a little less that 400 MB for a complete solution to a kolibrat game on a 3×4 board. This number can be reduced be a factor of two by implementing an algorithm that can invert the board so red becomes black, and black red. 3. 1. GAME ANALYSIS 23 By also implementing an algorithm that can mirror the board along the y-aksis the total amount of unique states can be divided by a factor close to 2. The factor is only close to 2 because a state that is symmetric only appears once in the complete set of states, where as all non symmetric states appears twice.
With these improvements to an algorithm it will be possible to store the complete solution to the kolibrat game in a little less that 100 MB ? le. While it certainly is possible to solve the smaller Kolibrat boards completely, solving the larger is still not possible today. The game draughts have 5 · 1020 unique states and have recently been solved proving that both player have a draw strategy . It took 16 years from the project started to the proof were complete, demonstrating that solving the larger Kolibrat games is impossible unless massive computer mainframes work on the problem for yeas. For compairson chess have about 1050 unique states , and have never been solved. 3. 1. 2 Branching factor -axis A games branching factor is the factor by witch its game tree branches, in short the number of moves the player has to chose from when he makes a move. The branching factor is important since is determines the number of states an arti? cial player must look through to ? nd the best move by looking ahead in the game. If an agent has to look d moves ahead in a game with a branching factor of b to ? nd the best move, the amount of states s the agent must look through is determined by formula (3. 4). d s= i=0 bi = bd+1 ? 1 b? 1 (3. 4) In Kolibrat on a 3×4 board and with a maximum of four pieces, on the board the average branching factor has been determined to be about 3. 12 on average. This number also seems fairly constant, and di? erent playing styles has little to no e? ect on the factor.
Even even if the average branching factor seems quite constant based on several measurements with di? erent playing styles, the individual branching factors varies quite a lot from turn to turn. The minimum branching factor is zero and although this is a pretty rare situation it do appear, one example can be seen in 24 CHAPTER 3. ARTIFICIAL INTELLIGENCE ?gure 3. 1(a). The maximum branching factor has been determined to be ten and is equally unlikely to appear in a game, as only a few board positions gives a player this many choices in moving. One example of such a position can be seen on ? gure 3. 1(b). (a) No Moves for red. (b) Ten moves for red.
Figure 3. 1: Board positions. On other board sizes the average branching factor also seems pretty constant and independent of variations in the playing style. Table 3. 2 displays measured branching factors for di? erent board sizes and the highest amount pieces. Breadth 2 3 3 3 4 4 5 9 9 Height 2 4 4 4 5 5 6 9 9 Max Pieces 2 4 5 6 5 6 6 15 30 Branching factor 1. 6 3. 2 3. 2 3. 2 4. 5 5. 0 6. 0 12. 3 12. 3 Table 3. 2: Average branching factor. 3. 1. GAME ANALYSIS E? ective branching factor 25 While it is not possible to change the branching factor, as it is game speci? c, it is possible to make changes to the algorithm that examines the game tree.
These changes could allow the algorithm to discard some states before they have been examined. When this is done the e? ective branching factor that identi? es that branching factor of the states that the algorithm has to expand to ? nd the best move, becomes di? erent from the average branching factor. By sorting out states that can not possible lead to the best possible move, the algorithm can use more time to look at states that might turn out to be the best move, and in that way decrease the e? cient branching factor. A more detailed discussion on how to decrease the e? cient branching factor is described in section 3. 2. 2. 3. 1. 3 Complete analysis of Kolibrat
While games on larger boards will takes years to solve it is easy to solve some of the games on smaller boards. In a game on a 2×2 board played to 1 point and with a maximum of 2 pieces on the board for each player black has a victory strategy, the proof is shown in ? gure 3. 2. ck Bla Win in ck W Bla Figure 3. 2: Victory strategy for black on a 2×2 board. 26 CHAPTER 3. ARTIFICIAL INTELLIGENCE Even though ? gure 3. 2 is not complete with the moves that lead to a red victory all possible moves that lead to a black victory is shown. Victory strategy on a 3×3 board As with the 2×2 board, it can also be proved that red has a victory strategy on a 3×3 board, if the game is won after the ? rst goal.
The incomplete game tree on ? gure 3. 3 show only the moves that bring victory to red, but all possible black moves are shown. 1 5 5 4 3 2 5 Red Win Red Win Red Win Red Win 1 4 Red Win Red Win Red Win Red Win 2 2 1 Red Win Red Win Red Win Red Win Red Win Red 3 4 Win 3 4 Red Win Red Win Win Red Red Win Red Win Red Win Red Win Red Win Red Win Red Win Red Win Red Win Figure 3. 3: Victory strategy for red on a 3×3 board. On this board size red has the advantage. Because red it is the ? rst to move, he is also the ? rst payer that can reach the centre of the board, which equals victory on a 3×3 board. 3. 1. GAME ANALYSIS Victory strategy on other boards 27
As there is no rules in Kolibrat that allow games to end in a draw all kolibrat games must have a victory strategy for either red or black, unless the game is played in a way that makes the game go on forever. It might be that on some boards the players have the choice between playing forever or breaking the cycle and loose, but there is no proof of that. On a 3×4 game board played to one point red always wins. I have not proved that black can not win, but I have not been able to ? nd a game that ends with a black victory when both players look more than a few moves ahead in the game, but played to four points black seem to have the advantage. 3. 1. 4 Analyzing forced loops The question is now whether or not one of the players can force the game to go on forever.
In order to do this there has to be a sequence of moves that the player can choose and no matter what moves the opponent chose the game must end in a state it has previously been in. A mathematical proof of whether or not this is possible, is out of scope for this thesis. But the solution can be found by a computer using brute force calculation. The pseudo code for testing this property is not included since the pseudo code for solving this problem exceeds 50 lines of code. The complete source code for the ForcedLoop program is listed in appendix C. 5. 1. The program begins with constructing a game graph from the empty game board. All states that is found is added to a set of knownStates.
Lets assume that the program tests whether or not red player can enforce a loop. When red is moving and one of the states that red can move to is in knownStates then all the child states are discarded and red’s parent(s) is told that one of their children has a loop. Else the children is added to an activeStates array for further calculations. When parent p, where red has the turn, is told one of its children has a loop, all children of that parent(s) are told they are part of a loop and they are discarded. Now p’s parent is told is has a child with a loop and p marks itself as part of a loop. When a parent where black has the turn is informed that a child has a loop it marks that child as dead.
When all its children are dead it calls its own parent to tell that one of its children it is part of a loop, and marks itself as part of a loop. 28 CHAPTER 3. ARTIFICIAL INTELLIGENCE When black have the turn and one of its children is in knownStates then the states that are in knownStates is told they have another parent (blacks state). All other child states that is not in knownStates are added to activeStates for further calculations. In order to rule out the possibility of in? nite loops the program must run the test for both red and black player. An interesting side e? ect of the ForcedLoop program is that with only a few modi? cations the program could be modi? ed to search for victory strategies for one of the players. Instead of calling the parent when a hild had a loop the states must call the parent when it knows that a child state is a victory node for red or black. Unfortunately the programs data structures is not the 0 120 121 100 200 122 103 101 Red Win 107 108 109 110 104 105 106 Black Win 102 Black Win 111 Figure 3. 4: Arti? cial game tree from FakeLogic. most memory e? cient. The amount of ram the application uses quickly exceed several gigabytes. Although only smaller boards up to 3×3 has been tested with this program all tested boards have returned no forced loops. 3. 2. AI IN KOLIBRAT GAMES 29 To ensure that the ForcedLoop algorithm and its implementation works as intended, a class FakeLogic has been implemented and generates a game tree as the one on ? gure 3. 4.
When testing ForcedLoop on that game tree the program returns that red, but not black can enforce in in? nite loop, which is the correct answer. This do not prove that there are no errors, but the game tree in ? gure 3. 4 has many, if not all of the pitfalls, a real game tree will have. The numbers on ? gure 3. 4 represent the internal ID numbers used in the FakeLogic implementation to recognize states, and the semi-dotted lines represents a backwards loop. 3. 2 AI in Kolibrat games If you de? ne an agent as an arti? cial intelligent player that acts on behalf of a user, that isn’t there. Then this agent must attempt to solve a given problem for that user.
In this case the problem of winning in a game of Kolibrat. This problem is not exactly easy to solve, since the environment the agent operates in has multiple agents (one more) that works against it. The environment the agents is operating in is fully observable, the agents have full access to all information in the current game state. The environment is also sequential and static since the players move one after another and the kind of information that is stored in a state is consistent form state to state. When agents compete against each other in such an environment the agent usually uses knowledge of the games rules to look ahead in the game, trying to ? nd a state that ensures it victory.
The following sections discuss the MiniMax agent which works that way. 3. 2. 1 Mini-Max agent The max-min agent is a highly speci? ed agent developed especially for fully observable, static, sequential and multi agents environments. It is an old and well tested approach to solving two-player game problems. It works by constructing a full game tree down to some level. From that level the game tree is evaluated from the bottom and up. The moving 30 CHAPTER 3. ARTIFICIAL INTELLIGENCE player is de? ned as the max player, since it is his move we want to maximize. The other player is the min player since he wants to minimize the utility of the moving players ? nal move.
When the game tree is evaluated all states on the bottom level of the tree is given values by evaluation in an heuristic function, sometimes also called a utility function. This returns the utility value of each state on the bottom level. If the player one level above the bottom level is min he will choose the lowest utility value among all his children and take that value as his value. If it is max he will choose the highest utility value among all his children and take that value as his value. This continues level for level, until the top of the tree is reached, at that point the child that contains the highest utility value is selected as the best move. Assuming that the heuristic function is perfect the MiniMax agent will play perfectly, making no mistakes at all.
Unfortunately the function is usually only a crude estimate of a states real utility value. Having a good utility function is essential for the MiniMax agent, if the function is bad or even wrong the agent will perform badly compared to other agents with better utility functions, even if the agent with the bad utility function is given more time to look further ahead in the game. The heuristic function The de? nition of a heuristic function is a function that estimates the cost of the cheapest path from the current state to the a goal state [3, page 95]. Since guessing the distance to a goal state is highly dependent of the opponents playing style a utility function is used instead in multi-agent environments.
The utility function basically does the same thing, but it do not return the expected length from the current state to the goal, but the states utility value. If the utility function is correct a state that is close to winning will have i higher value, than states further away from winning. However there is no guarantee of this, since most utility values is only estimates, based on certain properties of the current state. 3. 2. 2 Optimizing the Mini-Max agent Because of the popularity of the MiniMax agent, a lot of work has gone into optimizing the original algorithm. Most of these improvements are 3. 2. AI IN KOLIBRAT GAMES 31 trade-o? s between memory usage and calculation time. Some possible optimizations is listed below. Alpha-beta pruning A simple yet e? ient way to optimize the performance of MiniMax is to implement Alpha-beta pruning. Alpha-beta pruning works by adding two variables to the each state, and only works if the MiniMax agent is implemented to use deep-? rst search. The ? rst represents the utility value of the best state the red player could have reached by taking another path in the game tree. The second the best utility value (the lowest) that black player could have reached by taking another path in the game tree. If at any point one of the players reaches a state s that is evaluated to have a better utility value for that player, but is below a state where the other player can force the game into another part of the game tree that is preferable for him.
If this happens the Alpha-beta enhancement realizes that the opponent will never allow the play to reach this part of the game tree and and that entire part of the tree is abandoned. In order to get the optimum e? ect of Alpha-beta pruning all moves must be sorted. The list of moves must be sorted so that states generated from moves expected to be good, is explored ? rst. This ensures a high probability for ? nding the best move in the ? rst try, and thus a bigger chance of reaching a state where Alpha-beta realizes that this part of the game tree can be cut-o?. An implementation of Alpha-beta pruning where the moves are sorted, will on average decrease the e? cient branching factor to the square-root of the average branching factor [3, page 169]. Implementing a hash table Another way to decrease the e? ient branching factor is to ensure that two identical states is never both explored. This puts some demands on the MiniMax implementation. First of all if two identical game states is discovered it must always be the one closest to the top of the tree, that is explored. This is not necessary the case since the MiniMax agent uses deep-? rst search. Also if two equal states is found on the same level 32 CHAPTER 3. ARTIFICIAL INTELLIGENCE only one of them should be explored. Implementing a hash table to avoid exploring equal states can lead to a signi? cant decrease of the e? cient branching factor, but the trade-o? is highly increased memory usage.
In MiniMax a state can be removed from memory when it has been evaluated, now all states must be preserved in memory until the best move has been found. Multithreaded Programming A completely di? erent way of improving the performance of MiniMax would be to give it more processing power. Since most new computers today ship with multiple processors, a serious boost in performance could be gained by taking advantage of all the processors. Implementing MiniMax in a thread safe manner will complicate the implementation, but this is also the only signi? cant disadvantage. 3. 3 Implementing the Mini-Max agent The MiniMax agent is usually implemented as a deep ? rst search, to decrease the memory requirements.
To ensure the agent returns a move within reasonable time the search is normally cut-o? at some predetermined level. In this implementation, the agent is given a speci? c amount of time to ? nd the best possible move. Since a best move can only be determined after a search to some level has been completed, the algorithm must complete at least one search in this time interval. To do this the agent uses iterative deepening search. This search continually performs deep ? rst searches, at ? rst the search is cut-o? at level 1, then the cut-o? level is increased by one until the time runs out. At that time the best move from the last completed search is returned as the best move. It may seem like a waste of alculation time to recalculate the entire game tree for each level, but the advantages really surpass the disadvantages. The calculation overhead is determined by equation (3. 5), where b is the 3. 3. IMPLEMENTING THE MINI-MAX AGENT average branching factor, and d the search depth. d? 1 i=0 d 33 bi = bi 1 bd ? 1 ? d+1 ? 1 b b (3. 5) i=0 As seen the overhead is on the whole equal to a fraction of the branching factor. In a game on a 3×4 board this gives a overhead of 33%, which is still a considerable overhead, but considering the advantages it is still the best option if the calculation is time limited. Since there is a rule in kolibrat that allows a player to move twice, some minor modi? ations had to be made to the original pseudo code for the Mini-Max agent work with Kolibrat, but essentially the implementation is equal to the original pseudo code taken from [3, page 170], but with an enhanced cut-o? test and iterative deepening search. The HASH table enhancement has also been included. The running time of the algorithm is determined by the time it takes to compute one state multiplied by the amount of states the algorithm searches. The time it takes to compute one state is constant, and when the algorithm has made a search down to level d it will have processed bd+1 d b? 1 states. This gives a total running time of O(b ). Without any enhancements b is the average branching factor, but if the agent uses Alpha-beta pruning or other enhancements b represents the e? ctive branching factor. 3. 3. 1 Additional possible Mini-Max enhancements Even though the MiniMax agent has been implemented with the enhancements speci? ed above, there is still aspects of the algorithm that could be improved. For that reason some suggestions for additional enhancements are discussed in the sections below. Randomness In some situations, especially when the MiniMax agent uses a simple heuristic function to evaluate the board, some or all of the possible moves end up with the same utility value. In this situation the agent always 34 CHAPTER 3. ARTIFICIAL INTELLIGENCE chooses the ? rst move among the moves with equal utility value.
This makes the agent deterministic and might lead to board situations where the agent always makes the wrong move. This behavior could be improved by choosing the moves at random, among the moves with the highest utility value. To make the agent completely non deterministic the agent could also use a probability function to choose its move. The utility value could be used as an indicator of how likely a move is to be chosen, meaning that the agent chooses a random move among all the possible moves, but moves with a higher utility value has a higher chance of being chosen. Board speci? c agents The heuristic function MiniMax uses now is independent of the board.
The heuristic will however not perform as well, on some boards since the optimal heuristic function varies from board to board. One way to improve the heuristic function would be to make it board speci? c. This could for instance be telling the agent that standing in that board ? eld is really good, or if the agent can force the game into this state it will win. The only problem is to ? nd ? elds on the board that is good to occupy or states that leads to victory, but this could be calculated in the same way as the heuristic function is optimized in section 3. 4. Continuous search Optimizing the search by improving the search algorithm in the MiniMax agent is another way to increase the performance of the agent.
This improvement would allow the agent to save the game tree in between moves, and save calculation time by not having to recalculate the entire tree the at every move, this could also be combined with iterative deepening search to decrease the 30% overhead. A search algorithm like this will require more memory, and some performance will be lost to keeping track of the new advanced data structures required to implement this. If implemented correctly this enhancement will save time, but the decrease in calculation time is limited. 3. 4. OPTIMIZING THE HEURISTIC FUNCTION 35 Assuming the game is played on a 3×4 board with an average branching factor b of 3. , and the agent have constructed a game tree. When the agent has found his move 1 of the tree can be neglected, and when the b opponent has made his move another 1 of the tree can be neglected. this b gives that only b1 of the entire tree is still useable when the agent regains 2 the turn. With b = 3. 2 this gives that only 9. 8% of the tree don’t have to be recalculated at the start of next turn, and on bigger boards this number is even smaller. The real improvement is achieved by combining this search algorithm with the iterative deepening search, which will reduce the calculation overhead from a fraction of the branching factor to zero. 3. 4 Optimizing the Heuristic function
The heuristic function is in many ways the most important component in the MiniMax agent. If the heuristic is bad, or makes plain out wrong assumptions about the game states utility the agent will perform bad compared to agents with better heuristics, even if they have more time to analyze the game board. There is di? erent approaches to optimizing the heuristic function. One approach would be to use a neural network as the evaluation function, another to use a weighted linear evaluation function. The advantages and disadvantages of both approaches is described in the next section. 3. 4. 1 Neural network utility function A neural network is a network of connected mathematical functions.
The network takes a set of input parameters and returns a set of output parameters. The input could be a game state and the output could be the states utility value. The disadvantage of using neural networks as a utility function is that after the neural network is implemented it must be trained to return a correct utility value. There is more than one algorithm that can train a neural network but they all have one thing in common. They take a neural 36 CHAPTER 3. ARTIFICIAL INTELLIGENCE network, a set of states and their correct utility value as arguments. The algorithm now manipulates the network to provide the correct output to the input parameters.
On problem with this approach is that generating a set of input states and ? nding the correct output (the utility value) can be di? cult, since there are no simple way to determine the correct utility value for a state. The advantages of using neural networks to evaluate a state is that the evaluation is based completely on computer generated information, about how good and less good states look like. Almost all other approaches to generating a heuristic function, involves some form of human reasoning about what is important and what is not. 3. 4. 2 Weighted linear evaluation function A weighted linear evaluation function takes a set of functions fi (s), that ach represent a property the evaluation function takes into account. If s is the state and wi is the weight for fi (s) then a weighted linear evaluation function can be de? ned as in (3. 6), where n is the number of properties from the state that the function takes into account. n Eval = i=1 wi fi (s) (3. 6) This approach is often used to construct evaluation functions since it is simple and ? exible. The only problem is to determine the functions and their weight, but there are methods for that. Kolibrat is a zero-sum game, meaning that the sum of the players scores are always zero. When one player makes a move that increases his chance of wining, his opponents chances of winning is decreased.
This increases the ? rst players utility value, and decreases the opponents. The player closest to winning will always have a positive score, and his opponent will have a negative score, of the same value. This property makes is easy for the agents using the function to analyze how they are doing compared to their opponent. 3. 4. OPTIMIZING THE HEURISTIC FUNCTION 37 3. 4. 3 Choosing the heuristic parameters Choosing the best possible combination of parameters for the heuristic function is not possible for humans, as they can only make educated guesses about whether or not a parameter is important, and they might overlook important parameters.
The only way of generating parameters for a heuristic function is by using algorithms that can transform problems speci? ed in logic languages into a set of heuristic parameters. The best known program to de this is Armand E. Prieditis program ABsovler. It takes a problem formulated in a logic language like 1. order logic and returns a heuristic function for this problem. Using the ABsovler to determine the best parameters for a evaluations function for Kolibrat would be the optimal solution, but it have some major drawbacks. In order to use the ABsovler it must ? rst be implemented and this seems like a task worthy of its own thesis, judging by the available information about the ABsovler.
Also according to  ABsovler has only been used to solve simple problems in static, single agent environments like the Eight Puzzle, Rubik’s cube or the eight queens problem so hoping it can discover a heuristics for a game like Kolibrat somehow seen optimistic. Chosen parameters given the circumstances the ABsovler seems like an unrealistic approach to ? nding the parameters for the heuristic evaluation function. The alternative approach is to lets humans choose the parameters for the heuristics and then determine their importance by other means. The most important property of the parameter for the heuristic is that they somehow increases when the player is getting closer to winning, and decreases when the player is getting closer to loosing.
After analyzing the game, I have come up with the following possible parameters to include in a heuristic function. 1. Value of a piece on the line of insertion. 2. The value increment when a piece gets one ? eld closer to the opponents home line. 3. The added value when a piece is in the middle of the board. 38 CHAPTER 3. ARTIFICIAL INTELLIGENCE 4. The penalty for having a piece standing in front of the opponent in his turn. 5. The added value to have two pieces standing in a row. 6. An added value for the number of legal moves the player can make. 7. Value of a scored point. 8. The value of having the turn. 9. The value of being able to insert pieces on the board. 10.
The value of having a piece standing on the opponents home line. 11. The value of being the player having most pieces on the board. 3. 4. 4 Determining the parameters weight To use these parameters in an weighted linear evaluation function, the weight of each parameter must ? rst be determined. There are a few di? erent methods that is suitable for determining the value of the di? erent parameters. The most known of these are genetic algorithms and simulated annealing, both of these will be described below. Genetic algorithms Genetic algorithms works by generating an initial population of heuristic evaluation functions. The functions is now ranked from the most ? t to the least ? individual, by using the functions in actual game play. The best heuristic functions are then chosen as parents for the next generation of heuristic evaluation functions. This continues until there hasn’t been considerable improvements over the last few generations. To generate a new generation of heuristic evaluation functions the parents are put through a mutation process, where some of its weight values are changed at random. After that the parents are mated with each other to produce a new generation. This works by randomly swapping weight values from the parents onto a new heuristic evaluation functions that is part of the next generation.
The swapping are done in a way that ensures that strongly related weight values values almost always come from the 3. 5. SIMULATED ANNEALING IMPLEMENTATION same parent. 39 While the genetic algorithm certainly will be able to ? nd optimal values for the di? erent parameters weights, is might not be the best approach. The genetic algorithms biggest strength comes from its intelligent crossings between related and unrelated properties in the heuristic evaluation function. But in Kolibrat there is no clear way to determine which properties is strongly related, and which is not. Simulated annealing In many ways simulated annealing works in the same way as the genetic algorithm, but when a new generation is made only the mutation process is carried out.
Also all the parameters is mutated at once, not just a selected few. This algorithm has the advantage compared to genetic algorithms that it requires no knowledge about the internal relations between the evaluation functions parameters and their weights. Aside from this both algorithms should be able to reach the same result, but maybe not in the same amount of time. 3. 5 Simulated annealing Implementation Due to its lack of pre-required knowledge of the internal relations between the evaluation functions parameters and their weight, the algorithm for simulated annealing will be used to determine the weights of the heuristic functions parameters.
In the original pseudo code [3, page 116] the simulated annealing algorithm only kept one heuristic function in its population until a new function defeated it and took its place. In this implementation the algorithm has been changed to keep 60 heuristic functions, then choose the 30 best, mutate them and add both the mutants and the non-mutated 30 functions to the next population. The altered pseudo code is shown in algorithm 1, and the source code can be seen in appendix C. 4. 1. The function RandomHeuristicValues generates random weights between 0 and 1000 for all the parameters the algorithm it is trying to 40 CHAPTER 3. ARTIFICIAL INTELLIGENCE Algorithm 1 SimAnnealing 1: rand winners ++ 17: if winners = 30 then 18: for all h heuristics in victoryArray do 19: new 3. 5.
SIMULATED ANNEALING IMPLEMENTATION 41 optimize. The Rand function takes two numbers as arguments and generates a random number in between these. FindWinner takes two heuristics and loads these into two players. The two players then battle each other to ? nd the best heuristic. If the game is not over in four minutes the FindWinner terminates the game, and chooses a random winner. The NewHeuristicFrom function takes a heuristic h and the current random factor rand as inputs and generates a new heuristic. The pseudo code for the NewHeuristicFrom is showed in algorithm 2. Algorithm 2 NewHeuristicFrom 1: input: h 2: input: rand 3: new 42 CHAPTER 3. ARTIFICIAL INTELLIGENCE 3. 5. 1