Minimax Algorithm and Implementation
Minimax is a decision rule that simulates a player's decisions to find the optimal move, assuming that the opponent also plays optimally. There are two actors in Minimax: the maximizer and the minimizer. The maximizer searches for the highest possible score, while the minimizer searches for the lowest possible score. Each player looks at the opponent's best moves and chooses the best response. Usually, the search is represented as a tree data structure.
Here’s an example of how the Minimax algorithm works:
[Image 1] Source: AI: A Modern Approach Book
Look at Image 1. Max can choose among actions B, C, and D. First, the minimizer looks at the possible outcomes under B and takes the lowest score:
min(3, 12, 8) = 3
Then it does the same for C and D. After that, the maximizer picks the highest of those scores:
max(3, 2, 2) = 3
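The computation above can be sketched in a few lines of Python. The leaf scores for C and D are taken from the book's example tree; only the values under B and the resulting minima appear in the text:

```python
# Leaf scores under each of Max's actions (from the example tree).
tree = {
    "B": [3, 12, 8],
    "C": [2, 4, 6],
    "D": [14, 5, 2],
}

# The minimizer takes the lowest leaf under each action...
min_values = {action: min(leaves) for action, leaves in tree.items()}

# ...and the maximizer picks the action with the highest of those values.
best_action = max(min_values, key=min_values.get)

print(min_values)   # {'B': 3, 'C': 2, 'D': 2}
print(best_action)  # B
```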
In our board game, the algorithm cannot expand states/nodes all the way to the terminal states; it would run into memory issues or take a very long time to process. We should define a cut-off for our search algorithm by setting a max_depth that limits how deep nodes are expanded. When the cut-off is reached, the algorithm calls an evaluation function that calculates a heuristic score for the position.
Let’s implement it in our EvoPawness (Temporary Name):
We will define a class for the Minimax algorithm that receives player_color as its input. The player_color parameter indicates who the maximizer is.
Let’s see the Minimax class:
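Since the class is shown as an embedded snippet, here is a minimal sketch of what such a depth-limited Minimax agent might look like. The state interface used here (is_terminal, get_turn, possible_actions, result, evaluate) is an assumption for illustration; the actual class and method names in the EvoPawness repository may differ:

```python
import random

class MinimaxAgent:
    """Sketch of a depth-limited Minimax agent.

    Hypothetical state interface (an assumption, not the repository's API):
      state.is_terminal()        -> bool
      state.get_turn()           -> color whose turn it is
      state.possible_actions()   -> dict: action key name -> action info
      state.result(action_key)   -> successor state
      state.evaluate(color)      -> heuristic score from color's viewpoint
    """

    def __init__(self, max_depth, player_color):
        self.max_depth = max_depth
        self.player_color = player_color  # who the maximizer is

    def choose_action(self, state):
        _, best_action = self._minimax(state, depth=0)
        return best_action

    def _minimax(self, state, depth):
        # Cut off the search at a terminal state or at max_depth and
        # fall back to the heuristic evaluation function.
        if state.is_terminal() or depth == self.max_depth:
            return state.evaluate(self.player_color), None

        action_keys = list(state.possible_actions().keys())
        # Shuffle so ties between equally scored actions break randomly.
        random.shuffle(action_keys)

        maximizing = state.get_turn() == self.player_color
        best_value = float("-inf") if maximizing else float("inf")
        best_action = None
        for key in action_keys:
            value, _ = self._minimax(state.result(key), depth + 1)
            if (maximizing and value > best_value) or \
               (not maximizing and value < best_value):
                best_value, best_action = value, key
        return best_value, best_action
```

The agent returns both the best score and the str key of the chosen action, so the caller can look the action up in the state's action dict.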
As you can see, the Minimax agent performs a recursive depth-first search. It outputs the score (best_value) and the action key name of the best action. It takes the largest score if it’s the maximizer’s turn and the smallest score if it’s the minimizer’s turn, searching recursively until a terminal state or the maximum depth is reached.
We shuffle the list so that the agent chooses randomly among actions with the same score.
If you are wondering about the type of the returned best action, it is a str: the action key name. Remember that our game’s actions are stored in a dict that maps each action key name to the information of that action. For example:
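As a hypothetical illustration of that dict (the actual action key names and fields in the EvoPawness repository may differ):

```python
# Hypothetical possible-actions dict: each str key names an action and
# maps to a dict holding that action's information.
possible_actions = {
    "move_pawn_3_up": {"action_type": "move", "pawn_index": 3, "direction": "up"},
    "activate_pawn_1": {"action_type": "activate", "pawn_index": 1},
}

# The Minimax agent returns one of the str keys, which the controller
# then uses to look up the full action information.
chosen_key = "activate_pawn_1"
print(possible_actions[chosen_key]["action_type"])  # activate
```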
Test The Agent
We will play against the MinimaxAgent with depth = 4.
Go to line 52 in controller/game_controller.py and change it to:
self.ai_agent = MinimaxAgent(max_depth=4, player_color=0)
[Image 2] I exit the game because I cannot stand waiting the AI’s move
Run the game with your CLI
The AI will prioritize reaching the rune first, then activating its pawns, then attacking our pawns. If possible, the AI will evolve its pawns into a higher type. Of course, there are some flaws, such as letting our pawns attack its king. Maybe a better evaluation function could make the AI protect its king.
If you’ve tried the AI, you will have seen that the time consumed choosing an action is very long. Let’s see:
As we can see above, as the game progresses, expanded_node also increases because the number of possible actions grows. The algorithm’s running time tends to rise along with the number of expanded nodes. The time on turn 24 is 21 seconds. Who wants to wait that long?
This is the weakness of this algorithm: it searches all possible states up to the desired cut-off or a terminal state. Fortunately, we can effectively cut down the number of expanded nodes. We will use a technique called pruning to compute the correct Minimax decision without looking at every node: alpha-beta pruning.