Optimizing decision-making with the Minimax AI algorithm

Delivering efficient AI!

Let’s get you introduced to the Minimax algorithm, and then explain some known optimizations, and then some lesser known ones. This algorithm is useful in decision-making AI, which is used in popular game engines, like Stockfish for Chess. A major limitation of Minimax is that it is only used in two-player games.

Algorithm basics

The Minimax algorithm can be thought of the computer playing against itself to find the best move! It follows the human thought process — if I do this move, what moves will my opponent have, then what moves can I play, and so on!

  • A naive implementation would keep on building the possible-moves tree until each “path” concludes into a win/loss. This isn’t practically possible because the size of the tree increases exponentially with each increment in search depth.
  • To solve this problem, a heuristic function is involved. An analogy to a heuristic function is a human’s evaluation of the board after thinking 5 moves ahead. A heuristic function (say h) returns a numerical value that predicts in whose favor the board is in. For example, if h is greater than zero that means player A is winning, and if it is negative then player B is winning.

The heuristic function is a way to inform the search about the direction to a goal. — artint.info/html/ArtInt_56.html

  • Let’s say you make the computer look three moves ahead. Then it will evaluate the board using the heuristic function. This value will be used to find the best possible move.

Implementation

The Minimax algorithm is built using indirect recursion. We need to implement five entities:

  • Heuristic
  • Maximizer and Minimizer (see where Minimax comes from): The maximizer is the player who wants the heuristic to be greater and the minimizer wants it to be less. The maximizer/minimizer functions will search for the move that makes the heuristic the greatest/least using the evaluator (incrementing the current depth).
  • Evaluator: The evaluator function takes two basic parameters: the search-depth and the current-board (after the moves considered take place). This function returns the heuristic after the maximum search depth has been attained — which means that it calls the maximizer/minimizer depending on whose turn it is for the current depth. If the search depth becomes equal to the limit set, it will return the heuristic function’s evaluation and stop recursion there.
  • Moves Provider: Although listed last, this is the most fundamental. It provides a list of moves a player can make for a given board-state.

The explanation given is a bit confusing. All of it will clear up with an example of tic-tac-toe!

Case study: tic-tac-toe

We are going to implement the Minimax algorithm for a tic-tac-toe application. Before doing that, we need to define how we will represent our board (using ECMAScript here) and how we are going to represent moves.

// initial board
const board = [
[null, null, null],
[null, null, null],
[null, null, null]
];
  • Representing the board: The tic-tac-toe is a 3×3 grid of X’s and O’s. Hence, we create a 2-D array that stores null for empty squares, 'X' for X’s, and 'O' for O’s.
  • Representing moves: Tic-tac-toe moves can be represented by a target square and the moving piece. This is because the letters cannot be moved after they are placed initially.
// X placed in the center @(1,1); remember not (2,2)
{ player: 'X', targetX: 1, targetY: 1 }

Writing the actual algorithm

  1. Most fundamental first, the “Moves Provider
function getAllMoves(player, board) {
let movesList = [];
 for (let r = 0; r < 3; r++) {
for (let c = 0; c < 3; c++) {
if (board[r][c] === null) { // empty square!!!
movesList.push({ player: player, targetX: r, targetY: c })
}
}
}
 return movesList;
}

2. The heuristic function (a little homework for you)

function heuristic(board) {
// search all three rows - is any one filled by only one player
// search all three columns - is any one filled by one player
// search the two diagonals - is any one filled by one player
 // if any row, column, or diagonal is filled solely by one
// player - then return 100 if O and -100 if X.
 // otherwise, always return 0.
}

3. Defining the maximizer/minimizer

function maximizer(board, depth) {}
function minimizer(board, depth) {}
// we'll implement them shortly

4. Evaluator

const depthLimit = 3;// search upto three moves ahead
function eval(board, player, depth=0) { // depth is 0 initially
if (depth >= depthLimit)
return heuristic(board);
 if (player === 'X')
return maximizer(board, depth);
else
return minimizer(board, depth);
}

5. Implementing the maximizer/minimizer

function copyBoard(board) {// simple helper function
let copy = new Array(3);
copy[0] = [board[0][0], board[0][1], board[0][2]];
copy[1] = [board[1][0], board[1][1], board[1][2]];
copy[2] = [board[2][0], board[2][1], board[2][2]];
return copy;
}
function applyMove(board, move) {// simple helper function
board[move.targetX][move.targetY] = move.player;
return board;
}
var bestMoveStore;// will be set when best move is found @depth=0
function maximizer(board, depth) {
let movesList = getAllMoves('O', board);
let bestMove, bestMoveScore = Number.NEGATIVE_INFINITY;
// -INFINITY because first move will always be more
 for (let i = 0; i < movesList.length; i++) {
let movedBoard = applyMove(copyBoard(board), movesList[i]);
 let moveScore = eval(movedBoard, 'X', depth+1);
if (moveScore >= bestMoveScore) {
bestMove = movesList[i];
bestMoveScore = moveScore;
}
}
 if (depth === 0)
bestMoveStore = bestMove;
}
// Similary minimizer can be implemented, just you find the
// bestMoveScore as the least value instead of the max.
function minimizer(board, depth) {
let movesList = getAllMoves('O', board);
let bestMove, bestMoveScore = Number.POSITIVE_INFINITY;
// +INFINITY because the first score will always be less
 for (let i = 0; i < movesList.length; i++) {
let movedBoard = applyMove(copyBoard(board), movesList[i]);
 let moveScore = eval(movedBoard, 'X', depth+1);
if (moveScore >= bestMoveScore) {
bestMove = movesList[i];
bestMoveScore = moveScore;
}
}
 if (depth === 0)
bestMoveStore = bestMove;
}

5. Wrapper

We need an API wrapper that will return the variable bestMoveStore to the client code/UI that actually needs the computer’s result. This method needs to call eval with depth zero which will eventually call maximizer or minimizer.

export function analyzeAndMoveAsComputer(board, player) {
eval(board, player, 0);
return bestMoveStore;
}

I hope this “code-based” review of the algorithm clear stuff up a little.

Where’s the tree of possible moves?

The code above doesn’t construct a tree of possible moves and their results. It uses recursion and checks each case/path-of-moves one-by-one without saving them. It only retains the best-possible move at each level. This saves a lot of memory-usage but fails to account for the efficiency of CPU computation.

Known optimizations to the Minimax algorithm

Negamax

In the algorithm’s basic implementation the maximizer/minimizer function essentially does the same thing — find the move with the best score — except look in the opposite directions.

The negamax optimization does away with this and combines the three functions — eval , maximizer , and minimizer — into one function, which can be combined into one API wrapper function (doing away with a separate wrapper too!)

The code will look something like this:

function dir(player) {
return (player === 'O') ? +1 : -1;
}
function analyzeAndMoveAsComputer(board, player, depth=0) {
// NOTE: depth=0 means that depth is an optional arg for
// the client. It is meant only for internal use and not
// for the caller.
 if (depth === depthLimit) {
return heuristic(board) * dir(player);// always positive
}
 let movesList = getAllMoves(player, board);
let bestMove, bestMoveScore = Number.NEGATIVE_INFINITY;
 for (let i = 0; i < movesList.length; i++) {
let movedBoard = applyMove(copyBoard(board), movesList[i]);
 let moveScore = analyzeAndMoveAsComputer(board, // opp. player
(player === 'O') ? 'X' : 'O');
// everything is almost same till here!
 moveScore *= -1;// moveScore is for opponent!
if (bestScore > bestMoveScore) {
bestMove = movesList[i];
bestMoveScore = bestScore;
}
}
 if (depth === 0)
return bestMove;// for actual caller
else
return bestMoveScore;
}

It is hard to wrap your head around the clever usage of -1 by multiplying it with the score every time. This causes the score to always to be alternating signs up the moves tree.

This algorithm relies on the fact that max(a,b) = −min(− a,−b) to simplify the implementation of the minimax algorithm — Wikipedia

Alpha-beta pruning

This algorithm keeps two values alpha and beta. Alpha is the minimum value the “maximizer” or first-player is assured of. Beta is the maximum value that the “minimizer” or second-player is assured of.

The meaning of these values can be explained as: in ideal play, the played moves will be such that the score is greater than alpha and less than beta. While searching, if beta becomes greater than alpha (hence, the ideal play doesn’t exist), that means the opposing player won’t make one of the moves being searched (unless a mistake is made), hence it doesn’t have to be searched.

Jez9999 at English Wikipedia: https://commons.wikimedia.org/wiki/File:AB_pruning.svg

The diagram above shows which moves can be discarded from consideration.

  • The boxes represent the board when it’s the maximizer’s turn. He wants the score to increase.
  • The circles represent the board when it’s the minimizer’s turn. He wants the score to decrease.

For example, in the right-most part, look where the sub-tree 8 is discarded. This is because the minimizer can make a move with a lower score of 5 till the depth-limit, and will because it’s a better move.

Less implemented optimizations

Variable search depth

I’ve implemented this in my Android app — Surakarta. You can implement variable search depth by keeping more than one depth (and corresponding depth-limits).

I used two depths in my game — absolute search-depth and adjusted search-depth. The absolute search-depth is the “real” depth at which we are and has a higher limit.

The adjusted search-depth has a lower limit. However, instead of incrementing the depth at every level, it may remain constant if an important change occurs in the move — like a capture. Capturing moves are given more attention. The high limit on the absolute search-depth still stops the program from taking a ridiculous amount of time to complete.

Transposition tables

When I play chess, and my opponent plays the move I anticipated while making my move, I don’t have to think a lot. I’ve already done the research on the arising position, right!

However, the computer algorithm starts over. A transposition table solves this problem by the storing moves in a tree structure. If the move that the player was already searched through, then the sub-tree could be extracted. Then the leaf-nodes could be extended through.

Something I’d like to add in the algorithm one day!

I think a level concurrency would immensely reduce the computation time! Just as multiple threads can simultaneously divide-and-conquer sorting & searching algorithms and then merge the results — multiple worker threads should be able also to build the search-tree simultaneously.

I’ll write another article once I implement this feature for sure. Please stay tuned till then!!!

The Minimax is primarly used in chess engines like Stockfish! It can be used in almost any two-player board game!

read original article at https://hackernoon.com/optimizing-decision-making-with-the-minimax-ai-algorithm-ecf7062750df?source=rss——artificial_intelligence-5