Implementing Alpha Beta into Minimax
I'm trying to add Alpha Beta pruning into my minimax, but I can't understand where I'm going wrong.
At the moment I'm going through 5,000 iterations, where I should be going through approximately 16,000 according to a friend. When choosing the first position, it is returning -1 (a loss) whereas it should be able to definitely return a 0 at this point (a draw) as it should be able to draw from an empty board, however I can't see where I'm going wrong as I follow my code it seems to be fine
Strangely if I switch returning Alpha and Beta inside my checks (to achieve returning 0) the computer will attempt to draw but never initiate any winning moves, only blocks
My logical flow
If we are looking for alpha: If the score > alpha, change alpha. if alpha and beta are overlapping, return alpha
If we are looking for beta: If the score < beta, change beta. if alpha and beta are overlapping, return beta
Here is my Recursive call
int MinimaxAB(TGameBoard* GameBoard, int iPlayer, bool _bFindAlpha, int _iAlpha, int _iBeta)
{
//How is the position like for player (their turn) on iGameBoard?
int iWinner = CheckForWin(GameBoard);
bool bFull = CheckForFullBoard(GameBoard);
//If the board is full or there is a winner on this board, return the winner
if(iWinner != NONE || bFull == true)
{
//Will return 1 or -1 depending on winner
return iWinner*iPlayer;
}
//Initial invalid move (just follows i in for loop)
int iMove = -1;
//Set the score to be instantly beaten
int iScore = INVALID_SCORE;
for(int i = 0; i < 9; ++i)
{
//Check if the move is possible
if(GameBoard->iBoard[i] == 0)
{
//Put the move in
GameBoard->iBoard[i] = iPlayer;
//Recall function
int iBestPositionSoFar = -MinimaxAB(GameBoard, Switch(iPlayer), !_bFindAlpha, _iAlpha, _iBeta);
//Replace Alpha and Beta variables if they fit the conditions - stops checking for situations that will never happen
if (_bFindAlpha == false)
{
if (iBestPositionSoFar < _iBeta)
{
//If the beta is larger, make the beta smaller
_iBeta = iBestPositionSoFar;
iMove = i;
if (_iAlpha >= _iBeta)
{
GameBoard->iBoard[i] = EMPTY;
//If alpha and beta are overlapping, exit the loop
++g_iIterations;
return _iBeta;
}
}
}
else
{
if (iBestPositionSoFar > _iAlpha)
{
//If the alpha is smaller, make the alpha bigger
_iAlpha = iBestPositionSoFar;
iMove = i;
if (_iAlpha >= _iBeta)
{
GameBoard->iBoard[i] = EMPTY;
//If alpha and beta are overlapping, exit the loop
++g_iIterations;
return _iAlpha;
}
}
}
//Remove the move you just placed
GameBoard->iBoard[i] = EMPTY;
}
}
++g_iIterations;
if (_bFindAlpha == true)
{
return _iAlpha;
}
else
{
return _iBeta;
}
}
Initial call (when computer should choose a position)
int iMove = -1; //Invalid
int iScore = INVALID_SCORE;
for(int i = 0; i < 9; ++i)
{
if(GameBoard->iBoard[i] == EMPTY)
{
GameBoard->iBoard[i] = CROSS;
int tempScore = -MinimaxAB(GameBoard, NAUGHT, true, -1000000, 1000000);
GameBoard->iBoard[i] = EMPTY;
//Choosing best value here
if (tempScore > iScore)
{
iScore = tempScore;
iMove = i;
}
}
}
//returns a score based on Minimax tree at a given node.
GameBoard->iBoard[iMove] = CROSS;
Any help regarding my logical flow that would make the computer return the correct results and make intelligent moves would be appreciated
Does your algorithm work perfectly without alpha-beta pruning? Your initial call should be given with false
for _bFindAlpha
as the root node behaves like an alpha node, but it doesn't look like this will make a difference:
int tempScore = -MinimaxAB(GameBoard, NAUGHT, false, -1000000, 1000000);
Thus I will recommend for you to abandon this _bFindAlpha
nonsense and convert your algorithm to negamax. It behaves identically to minimax but makes your code shorter and clearer. Instead of checking whether to maximize alpha or minimize beta, you can just swap and negate when recursively invoking (this is the same reason you can return the negated value of the function right now). Here's a slightly edited version of the Wikipedia pseudocode:
function negamax(node, α, β, player)
if node is a terminal node
return color * the heuristic value of node
else
foreach child of node
val := -negamax(child, -β, -α, -player)
if val ≥ β
return val
if val > α
α := val
return α
Unless you love stepping through search trees, I think that you will find it easier to just write a clean, correct version of negamax than debug your current implementation.
链接地址: http://www.djcxy.com/p/56340.html上一篇: minimax算法返回与alpha beta修剪不同的值
下一篇: 将Alpha Beta纳入Minimax