Minimax / Alpha Beta for Android Reversi Game

I have to implement a Reversi game for Android. I have managed to implement all the game, is functional, but the problem is that I don't have an AI. In fact, at every move the computer moves in the position that achieves him the highest number of pieces.

I decided to implement and alpha-beta pruning algorithm. I did a lot of research on the internet about it, but I couldn't come to a final conclusion how to do it. I tried to implement a few functions, but I couldn't achieve the desired behaviour.

My board is stored in class Board (inside this class, the pieces occupied by each player are stored in a bi-dimensional int array). I have attached an small diagram (sorry about the way it looks).

DIAGRAM: https://docs.google.com/file/d/0Bzv8B0L32Z8lSUhKNjdXaWsza0E/edit

I need help to figure out how to use the minimax algorithm with my implementation.

What I understood so far, is that I have to make an evaluation function regarding the value of the board.

To calculate the value of the board I have to account the following elements: -free corners (my question is that I have to take care only about the free corners, or the one that I can take at the current move?! dilemma here). -mobility of the board: to check the number of pieces that will be available to move, after the current move. -stability of the board… I know it means the number of pieces that can't be flipped on the board. -the number of pieces the move will offer me

I have in plan to implement a new Class BoardAI that will take as an argument my Board object and the dept.

Can you please tell me a logical flow of ideas how I should implement this AI? I need some help about the recursion while calculating in dept and I don't understand how it calculates the best choice.

Thank you!


First you can check this piece of code for a checkers AI that I wrote years ago. The interesting part is the last function ( alphabeta ). (It's in python but I think you can look at that like pseudocode).

Obviously I cannot teach you all the alpha/beta theory cause it can be a little tricky, but maybe I can give you some practical tips.

Evaluation Function

This is one of the key points for a good min/max alpha/beta algorithm (and for any other informed search algorithm). Write a good heuristic function is the artistic part in AI development. You have to know well the game, talk with expert game player to understand which board features are important to answer the question: How good is this position for player X?

You have already indicated some good features like mobility, stability and free corners. However note that the evaluation function has to be fast cause it will be called a lot of times.

A basic evaluation function is

H = f1 * w1 + f2 * w2 + ... + fn * wn

where f is a feature score (for example the number of free corners) and w is a corresponding weight that say how much the feature f is important in the total score.

There is only one way to find weights value: experience and experiments. ;)

The Basic Algorithm

Now you can start with the algorithm. The first step is understand game tree navigation. In my AI I've just used the principal board like a blackboard where the AI can try the moves.

For example we start with board in a certain configuration B1 .

Step 1: get all the available moves. You have to find all the applicable moves to B1 for a given player. In my code this is done by self.board.all_move(player) . It returns a list of moves.

Step 2: apply the move and start recursion. Assume that the function has returned three moves ( M1 , M2 , M3 ).

  • Take the first moves M1 and apply it to obtain a new board configuration B11.
  • Apply recursively the algorithm on the new configuration (find all the moves applicable in B11, apply them, recursion on the result, ...)
  • Undo the move to restore the B1 configuration.
  • Take the next moves M2 and apply it to obtain a new board configuration B12.
  • And so on.
  • NOTE: The step 3 can be done only if all the moves are reversible. Otherwise you have to find another solution like allocate a new board for each moves.

    In code:

    for mov in moves :
        self.board.apply_action(mov)
        v = max(v, self.alphabeta(alpha, beta, level - 1, self._switch_player(player), weights))
        self.board.undo_last()
    

    Step 3: stop the recursion. This three is very deep so you have to put a search limit to the algorithm. A simple way is to stop the iteration after n levels. For example I start with B1 , max_level=2 and current_level=max_level .

  • From B1 (current_level 2) I apply, for example, the M1 move to obtain B11.
  • From B11 (current_level 1) I apple, for example, the M2 move to obtain B112.
  • B122 is a "current_level 0" board configuration so I stop recursion. I return the evaluation function value applied to B122 and I come back to level 1.
  • In code:

    if level == 0 :
        value = self.board.board_score(weights)
        return value
    

    Now... standard algorithm pseudocode returns the value of the best leaf value. Bu I want to know which move bring me to the best leaf! To do this you have to find a way to map leaf value to moves. For example you can save moves sequences: starting from B1, the sequence (M1 M2 M3) bring the player in the board B123 with value -1; the sequence (M1 M2 M2) bring the player in the board B122 with value 2; and so on... Then you can simply select the move that brings the AI to the best position.

    I hope this can be helpful.

    EDIT: Some notes on alpha-beta . Alpha-Beta algorithm is hard to explain without graphical examples. For this reason I want to link one of the most detailed alpha-beta pruning explanation I've ever found: this one . I think I cannot really do better than that. :)

    The key point is: Alpha-beta pruning adds to MIN-MAX two bounds to the nodes. This bounds can be used to decide if a sub-tree should be expanded or not.

    This bounds are:

  • Alpha: the maximum lower bound of possible solutions.
  • Beta: the minimum upper bound of possible solutions.
  • If, during the computation, we find a situation in which Beta < Alpha we can stop computation for that sub-tree.

    Obviously check the previous link to understand how it works. ;)

    链接地址: http://www.djcxy.com/p/56306.html

    上一篇: Beta修剪递归返回

    下一篇: 适用于Android Reversi游戏的Minimax / Alpha Beta