在TicTacToe minimax算法中实现alpha beta修剪
在我的方法newminimax49我有一个minimax算法,利用memoization和其他一般的改进,这在这篇文章中建议给我。 该方法使用简单的启发式板评估函数。 我的问题基本上是关于alpha beta修剪,即我的minimax方法是否使用alpha beta修剪。 据我所知,我相信它确实如此,但是我用来实现它的过程似乎太简单而不真实。 此外,其他人建议我使用alpha beta修剪,正如我所说,我曾认为我的minimax方法已经做了,这使我相信我在这里做的是别的。 所以这里是我的newminimax49:
//This method returns a 2 element int array containing the position of the best possible
//next move and the score it yields. Utilizes memoization and supposedly alpha beta
//pruning to achieve better performance. Alpha beta pruning can be seen in lines such as:
/*if(bestScore==-10)
break;*/
//This basically means that if the best score achieved is the best possible score
//achievable then stop exploring the other available moves. Doing thing I believe
//I'm applying the same principle of alpha beta pruning.
public int[] newminimax49(){
int bestScore = (turn == 'O') ? +9 : -9; //X is minimizer, O is maximizer
int bestPos=-1;
int currentScore;
//boardShow();
String stateString = "";
for (int i=0; i<state.length; i++)
stateString += state[i];
int[] oldAnswer = oldAnswers.get(stateString);
if (oldAnswer != null)
return oldAnswer;
if(isGameOver2()!='N'){
//s.boardShow();
bestScore= score();
}
else{
//s.boardShow();
int i=0;
for(int x:getAvailableMoves()){
if(turn=='X'){ //X is minimizer
setX(x);
//boardShow();
//System.out.println(stateID++);
currentScore = newminimax49()[0];
revert(x);
if(i==0){
bestScore = currentScore;
bestPos=x;
if(bestScore==-10)
break;
}
else if(currentScore<bestScore){
bestScore = currentScore;
bestPos=x;
if(bestScore==-10)
break;
}
}
else { //O is maximizer
setO(x);
//boardShow();
//System.out.println(stateID++);
currentScore = newminimax49()[0];
revert(x);
//boardShow();
if(i==0){
bestScore = currentScore;
bestPos=x;
if(bestScore==10)
break;
}
else if(currentScore>bestScore){
bestScore = currentScore;
bestPos = x;
if(bestScore==10)
break;
}
}
i++;
}
}
int[] answer = {bestScore, bestPos};
oldAnswers.put (stateString, answer);
return answer;
}
在我的类State2中使用的字段和构造函数:
private char [] state; //Actual content of the board
private char turn; //Whose turn it is
private Map<String,int[]> oldAnswers; //Used for memoization. It saves every state along with the score it yielded which allows us to stop exploring the children of a certain node if a similar node's score has been previously calculated. The key is the board state(i.e OX------X for example), the int array is a 2 element array containing the score and position of last placed seed of the state.
private Map<Integer, int []> RowCol; //A mapping of positions from a board represented as a normal array to a board represented as a 2d array. For example: The position 0 maps to 0,0 on a 2d array board, 1 maps to 0,1 and so on.
private static int n; //Size of the board
private static int stateID; //An simple incrementer used to show number of recursive calls in the newminiax49 method.
private static int countX, countO; //Number of placed Xs and Os
private static int lastAdded; //Position of last placed seed
private char [][] DDState; //A 2d array representing the board. Contains the same values as state[]. Used for simplicity in functions that check the state of the board.
public State2(int n){
int a=0;
State2.n=n;
state=new char[n*n];
RowCol=new HashMap<Integer, int []>();
countX=0;
countO=0;
//Initializing the board with empty slots
for(int i = 0; i<state.length; i++){
state[i]='-';
}
//Mapping
for(int i=0; i<n; i++){
for(int j=0; j<n; j++){
RowCol.put(a, new int[]{i, j});
a++;
}
}
a=0;
DDState=new char[n][n];
//Initializing the 2d array with the values from state[](empty slots)
for(int i=0; i<n; i++){
for(int j=0; j<n; j++){
DDState[i][j]=state[a];
a++;
}
}
oldAnswers = new HashMap<String,int[]>();
}
补充方法:
getAvailableMoves返回一个数组,其中包含板上的空槽(即可能的下一步移动)。
public int[] getAvailableMoves(){
int count=0;
int i=0;
for(int j=0; j<state.length; j++){
if(state[j]=='-')
count++;
}
int [] availableSlots = new int[count];
for(int j=0; j<state.length; j++){
if(state[j]=='-')
availableSlots[i++]=j;
}
return availableSlots;
}
isGameOver2()只是检查棋盘的当前状态,判断游戏是否结束。 返回一个字符'X','O','D'和'N',分别代表X赢了,O赢了,Draw和Not gameover。
public char isGameOver2(){
char turnOpp;
int count;
if(turn=='X'){
count=countO;
turnOpp='O';
}
else {
count=countX;
turnOpp='X';
}
if(count>=n){
//^No win available if each player has less than n seeds on the board
//Checking begins
//DDState[RowCol.get(lastAdded)[0]][RowCol.get(lastAdded)[1]]=turn;
//Check column for win
for(int i=0; i<n; i++){
if(DDState[i][RowCol.get(lastAdded)[1]]!=turnOpp)
break;
if(i==(n-1)){
//DDState[RowCol.get(x)[0]][RowCol.get(x)[1]]='-';
return turnOpp;
}
}
//Check row for win
for(int i=0; i<n; i++){
if(DDState[RowCol.get(lastAdded)[0]][i]!=turnOpp)
break;
if(i==(n-1)){
//DDState[RowCol.get(x)[0]][RowCol.get(x)[1]]='-';
return turnOpp;
}
}
//Check diagonal for win
if(RowCol.get(lastAdded)[0] == RowCol.get(lastAdded)[1]){
//we're on a diagonal
for(int i = 0; i < n; i++){
if(DDState[i][i] != turnOpp)
break;
if(i == n-1){
//DDState[RowCol.get(x)[0]][RowCol.get(x)[1]]='-';
return turnOpp;
}
}
}
//check anti diagonal
for(int i = 0; i<n; i++){
if(DDState[i][(n-1)-i] != turnOpp)
break;
if(i == n-1){
//DDState[RowCol.get(x)[0]][RowCol.get(x)[1]]='-';
return turnOpp;
}
}
//check for draw
if((countX+countO)==(n*n))
return 'D';
}
return 'N';
}
boardShow,返回板的当前状态的矩阵显示:
public void boardShow(){
if(n==3){
System.out.println(stateID);
for(int i=0; i<=6;i+=3)
System.out.println("["+state[i]+"]"+" ["+state[i+1]+"]"+" ["+state[i+2]+"]");
System.out.println("***********");
}
else {
System.out.println(stateID);
for(int i=0; i<=12;i+=4)
System.out.println("["+state[i]+"]"+" ["+state[i+1]+"]"+" ["+state[i+2]+"]"+" ["+state[i+3]+"]");
System.out.println("***********");
}
}
分数,是一个简单的评估函数,返回一个O赢的+10,一个X赢的-10和一个平局的0:
public int score(){
if(isGameOver2()=='X')
return -10;
else if(isGameOver2()=='O')
return +10;
else
return 0;
}
种子制定者:
//Sets an X at a certain location and updates the turn, countX and lastAdded variables
public void setX(int i){
state[i]='X';
DDState[RowCol.get(i)[0]][RowCol.get(i)[1]]='X';
turn='O';
countX++;
lastAdded=i;
}
//Sets an O at a certain location and updates the turn, countO and lastAdded variables
public void setO(int i){
state[i]='O';
DDState[RowCol.get(i)[0]][RowCol.get(i)[1]]='O';
turn='X';
countO++;
lastAdded=i;
}
还原,只是恢复移动。 例如,如果一个X被放置在位置0,则返回(0)在它的位置设置' - '并更新由setX改变的变量:
public void revert(int i){
state[i]='-';
DDState[RowCol.get(i)[0]][RowCol.get(i)[1]]='-';
if(turn=='X'){
turn = 'O';
countO--;
}
else {
turn = 'X';
countX--;
}
}
那么,这看起来像alpha beta修剪你,如果不是我怎么能做到这一点?
您已经在使用某种“简化”的Alpha-Beta:目前,只要玩家找到胜利的位置,就会进行修剪。
一个合适的AB会传递一个Alpha值和一个Beta值,以确定玩家将达到的最小值和最大值。 在那里,每当得分比对手当前的“最糟糕的情况”更糟糕或者相等时,你就会修剪。
在你的情况下,你不仅可以修剪得分(就像你现在所做的那样),而且在一些0的分数上也是如此。
链接地址: http://www.djcxy.com/p/56315.html上一篇: Implementing alpha beta pruning in a TicTacToe minimax algorithm
下一篇: Alpha Beta prune for chess always returning the first move in the list