Negamax chess algorithm: How to use final return?

2018-06-30 08:21:50

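I've written a negamax algorithm for a chess-like game, and I want to know how to use the final board value it returns. I understand that the value returned by negamax represents what the board will be worth after the player makes their best possible move, but that isn't exactly useful information. I need to know what that move is, not what it is worth.

Here is the code: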

public int negamax(Match match, int depth, int alpha, int beta, int color) {
    if(depth == 0) {
        return color*stateScore(match);
    }

    ArrayList<Match> matches = getChildren(match, color);

    if(matches.size() == 0) {
        return color*stateScore(match);
    }

    int bestValue = Integer.MIN_VALUE;

    for(int i = 0; i != matches.size(); i++) {
        int value = -negamax(matches.get(i), depth-1, -beta, -alpha, -color);

        if(value > bestValue) {
            bestValue = value;
        }

        if(value > alpha) {
            alpha = value;
        }

        if(alpha >= beta) {
            break;
        }
    }

    return bestValue;
}

public void getBestMove(Match match, int color) {

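    // -Integer.MAX_VALUE instead of Integer.MIN_VALUE: negating MIN_VALUE overflows back to MIN_VALUE
    int bestValue = negamax(match, 4, -Integer.MAX_VALUE, Integer.MAX_VALUE, color);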

    // What to do with bestValue???

}

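I thought of re-evaluating the children of the current match state after bestValue is determined, then iterating through them to find which child has a stateScore equal to bestValue. But that wouldn't work, because many of them will have the same stateScore anyway; what counts is what they can lead to...

I can see you're doing a negamax search with alpha-beta pruning. Your algorithm is well known, but you're missing a key part.

Let me sketch the basic algorithm for chess search. It applies even to Stockfish (one of the strongest engines in the world).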

search(Position p, alpha, beta) {

    if (leaf node)
        return qsearch(p, alpha, beta)

    if (need to do move reduction)
        do_move_reduction_and_cut_off(p)

    moves = generate_moves(p)

    for_each(move in moves) {
        p.move(move)
        v = -search(p, -beta, -alpha)
        p.undo(move)

        store the score and move into a hash table

        if (v > alpha)
            alpha = v      // best score so far; this move is the best move

        if (v >= beta)
            break          // cutoff
    }

    return alpha
}

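This is only a very brief sketch, but chess search algorithms generally follow it. Compare it with your version: do you notice that you never call p.move(move) and p.undo(move)?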

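Basically, the traditional approach generates a list of moves for a given position, then loops over the moves: play a move, search the resulting position, and undo the move. If you do that, you know exactly which move produces which score.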
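As a rough illustration of that loop, here is a minimal sketch of a root search that remembers which move produced the best score. It does not use the asker's Match class; NimPosition, RootSearch, INF and WIN are hypothetical names, and the toy game (take 1 to 3 stones, whoever takes the last stone wins) only stands in for a real chess position.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy position: players alternately take 1 to 3 stones,
// and whoever takes the last stone wins.
class NimPosition {
    int stones;

    NimPosition(int stones) { this.stones = stones; }

    List<Integer> generateMoves() {
        List<Integer> moves = new ArrayList<>();
        for (int take = 1; take <= 3 && take <= stones; take++) {
            moves.add(take);
        }
        return moves;
    }

    void move(int take) { stones -= take; }
    void undo(int take) { stones += take; }
}

class RootSearch {
    static final int INF = 1_000_000; // finite bound, so negating it cannot overflow
    static final int WIN = 1_000;     // assumed score for a won game

    // Score from the point of view of the side to move.
    static int negamax(NimPosition p, int depth, int alpha, int beta) {
        List<Integer> moves = p.generateMoves();
        if (moves.isEmpty()) {
            return -WIN; // opponent took the last stone, so the side to move has lost
        }
        if (depth == 0) {
            return 0; // placeholder static evaluation for this toy game
        }
        int best = -INF;
        for (int m : moves) {
            p.move(m);
            int value = -negamax(p, depth - 1, -beta, -alpha);
            p.undo(m);
            if (value > best) best = value;
            if (value > alpha) alpha = value;
            if (alpha >= beta) break; // cutoff
        }
        return best;
    }

    // Same loop one level up, but it keeps the move instead of only the score.
    static int getBestMove(NimPosition p, int depth) {
        int bestMove = -1;
        int alpha = -INF;
        for (int m : p.generateMoves()) {
            p.move(m);
            int value = -negamax(p, depth - 1, -INF, -alpha);
            p.undo(m);
            if (value > alpha) { // strictly better than anything seen so far
                alpha = value;
                bestMove = m;
            }
        }
        return bestMove;
    }
}
```

With 5 stones, RootSearch.getBestMove(new NimPosition(5), 10) picks the move that takes 1 stone, leaving the opponent with 4. The strict comparison at the root matters: a later move whose search was cut off only returns a bound, so it must not replace an earlier move with the same score.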

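Also note the line that stores the move and the score in a hash table. If you do that, you can easily reconstruct the entire principal variation from the root node.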
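The hash-table idea could be sketched roughly as follows. This is a simplified illustration, not how any particular engine does it: PvDemo, Entry and table are hypothetical names, the position is reduced to a stone count in the same kind of toy game (take 1 to 3 stones, last stone wins), and alpha-beta is left out so that every stored score is exact. A chess engine would key the table by a hash of the position instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PvDemo {
    static final int WIN = 1_000;     // assumed score for a won game
    static final int INF = 1_000_000; // finite bound, safe to negate

    // Hypothetical table entry: the exact score and the move that achieved it.
    static final class Entry {
        final int score;
        final int move;
        Entry(int score, int move) { this.score = score; this.move = move; }
    }

    // Keyed by stones left; a real engine would use a position hash here.
    static final Map<Integer, Entry> table = new HashMap<>();

    // Plain negamax (no pruning), score from the side to move's point of view.
    static int negamax(int stones) {
        if (stones == 0) {
            return -WIN; // opponent took the last stone
        }
        Entry cached = table.get(stones);
        if (cached != null) {
            return cached.score;
        }
        int best = -INF;
        int bestMove = -1;
        for (int take = 1; take <= 3 && take <= stones; take++) {
            int value = -negamax(stones - take);
            if (value > best) {
                best = value;
                bestMove = take;
            }
        }
        table.put(stones, new Entry(best, bestMove)); // store the score and the move
        return best;
    }

    // Walk the table from the root, following the stored best move each time.
    static List<Integer> principalVariation(int stones) {
        List<Integer> pv = new ArrayList<>();
        while (stones > 0) {
            Entry e = table.get(stones);
            if (e == null) break; // position was never searched
            pv.add(e.move);
            stones -= e.move;
        }
        return pv;
    }
}
```

After calling PvDemo.negamax(5), PvDemo.principalVariation(5) returns the line of play stored in the table, starting with the root's best move. With alpha-beta enabled, entries written at cutoff nodes hold bounds rather than exact scores, so a real table usually records which kind of score each entry contains.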

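I don't know what exactly your Java class Match is, but in any case your attempt is close, just not the classical way of doing the search. Remember that the search algorithm should be given a position object; you are giving it a Match object instead, which is wrong.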
