TicTacToe 使用极小极大和 alpha-beta 剪枝无法做出正确的决定

问题描述 投票:0回答:1

我正在制作一个带有极小极大和 alpha beta 修剪的井字游戏机器人,但它没有达到标准(实际上有点糟糕)。当我尝试玩它时,机器人错过了最明显的动作并失败了。

示例游戏(机器人启动):机器人左上角 => 玩家中上 => 机器人右上角 => 玩家中间 => 机器人中左 => 玩家通过中下获胜

如您所见,机器人错过了最明显的区块。

这是我的代码:

import random, copy, math

COMPUTER_MARK, PLAYER_MARK = "X", "O"
CMP_ALWAYS_STARTS = False
cmp_turn = True
board = [0] * 9
table = {0: " ", 1: COMPUTER_MARK, -1: PLAYER_MARK}

def print_board():
    bd, tb = board, table
    print(f" {tb[bd[0]]} | {tb[bd[1]]} | {tb[bd[2]]}\n-----------\n {tb[bd[3]]} | {tb[bd[4]]} | {tb[bd[5]]}\n-----------\n {tb[bd[6]]} | {tb[bd[7]]} | {tb[bd[8]]}")

def get_legal_moves(board):
    return [i for i, val in enumerate(board) if val == 0]

def win(bd):
    if bd[0] == bd[1] == bd[2] != 0: return bd[0]
    if bd[3] == bd[4] == bd[5] != 0: return bd[3]
    if bd[6] == bd[7] == bd[8] != 0: return bd[6]
    if bd[0] == bd[3] == bd[6] != 0: return bd[0]
    if bd[1] == bd[4] == bd[7] != 0: return bd[1]
    if bd[2] == bd[5] == bd[8] != 0: return bd[2]
    if bd[0] == bd[4] == bd[8] != 0: return bd[0]
    if bd[2] == bd[4] == bd[6] != 0: return bd[2]
    return None

def draw(board):
    return 0 not in board

def player_move():
    global board
    player_move, legal = None, get_legal_moves(board)
    while player_move not in legal:
        player_move = int(input("Make a move (1-9): ")) - 1
    board[player_move] = -1

def computer_move():
    global board
    best_score = -math.inf
    best_move = 42
    for space in get_legal_moves(board):
        board_copy = copy.deepcopy(board); board_copy[space] = 1
        score = minimax_ab(board_copy, -math.inf, math.inf, True)
        if score > best_score:
            best_score = score
            best_move = space
    board[best_move] = 1

def minimax_ab(board, alpha, beta, maximizing_player): # minimax search algorithm with alpha-beta pruning
    if winner := win(board): return winner
    if draw(board): return 0

    if maximizing_player:
        max_eval = -math.inf
        for child in get_legal_moves(board):
            board_copy = copy.deepcopy(board); board_copy[child] = 1
            evaluation = minimax_ab(board_copy, alpha, beta, False)
            max_eval = max(max_eval, evaluation)
            alpha = max(alpha, evaluation)
            if beta <= alpha: break
        return max_eval
    else:
        min_eval = math.inf
        for child in get_legal_moves(board):
            board_copy = copy.deepcopy(board); board_copy[child] = -1
            evaluation = minimax_ab(board_copy, alpha, beta, True)
            min_eval = min(min_eval, evaluation)
            beta = min(beta, evaluation)
            if beta <= alpha: break
        return min_eval

if not CMP_ALWAYS_STARTS and random.randint(0, 1):
    cmp_turn = False
    print_board()

while not (win(board) or draw(board)):
    if cmp_turn:
        print("Computer's move.")
        computer_move()
        print_board()
    else:
        player_move()
        print_board()
    cmp_turn = not cmp_turn

match win(board):
    case -1: print("You win!")
    case 1: print("Computer wins!")
    case None: print("Tie.")

我还尝试将 eval 函数(当我根据哪个玩家获胜返回 -1 或 1 时)更改为: return Winner * len(get_legal_moves(board))

python-3.x bots tic-tac-toe minimax alpha-beta-pruning
1个回答
0
投票

当计算机第一次启动时,计算机玩家会在极小极大搜索中连续两次。打印出棋盘、分数和潜在的动作,你应该能够弄清楚这一点。从近端游戏状态开始调试。您应该能够看到问题的形成。

© www.soinside.com 2019 - 2024. All rights reserved.