Minimax algorithm does not work correctly


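I am currently trying to apply the Minimax algorithm to a Tic-Tac-Toe game. So far the algorithm does not behave the way I need it to: the AI does not choose the best move available in the game.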

import math


board = [[0, 0, 0],[0, 0, 0],[0, 0, 0]]
def displayTable():
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')
    for i in range(4):
        if i == 3:
            print('|')
            break
        if board[0][i] == 0:
            print('|         ', end = '')
        else:
            print('|   ', board[0][i], '   ', end = '')
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')
    for i in range(4):
        if i == 3:
            print('|')
            break
        if board[1][i] == 0:
            print('|         ', end = '')
        else:
            print('|   ', board[1][i], '   ', end = '')
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')
    for i in range(4):
        if i == 3:
            print('|')
            break
        if board[2][i] == 0:
            print('|         ', end = '')
        else:
            print('|   ', board[2][i], '   ', end = '')
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')

def getUserMove():
    print("Enter the x, y coordinates of the space you want to use (Hint: coordinates are 0 indexed):")
    usermove = input("Separate your coordinates with a comma, no spaces: ")
    userArray = usermove.split(',')
    while True:
        if int(userArray[0]) > 2 or int(userArray[1]) > 2:
            print('That spot is out of range! Try to pick again!')
            usermove = input("Separate your coordinates with a comma, no spaces: ")
            userArray = usermove.split(',')
        elif board[int(userArray[0])][int(userArray[1])] != 0:
            print('That spot is already taken! Try to pick again!')
            usermove = input("Separate your coordinates with a comma, no spaces: ")
            userArray = usermove.split(',')
        else:
            board[int(userArray[0])][int(userArray[1])] = 'X'
            break

def terminalState(gameboard):
    for i in range(3):
        if gameboard[i][0] == 'X' and gameboard[i][1] == 'X' and gameboard[i][2] == 'X':
            return 1
        if gameboard[i][0] == 'O' and gameboard[i][1] == 'O' and gameboard[i][2] == 'O':
            return -1
        if gameboard[0][i] == 'X' and gameboard[1][i] == 'X' and gameboard[2][i] == 'X':
            return 1
        if gameboard[0][i] == 'O' and gameboard[1][i] == 'O' and gameboard[2][i] == 'O':
            return -1
    
    if gameboard[0][0] == 'X' and gameboard[1][1] == 'X' and gameboard[2][2] == 'X':
        return 1
    if gameboard[0][0] == 'O' and gameboard[1][1] == 'O' and gameboard[2][2] == 'O':
        return -1
    if gameboard[2][0] == 'X' and gameboard[1][1] == 'X' and gameboard[0][2] == 'X':
        return 1
    if gameboard[2][0] == 'O' and gameboard[1][1] == 'O' and gameboard[0][2] == 'O':
        return -1
    
    return 0



def aiMove(board):
    aiscore = -math.inf
    aimovex = -1
    aimovey = -1
    
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'O'
                newscore = minimax(board, 0, False)
                board[i][j] = 0
                if newscore > aiscore:
                    aiscore = newscore
                    aimovex = i
                    aimovey = j
    board[aimovex][aimovey] = 'O'

def maxValue(board, depth):
    m = -math.inf
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'O'
                score = minimax(board, depth + 1, False)
                board[i][j] = 0
                m = max(m, score)
    return m

def minValue(board, depth):
    m = math.inf
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'X'
                score = minimax(board, depth + 1, True)
                board[i][j] = 0
                m = min(m, score)
    return m

def minimax(board, curDepth, isMaximizing):
    if terminalState(board) != 0:
        return terminalState(board)
    if isMaximizing:
        return maxValue(board, curDepth)
    else:
        return minValue(board, curDepth)

displayTable()
while terminalState(board) == 0:
    getUserMove()
    if terminalState(board) != 0:
        displayTable()
        break
    aiMove(board)
    displayTable()
if terminalState(board) == 1:
    print('Congratulations! You won!')
elif terminalState(board) == -1:
    print('You tried your best. Thank you for playing')
else:
    print('Tie game. Thank you for playing')

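This is the code for the game I am trying to make. Apart from the logic error in the Minimax algorithm, everything runs without errors. I have tried changing the terminalState and aiMove functions, but I keep getting the same result.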

python algorithm debugging artificial-intelligence minimax
1 Answer

The main problem is that you do not handle the terminal state of a tie, which causes minimax to return an infinite score (because no valid move is found).
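To see where the infinite score comes from, here is a minimal sketch. The helper max_over_empty_cells is a hypothetical stand-in for the loop in maxValue, and the board is just one example of a full board without a winner:

```python
import math

# A full board with no winner (a tie); 0 would mark an empty cell.
full_board = [['X', 'O', 'X'],
              ['X', 'O', 'O'],
              ['O', 'X', 'X']]

def max_over_empty_cells(board):
    """Mimics the loop in maxValue: start at -inf, update only on empty cells."""
    best = -math.inf
    for row in board:
        for cell in row:
            if cell == 0:
                best = max(best, 0)  # placeholder for the recursive score
    return best

print(max_over_empty_cells(full_board))  # -inf, because no cell is empty
```

Since the tie is not recognised as terminal, the search reaches this loop with nothing to iterate over, and the initial -inf (or inf in minValue) is passed back up as if it were a real score.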

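Note that your terminalState can only return three values, -1, 0 and 1, but there are four possible states: X wins, O wins, a tie, and not yet terminated! Wherever you call terminalState, you treat the value 0 as meaning that the game is not over yet. At the same time, your main code has a branch that prints "Tie game", but that branch can never be reached: the main loop does not end while the state is 0.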

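Secondly, since the value returned by terminalState is used by minimax as the score, its sign must agree with the maximizing principle. Everywhere in your code you treat 'O' as the maximizing player, except in terminalState, which returns -1 when 'O' wins. minimax then uses that value as the score, which is clearly wrong. It should be the other way around.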
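A small illustration of why the sign matters. The two candidate moves and their outcomes are made up for the example; only the score values come from the code under discussion:

```python
# Hypothetical outcomes of two candidate moves for 'O' (illustration only).
outcomes = {"move_a": "O wins", "move_b": "X wins"}

original_scores = {"X wins": 1, "O wins": -1}   # signs as in the question
corrected_scores = {"X wins": -1, "O wins": 1}  # signs flipped so 'O' maximizes

def pick(scores):
    # Like aiMove: keep the move with the highest score.
    return max(outcomes, key=lambda move: scores[outcomes[move]])

print(pick(original_scores))   # move_b
print(pick(corrected_scores))  # move_a
```

With the original signs, the maximizing player prefers the move that lets 'X' win; with the flipped signs it prefers its own win.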

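To solve both problems, flip the sign of the values returned by terminalState, and add code to it that detects a tie. Distinguish a tie from an unfinished game: return 0 for a tie, and None while the game is not over yet.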

So:

def terminalState(gameboard):
    for i in range(3):
        if gameboard[i][0] == 'X' and gameboard[i][1] == 'X' and gameboard[i][2] == 'X':
            return -1  # Altered sign, here and below
        if gameboard[i][0] == 'O' and gameboard[i][1] == 'O' and gameboard[i][2] == 'O':
            return 1
        if gameboard[0][i] == 'X' and gameboard[1][i] == 'X' and gameboard[2][i] == 'X':
            return -1
        if gameboard[0][i] == 'O' and gameboard[1][i] == 'O' and gameboard[2][i] == 'O':
            return 1

    if gameboard[0][0] == 'X' and gameboard[1][1] == 'X' and gameboard[2][2] == 'X':
        return -1
    if gameboard[0][0] == 'O' and gameboard[1][1] == 'O' and gameboard[2][2] == 'O':
        return 1
    if gameboard[2][0] == 'X' and gameboard[1][1] == 'X' and gameboard[0][2] == 'X':
        return -1
    if gameboard[2][0] == 'O' and gameboard[1][1] == 'O' and gameboard[0][2] == 'O':
        return 1

    if any(0 in row for row in gameboard):  # Is there a free cell?
        return None  # not terminal!

    return 0  # It's a tie

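Then there are three places where the value returned by terminalState is compared with 0: the check at the top of minimax, the condition of the main while loop, and the check right after getUserMove(). Change these so that they compare with None instead.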
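As a rough sketch of how the pieces could fit together after the change: the version below is a condensed rewrite, not your original code. The line-based win check, the merged min/max loop and the returned move tuple are my own simplifications; the win check is intended to give the same results as the function above for boards that can occur in play.

```python
import math

def terminalState(board):
    """Return 1 if 'O' won, -1 if 'X' won, 0 for a tie, None if not finished."""
    lines = [[(r, c) for c in range(3)] for r in range(3)]   # rows
    lines += [[(r, c) for r in range(3)] for c in range(3)]  # columns
    lines += [[(i, i) for i in range(3)],                    # main diagonal
              [(i, 2 - i) for i in range(3)]]                # anti-diagonal
    for line in lines:
        marks = {board[r][c] for r, c in line}
        if marks == {'O'}:
            return 1
        if marks == {'X'}:
            return -1
    if any(0 in row for row in board):
        return None  # a free cell is left: not terminal
    return 0         # full board, no winner: tie

def minimax(board, isMaximizing):
    state = terminalState(board)
    if state is not None:  # was: != 0
        return state
    best = -math.inf if isMaximizing else math.inf
    choose = max if isMaximizing else min
    mark = 'O' if isMaximizing else 'X'
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = mark
                best = choose(best, minimax(board, not isMaximizing))
                board[i][j] = 0
    return best

def aiMove(board):
    """Place 'O' on the best cell and return its (row, column)."""
    best_score, best_move = -math.inf, None
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'O'
                score = minimax(board, False)
                board[i][j] = 0
                if score > best_score:
                    best_score, best_move = score, (i, j)
    board[best_move[0]][best_move[1]] = 'O'
    return best_move

# The main loop needs the same kind of change:
#   while terminalState(board) is None: ...
#   if terminalState(board) is not None: ...

# 'X' threatens to complete the top row, so 'O' has to block at (0, 2).
board = [['X', 'X', 0],
         [0, 'O', 0],
         [0, 0, 0]]
print(aiMove(board))  # (0, 2)
```

Note the explicit comparison with None: a truthiness test such as "if terminalState(board):" would not do, because the tie value 0 is falsy just like None.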

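With these changes, your algorithm will work. Because the signs are flipped, also swap the final result checks in the main code: 1 now means that the AI ('O') won, and -1 means that you ('X') won.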
