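I am currently trying to apply the Minimax algorithm to a Tic-Tac-Toe game. So far the algorithm does not behave the way I need it to, because the AI does not pick the best move available.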
import math

board = [[0, 0, 0],[0, 0, 0],[0, 0, 0]]

def displayTable():
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')
    for i in range(4):
        if i == 3:
            print('|')
            break
        if board[0][i] == 0:
            print('| ', end = '')
        else:
            print('| ', board[0][i], ' ', end = '')
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')
    for i in range(4):
        if i == 3:
            print('|')
            break
        if board[1][i] == 0:
            print('| ', end = '')
        else:
            print('| ', board[1][i], ' ', end = '')
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')
    for i in range(4):
        if i == 3:
            print('|')
            break
        if board[2][i] == 0:
            print('| ', end = '')
        else:
            print('| ', board[2][i], ' ', end = '')
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')

def getUserMove():
    print("Enter the x, y coordinates of the space you want to use (Hint: coordinates are 0 indexed):")
    usermove = input("Separate your coordinates with a comma, no spaces: ")
    userArray = usermove.split(',')
    while True:
        if int(userArray[0]) > 2 or int(userArray[1]) > 2:
            print('That spot is out of range! Try to pick again!')
            usermove = input("Separate your coordinates with a comma, no spaces: ")
            userArray = usermove.split(',')
        elif board[int(userArray[0])][int(userArray[1])] != 0:
            print('That spot is already taken! Try to pick again!')
            usermove = input("Separate your coordinates with a comma, no spaces: ")
            userArray = usermove.split(',')
        else:
            board[int(userArray[0])][int(userArray[1])] = 'X'
            break

def terminalState(gameboard):
    for i in range(3):
        if gameboard[i][0] == 'X' and gameboard[i][1] == 'X' and gameboard[i][2] == 'X':
            return 1
        if gameboard[i][0] == 'O' and gameboard[i][1] == 'O' and gameboard[i][2] == 'O':
            return -1
        if gameboard[0][i] == 'X' and gameboard[1][i] == 'X' and gameboard[2][i] == 'X':
            return 1
        if gameboard[0][i] == 'O' and gameboard[1][i] == 'O' and gameboard[2][i] == 'O':
            return -1
    if gameboard[0][0] == 'X' and gameboard[1][1] == 'X' and gameboard[2][2] == 'X':
        return 1
    if gameboard[0][0] == 'O' and gameboard[1][1] == 'O' and gameboard[2][2] == 'O':
        return -1
    if gameboard[2][0] == 'X' and gameboard[1][1] == 'X' and gameboard[0][2] == 'X':
        return 1
    if gameboard[2][0] == 'O' and gameboard[1][1] == 'O' and gameboard[0][2] == 'O':
        return -1
    return 0

def aiMove(board):
    aiscore = -math.inf
    aimovex = -1
    aimovey = -1
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'O'
                newscore = minimax(board, 0, False)
                board[i][j] = 0
                if newscore > aiscore:
                    aiscore = newscore
                    aimovex = i
                    aimovey = j
    board[aimovex][aimovey] = 'O'

def maxValue(board, depth):
    m = -math.inf
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'O'
                score = minimax(board, depth + 1, False)
                board[i][j] = 0
                m = max(m, score)
    return m

def minValue(board, depth):
    m = math.inf
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'X'
                score = minimax(board, depth + 1, True)
                board[i][j] = 0
                m = min(m, score)
    return m

def minimax(board, curDepth, isMaximizing):
    if terminalState(board) != 0:
        return terminalState(board)
    if isMaximizing:
        return maxValue(board, curDepth)
    else:
        return minValue(board, curDepth)

displayTable()
while terminalState(board) == 0:
    getUserMove()
    if terminalState(board) != 0:
        displayTable()
        break
    aiMove(board)
    displayTable()
if terminalState(board) == 1:
    print('Congratulations! You won!')
elif terminalState(board) == -1:
    print('You tried your best. Thank you for playing')
else:
    print('Tie game. Thank you for playing')

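This is the code for the game I am trying to make. Everything runs without errors, apart from the logic error in the Minimax algorithm. I have tried changing the terminalState and aiMove functions, but I keep getting the same result.
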
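The main problem is that you do not handle the terminal state of a draw, which leads to an infinite score being returned in minimax (because no valid move is found).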
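
To make the infinite score concrete, here is a small sketch. It assumes a condensed restatement of the question's terminalState and minimax logic (the names LINES, terminal_state_original and minimax_original are made up for this illustration, and maxValue/minValue are folded into one function) and runs it on a full board that nobody has won.

```python
import math

# All eight winning lines: three rows, three columns, two diagonals.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(0, 0), (1, 1), (2, 2)], [(2, 0), (1, 1), (0, 2)]]
)

def terminal_state_original(b):
    # Same return values as in the question: 1 if X wins, -1 if O wins, else 0.
    for line in LINES:
        cells = [b[r][c] for r, c in line]
        if cells == ['X'] * 3:
            return 1
        if cells == ['O'] * 3:
            return -1
    return 0

def minimax_original(b, is_maximizing):
    if terminal_state_original(b) != 0:
        return terminal_state_original(b)
    best = -math.inf if is_maximizing else math.inf
    for i in range(3):
        for j in range(3):
            if b[i][j] == 0:
                b[i][j] = 'O' if is_maximizing else 'X'
                score = minimax_original(b, not is_maximizing)
                b[i][j] = 0
                best = max(best, score) if is_maximizing else min(best, score)
    return best

# A full board without three in a row: a draw.
drawn = [['X', 'O', 'X'],
         ['X', 'O', 'O'],
         ['O', 'X', 'X']]

# No empty cell is found, so the initial value comes back unchanged: math.inf.
result = minimax_original(drawn, False)
```

Because that infinity then propagates up through min and max, the scores the AI compares at the top level no longer reflect who actually wins.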
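Note that your terminalState can only return three values, -1, 0 and 1, but there are four possible states: X wins, O wins, a draw, and not yet terminal! Wherever you call terminalState, you treat the value 0 as meaning that the game is not over yet. Your main code does have a branch that prints "Tie game", but it can never be reached: as long as the state is 0, the main loop does not end.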
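Second, because the value returned by terminalState is used by minimax as a score, its sign has to agree with the maximizing principle. Everywhere in your code you treat 'O' as the maximizing player, except in terminalState, which returns -1 when 'O' wins. minimax then uses that value as the score, which is clearly wrong. It should be the other way around.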
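To fix both problems, flip the sign of the values returned by terminalState, and add code to it that detects a draw. Distinguish a draw from an unfinished game: return 0 for a draw and None when the game is not over yet.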
So:

def terminalState(gameboard):
    for i in range(3):
        if gameboard[i][0] == 'X' and gameboard[i][1] == 'X' and gameboard[i][2] == 'X':
            return -1  # Altered sign, here and below
        if gameboard[i][0] == 'O' and gameboard[i][1] == 'O' and gameboard[i][2] == 'O':
            return 1
        if gameboard[0][i] == 'X' and gameboard[1][i] == 'X' and gameboard[2][i] == 'X':
            return -1
        if gameboard[0][i] == 'O' and gameboard[1][i] == 'O' and gameboard[2][i] == 'O':
            return 1
    if gameboard[0][0] == 'X' and gameboard[1][1] == 'X' and gameboard[2][2] == 'X':
        return -1
    if gameboard[0][0] == 'O' and gameboard[1][1] == 'O' and gameboard[2][2] == 'O':
        return 1
    if gameboard[2][0] == 'X' and gameboard[1][1] == 'X' and gameboard[0][2] == 'X':
        return -1
    if gameboard[2][0] == 'O' and gameboard[1][1] == 'O' and gameboard[0][2] == 'O':
        return 1
    if any(0 in row for row in gameboard):  # Is there a free cell?
        return None  # not terminal!
    return 0  # It's a tie

Then there are three places where the value returned by terminalState is compared with 0. Change those places so that they compare with None instead.
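With these changes, your code will work. One follow-up: because 1 now means that O won, the final if/elif that prints the result should swap its two comparisons as well, otherwise the win and loss messages come out reversed.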
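
As a rough sketch of how the None comparison might look once everything is put together, here is a self-contained version. It is an illustration rather than a drop-in replacement: LINES, terminal_state, the two-argument minimax and best_move are names and shapes assumed for this example, the win checks are condensed into a loop over the eight lines, and best_move returns the chosen cell instead of writing it to the board the way aiMove does.

```python
import math

# All eight winning lines: three rows, three columns, two diagonals.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]
    + [[(r, c) for r in range(3)] for c in range(3)]
    + [[(0, 0), (1, 1), (2, 2)], [(2, 0), (1, 1), (0, 2)]]
)

def terminal_state(b):
    """Return 1 if O wins, -1 if X wins, 0 for a tie, None if the game is open."""
    for line in LINES:
        cells = [b[r][c] for r, c in line]
        if cells == ['O'] * 3:
            return 1
        if cells == ['X'] * 3:
            return -1
    if any(0 in row for row in b):
        return None
    return 0

def minimax(b, is_maximizing):
    state = terminal_state(b)
    if state is not None:  # compare with None, not with 0
        return state
    best = -math.inf if is_maximizing else math.inf
    for i in range(3):
        for j in range(3):
            if b[i][j] == 0:
                b[i][j] = 'O' if is_maximizing else 'X'
                score = minimax(b, not is_maximizing)
                b[i][j] = 0
                best = max(best, score) if is_maximizing else min(best, score)
    return best

def best_move(b):
    """Pick the empty cell with the highest minimax score for O."""
    best_score, move = -math.inf, None
    for i in range(3):
        for j in range(3):
            if b[i][j] == 0:
                b[i][j] = 'O'
                score = minimax(b, False)  # X replies next
                b[i][j] = 0
                if score > best_score:
                    best_score, move = score, (i, j)
    return move

# X threatens to complete the top row, so O has to block at (0, 2).
position = [['X', 'X', 0],
            [0, 'O', 0],
            [0, 0, 0]]
chosen = best_move(position)
```

The main loop would follow the same pattern, with `while terminalState(board) is None:` as its condition and `is not None` for the check after the user's move.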