I'm trying to make a simple Q-learning AI in GMS2, but whenever I try to update the QTable I keep running into the same problem:
index out of bounds
The project is simple: the AI can move in any direction and needs to touch "obj_goal" while avoiding "obj_obstacles".
Create event:
QTableWidth = 100; // Adjust the size as needed
QTableHeight = 100; // Adjust the size as needed
QTable = ds_grid_create(QTableWidth, QTableHeight);
learningRate = 0.1;
discountFactor = 0.9;
explorationRate = 0.7;
reward = 0;
currentX = x;
currentY = y;
iniX = x;
iniY = y;
spd = 2;
Step event:
function getCurrentState() {
return string(currentX) + string("_") + string(currentY);
}
id_action = choose(0, 1, 2, 3);
currentX = x;
currentY = y;
currentState = getCurrentState();
#region randomize or move based on learning
if random(1) < explorationRate {
// Exploration (random action)
action = choose("up", "down", "left", "right");
} else {
var bestActionIndex = 0;
var bestActionValue = QTable[# currentState, 0];
for (var i = 1; i < QTableHeight; i++) {
var value = QTable[# currentState, i];
if (value > bestActionValue) {
bestActionValue = value;
bestActionIndex = i;
}
}
action = bestActionIndex;
}
#endregion
#region actions
switch (action) {
case "up":
y -= spd;
break;
case "down":
y += spd;
break;
case "left":
x -= spd;
break;
case "right":
x += spd;
break;
}
#endregion
if place_meeting(x, y, obj_objetivo) {
reward += 10;
x = iniX;
y = iniY;
} else if place_meeting(x, y, obj_Obstaculo) {
reward -= 10;
x = iniX;
y = iniY;
}
newState = getCurrentState();
var currentStateIndex = floor(currentState);
var idActionIndex = floor(id_action);
if currentState >= 0 && currentState < QTableWidth && id_action >= 0 && id_action < QTableHeight {
QTable[# currentState, id_action] += learningRate * (reward + discountFactor * QTable[# newState, id_action] - QTable[# currentState, id_action]);
} else {
show_debug_message("index out of bounds.");
}
currentX = x;
currentY = y;
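It should update the QTable so that the AI learns from the rewards it is given.
Since you don't seem to be using any grid-specific functions, you could swap the ds_grid for a 2D array, which will throw a proper error on out-of-bounds access. The initialization would have to be done like this: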
QTable = array_create(QTableWidth);
for (var i = 0; i < QTableWidth; i++) QTable[i] = array_create(QTableHeight, 0);
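Whichever container is used, the indices have to be numbers. In the posted code, getCurrentState() returns a string such as "120_64", which cannot serve as a grid or array index, and that is a likely source of the out-of-bounds message. Below is a minimal sketch of one way to turn a position into an integer state index and update the table. It assumes GMS2.3+ syntax and that the room is split into square cells. The names cellSize, gridCols, gridRows, numStates, numActions, get_state_index, q_update, and stepReward are hypothetical and not from the original post. Note that the update uses the usual Q-learning target (the best value over all actions in the next state), which differs from the posted line that reuses id_action for the next state.

```gml
/// Create event (sketch; all new names here are assumptions)
cellSize   = 32;                                // assumed size of one state cell, in pixels
gridCols   = ceil(room_width / cellSize);
gridRows   = ceil(room_height / cellSize);
numStates  = gridCols * gridRows;
numActions = 4;                                 // 0 = up, 1 = down, 2 = left, 3 = right

learningRate   = 0.1;
discountFactor = 0.9;

// One row per state, one column per action, all starting at 0
QTable = array_create(numStates);
for (var i = 0; i < numStates; i++) QTable[i] = array_create(numActions, 0);

/// Maps a position to an integer state index in [0, numStates - 1]
function get_state_index(_x, _y) {
    var cx = clamp(floor(_x / cellSize), 0, gridCols - 1);
    var cy = clamp(floor(_y / cellSize), 0, gridRows - 1);
    return cy * gridCols + cx;
}

/// Applies one Q-learning update for the transition (_s, _a) -> _s2 with reward _r
function q_update(_s, _a, _r, _s2) {
    // Highest Q-value reachable from the next state
    var best = QTable[_s2][0];
    for (var a = 1; a < numActions; a++) {
        best = max(best, QTable[_s2][a]);
    }
    QTable[_s][_a] += learningRate * (_r + discountFactor * best - QTable[_s][_a]);
}
```

In the Step event this could be used roughly as follows, where a is the numeric action index (0 to 3) that was chosen and stepReward is the reward for this step only:

```gml
var s = get_state_index(x, y);     // state before moving
// ... choose a, move the instance, compute stepReward ...
var s2 = get_state_index(x, y);    // state after moving
q_update(s, a, stepReward, s2);
```

Keeping the action as a number everywhere (rather than mixing "up"/"down" strings with indices) means the same value can drive both the movement switch and the table lookup.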