Too many / not enough values to unpack in an OpenAI Gym Mario model for reinforcement learning

Problem description (votes: 0, answers: 1)

I am using reinforcement learning with OpenAI Gym to build a model that plays Super Mario Bros. I followed Nicholas Renotte's YouTube tutorial for this, but about 10 minutes in I get the error "too many values to unpack (expected 4)" or "not enough values to unpack (expected 5, got 4)".

The error comes from the 4 values returned in the loop, but I think it originates where env is instantiated.
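For reference, a minimal probe (a sketch assuming the same gym_super_mario_bros 7.3.0 setup shown below) that simply prints how many values env.step() actually returns:

import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

# Build the environment the same way as in the notebook below.
env = gym_super_mario_bros.make('SuperMarioBros-v0', apply_api_compatibility=True, render_mode="human")
env = JoypadSpace(env, SIMPLE_MOVEMENT)

env.reset()
result = env.step(env.action_space.sample())
print(len(result))  # 5 under the Gym 0.26+ API exposed by apply_api_compatibility=True, 4 under the older API
env.close()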

From the Jupyter notebook:

#!pip install gym_super_mario_bros==7.3.0 nes_py
import gym_super_mario_bros                                # import the game
from nes_py.wrappers import JoypadSpace                    # import the wrapper
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT   # import the basic movements

# Initialize the game
env = gym_super_mario_bros.make('SuperMarioBros-v0', apply_api_compatibility=True, render_mode="human")
#env = gym_super_mario_bros.make('SuperMarioBros-v0')
# make() creates the chosen environment; you can find more environments on the Gym website.

print(env.action_space)  # shows there are 256 actions (complex)

env = JoypadSpace(env, SIMPLE_MOVEMENT)
# this wraps the environment so that only the simple movement inputs are available

print(env.action_space)  # shows there are 7 available actions (simplified)
print(env.observation_space.shape)
print(env.observation_space)
print(env.action_space.sample())

done = True  # flag used to know when to restart

for step in range(100000):  # loop through each frame in the game
    if done:
        # Start the game
        env.reset()
    state, reward, done, info = env.step(env.action_space.sample())  # do random actions
    # Show the game on the screen
    env.render()

# Close the game
env.close()
python jupyter reinforcement-learning openai-gym
1 Answer

0 votes

The problem is in this line:

state, reward, done, info = env.step(env.action_space.sample())

You are trying to unpack the result of env.step into 4 variables instead of 5. Take a look at the documentation of the step function here.

Change it to this:

state, reward, done, truncated, info = env.step(env.action_space.sample())
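Under the Gym 0.26+ API (an assumption based on the apply_api_compatibility=True flag used in the question), the third and fourth return values are conventionally named terminated and truncated, and the restart flag should combine them. A sketch of the whole loop with that in mind:

done = True  # flag used to know when to restart
for step in range(100000):
    if done:
        env.reset()
    # step() returns five values under the Gym 0.26+ API
    state, reward, terminated, truncated, info = env.step(env.action_space.sample())
    done = terminated or truncated  # the episode ends when either flag is set
    env.render()
env.close()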