I have a problem with the keras-rl2 DQNAgent model, it adds another dimension to my state for some reason and I get a ValueError


For the last day I have been trying to deal with an error I get in the DQNAgent fit function. This is the error:

ValueError: Error when checking input: expected dense_input to have 2 dimensions, but got array with shape (1, 3, 4)

It is raised inside the dqn.fit function. I'm trying to train the DQNAgent on a custom Flappy Bird env with my own custom state. The error comes from:

DQN_flappy\venv\lib\site-packages\rl\core.py", line 168, in fit
    action = self.forward(observation)

in case that means anything to anyone here. It looks like they just add another dimension in their code for some reason I don't understand, but I assume I went wrong somewhere myself. Here is the environment code (Game() is the game itself; I checked it manually and built a NEAT project with it, so I'm almost certain it is not the problem):

import numpy as np
from gym import Env  # keras-rl2 works with the classic gym API
from gym.spaces import Box, Discrete


class flappy_env(Env):

    def __init__(self):
        self.game = Game()
        self.observation_space = Box(low=np.array([-0.4, -2.0, -1.0, -1.0], dtype=np.float32),
                                     high=np.array([1.0, 2.0, 1.0, 0.5], dtype=np.float32))
        self.action_space = Discrete(2)

    def step(self, action):
        done, score, reward = self.game.play_step(action)
        state = self.game.get_state()
        info = {}
        return state, reward, done, info

    def render(self):
        pass

    def reset(self):
        self.game.reset_game()
        return self.game.get_state()
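
For what it's worth, a quick sanity check on the raw env output (my own check, not part of the training script) gives the flat state I expect:

env = flappy_env()
state = env.reset()
print(state.shape, state.dtype)  # (4,) float32 in my runs -- matches observation_space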

And game.get_state():

def get_state(self):
    if len(self.pipe_group) > 0:
        bird_y_loc = self.flappy.rect.y
        x_dist_pipe_bird = self.pipe_group.sprites()[0].rect.left - self.flappy.rect.right
        bot_pipe_y_loc = self.pipe_group.sprites()[0].rect.top - bird_y_loc
        top_pipe_y_loc = self.pipe_group.sprites()[1].rect.bottom - bird_y_loc
        return np.array([x_dist_pipe_bird / 500, 10 * bot_pipe_y_loc / screen_height,
                         5 * top_pipe_y_loc / screen_height, self.flappy.vel / 35], dtype=np.float32)
    # shouldn't get here
    return None

And here is the model code, in case it helps:

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from rl.agents import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import LinearAnnealedPolicy, EpsGreedyQPolicy
from env import flappy_env


def build_model():
    model = Sequential()
    model.add(Dense(16, input_shape=(4,), activation='relu'))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(2, activation='linear'))
    return model


def build_agent(sequential_model):
    policy = LinearAnnealedPolicy(EpsGreedyQPolicy(), attr='eps', value_max=1., value_min=.1, value_test=.2, nb_steps=10000)
    memory = SequentialMemory(limit=1000, window_length=3)
    dqn = DQNAgent(model=sequential_model, memory=memory, policy=policy,
                   enable_dueling_network=True, dueling_type='avg',
                   nb_actions=2, nb_steps_warmup=1000)
    return dqn


env = flappy_env()
model = build_model()
model.summary()
dqn = build_agent(model)
dqn.compile(Adam(learning_rate=1e-4))
dqn.fit(env, nb_steps=10000, visualize=False)
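
To convince myself where the shapes disagree, I reproduced the mismatch outside keras-rl (a quick sketch; the zeros are just stand-ins for real states):

import numpy as np

ok = np.zeros((1, 4), dtype=np.float32)
model.predict_on_batch(ok)   # fine: Dense(16, input_shape=(4,)) expects (batch, 4)

bad = np.zeros((1, 3, 4), dtype=np.float32)
model.predict_on_batch(bad)  # in my setup this raises the same ValueError as below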

I have been debugging this for hours and couldn't find anything except one odd line in dqn.py (which belongs to keras-rl2):

q_values = self.compute_batch_q_values([state]).flatten()
As you can see, it adds a new dimension. I also noticed that the 3 in the (1, 3, 4) shape is the window_length of my agent's memory. I'm trying to include as much as might be needed, so here is the whole console output as well:

Training for 10000 steps ...
Interval 1 (0 steps performed)
Traceback (most recent call last):
  File "C:\Users\kfir\PycharmProjects\DQN_flappy\model.py", line 33, in <module>
    dqn.fit(env, nb_steps=10000, visualize=False)
  File "C:\Users\kfir\PycharmProjects\DQN_flappy\venv\lib\site-packages\rl\core.py", line 168, in fit
    action = self.forward(observation)
  File "C:\Users\kfir\PycharmProjects\DQN_flappy\venv\lib\site-packages\rl\agents\dqn.py", line 224, in forward
    q_values = self.compute_q_values(state)
  File "C:\Users\kfir\PycharmProjects\DQN_flappy\venv\lib\site-packages\rl\agents\dqn.py", line 68, in compute_q_values
    q_values = self.compute_batch_q_values([state]).flatten()
  File "C:\Users\kfir\PycharmProjects\DQN_flappy\venv\lib\site-packages\rl\agents\dqn.py", line 63, in compute_batch_q_values
    q_values = self.model.predict_on_batch(batch)
  File "C:\Users\kfir\PycharmProjects\DQN_flappy\venv\lib\site-packages\keras\engine\training_v1.py", line 1305, in predict_on_batch
    inputs, _, _ = self._standardize_user_data(
  File "C:\Users\kfir\PycharmProjects\DQN_flappy\venv\lib\site-packages\keras\engine\training_v1.py", line 2652, in _standardize_user_data
    return self._standardize_tensors(
  File "C:\Users\kfir\PycharmProjects\DQN_flappy\venv\lib\site-packages\keras\engine\training_v1.py", line 2693, in _standardize_tensors
    x = training_utils_v1.standardize_input_data(
  File "C:\Users\kfir\PycharmProjects\DQN_flappy\venv\lib\site-packages\keras\engine\training_utils_v1.py", line 712, in standardize_input_data
    raise ValueError(
ValueError: Error when checking input: expected dense_input to have 2 dimensions, but got array with shape (1, 3, 4)
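
My current understanding of where (1, 3, 4) comes from (a sketch of the shapes as I read the library, not its actual code): the memory returns the last window_length = 3 observations stacked together, and the [state] wrapping in compute_batch_q_values then adds the batch axis:

import numpy as np

obs = np.zeros(4, dtype=np.float32)  # one observation from get_state()
state = np.stack([obs, obs, obs])    # window_length = 3 -> (3, 4)
batch = np.array([state])            # the [state] wrap  -> (1, 3, 4)
print(batch.shape)                   # (1, 3, 4), the shape in the error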

By the way, if more files are needed that I haven't provided, I'll add them. Please help me find it, I have tried everything I could think of. Many thanks to everyone who tries to help / reads the question!

I also didn't think it would add any strange dimension.
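
One workaround I'm considering (an untested sketch, based on my reading that the memory hands the model windows of window_length stacked observations): make the model accept (window_length, 4) and flatten it, or alternatively set window_length=1 so the extra axis never appears:

from keras.models import Sequential
from keras.layers import Dense, Flatten


def build_model(window_length=3):
    model = Sequential()
    # accept the stacked window keras-rl feeds: (batch, window_length, 4)
    model.add(Flatten(input_shape=(window_length, 4)))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(2, activation='linear'))
    return model

# or: SequentialMemory(limit=1000, window_length=1) and keep input_shape=(4,)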

reinforcement-learning dqn keras-rl