I'm running into a training slowdown in a Q-learning model implemented with TensorFlow. I've stripped my code down to focus on the training loop and on saving the model after every episode. The problem is that training gets noticeably slower in later episodes.

I'm using a Q-learning agent with a convolutional neural network (CNN) architecture. The model is saved at the end of each episode, and I keep training the same in-memory model without loading the saved file in the next episode.

Here is a stripped-down version of the relevant code:
```python
import numpy as np
import tensorflow as tf

MAX_EPISODES = 50
CONTINUE = True

class QLearningAgent:
    def __init__(self, state_size, action_size):
        self.state_size = state_size
        self.action_size = action_size
        self.epsilon = 0.9
        self.epsilon_decay = 0.995
        self.epsilon_min = 0.1
        self.learning_rate = 0.01
        self.gamma = 0.95
        self.model = self.build_model()

    def build_model(self):
        # Your model architecture here
        ...

    def act(self, state):
        # Epsilon-greedy action selection
        ...

    def train(self, state, action, reward, next_state, done):
        # Q-learning training logic
        ...

    def save_model(self, filename):
        self.model.save(filename)

    def update_epsilon(self):
        self.epsilon = max(self.epsilon * self.epsilon_decay, self.epsilon_min)

env = BallEnvironment(max_steps=1000)

for episode in range(MAX_EPISODES):
    obs = env.reset()
    while True:
        env.render()
        left_state = np.reshape(obs, [1, *env.state_size])
        left_action = env.left_ball.q_agent.act(left_state)
        right_action = env.right_ball.q_agent.act(left_state)  # analogous to the left agent; stripped from the original
        next_obs, rewards, done, _ = env.step(left_action, right_action)
        left_next_state = np.reshape(next_obs, [1, *env.state_size])
        env.left_ball.q_agent.train(left_state, left_action, rewards[0], left_next_state, done)
        obs = next_obs
        if done:
            env.left_ball.q_agent.save_model("left_trained_agent.h5")
            break

env.close()
```
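For context, the elided `train` logic in Q-learning boils down to computing the Bellman target for the chosen action. A minimal NumPy sketch of that target computation (the function name `q_target` is mine, not from the code above):

```python
import numpy as np

GAMMA = 0.95  # discount factor, matching self.gamma in the agent

def q_target(reward, next_q_values, done):
    """Bellman target: r + gamma * max_a' Q(s', a'), or just r on terminal steps."""
    if done:
        return reward
    return reward + GAMMA * np.max(next_q_values)
```

The model is then fitted so its Q-value for the taken action moves toward this target, while the other actions' Q-values are left unchanged.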
The answer is simple. All you need to add after saving the model is

tf.keras.backend.clear_session()

As the Keras documentation puts it: "If you are creating many models in a loop, this global state will consume an increasing amount of memory over time, and you may want to clear it. Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited."
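One way to apply the fix is to wrap the save and the cleanup together; a sketch (the helper name `save_and_reset` is mine, not part of your code):

```python
import tensorflow as tf

def save_and_reset(agent, filename):
    """Save the agent's model, then clear Keras' global state.

    Without clear_session(), graph state accumulates across
    episodes and each episode trains progressively slower.
    """
    agent.model.save(filename)
    tf.keras.backend.clear_session()
```

You would call `save_and_reset(env.left_ball.q_agent, "left_trained_agent.h5")` in place of the bare `save_model` call. One caveat: under graph-mode TF 1.x, clearing the session invalidates the in-memory model, so you would need to reload it from the saved file before the next episode; under TF 2 eager execution the existing model object generally keeps working.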