My code for the classic reinforcement learning CartPole problem throws `ValueError: too many values to unpack (expected 4)` when I use the following code:
```
# Take action and observe next state and reward
next_state, reward, done, info = env.step(action)
```
Kindly assist in resolving this issue.
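I assume you are using Gymnasium.
There, env.step returns five values, not four: the old done flag is split into terminated and truncated, in that order. See https://gymnasium.farama.org/api/env/#gymnasium.Env.step

    next_state, reward, terminated, truncated, info = env.step(action)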
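
If the rest of your loop still relies on a single `done` flag, one common way to adapt it is to combine the two flags yourself. Below is a minimal sketch of that idea, not your actual training code. `StubEnv` and `run_episode` are hypothetical names made up for illustration; the stub only mimics the Gymnasium-style `reset`/`step` return shapes so the snippet runs without the library installed. With the real library you would pass an environment created by `gymnasium.make("CartPole-v1")` instead.

```python
import random


class StubEnv:
    """Hypothetical stand-in that mimics Gymnasium's reset/step return shapes."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.t = 0

    def reset(self, seed=None):
        # Gymnasium-style reset returns (observation, info)
        self.t = 0
        return 0.0, {}

    def step(self, action):
        # Gymnasium-style step returns five values
        self.t += 1
        obs = float(self.t)
        reward = 1.0
        terminated = False                     # this stub never reaches a terminal state
        truncated = self.t >= self.max_steps   # time limit reached
        return obs, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode and return the total reward collected."""
    state, info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(state)
        # Take action and observe next state and reward
        next_state, reward, terminated, truncated, info = env.step(action)
        # The episode ends if either flag is set
        done = terminated or truncated
        total_reward += reward
        state = next_state
    return total_reward


total = run_episode(StubEnv(max_steps=5), policy=lambda s: random.choice([0, 1]))
print(total)
```

Keeping `terminated` and `truncated` separate is still useful if you bootstrap value targets, since a time-limit cutoff is not a true terminal state; `done` here is only used to stop the loop.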