TRPO-RL: I need to get an 8-DOF robotic arm to move to a specified point. Do I need to implement the TRPO RL code using OpenAI Gym in a Gazebo environment? - reinforcement-learning