I have the following use case:
At each step, my agent should perform exactly one of the following three actions:
This is my current code:
import gym
import numpy as np
from gym.spaces import Box, Discrete
...
# Define action space
low = [0.0, 0.0]  # lower bounds for amount and price
high = [float('inf'), float('inf')]  # upper bounds for amount and price
# Limit order: (amount, price); the shape (2,) is inferred from the bounds
self.action_space_limit = Box(low=np.array(low, dtype=np.float32), high=np.array(high, dtype=np.float32), dtype=np.float32)
# Market order: (amount,)
self.action_space_market = Box(low=np.array([0.0], dtype=np.float32), high=np.array([float('inf')], dtype=np.float32), dtype=np.float32)
# No order: a single dummy value
self.action_space_no_order = Discrete(1)
# A Tuple space contains one value for every sub-space
self.action_space = gym.spaces.Tuple(
    [self.action_space_limit, self.action_space_market, self.action_space_no_order])
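But I am worried that the agent will produce an output for all three possible actions at once. I want the agent to choose only one of them. How can I do that?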
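
To make the goal concrete, here is a minimal sketch of one possible encoding. It is an assumption, not something taken from the code above: a `Discrete(3)` component selects the action type, a single shared `Box` carries the parameters, and the environment ignores whatever the chosen type does not need. The names `LIMIT`, `MARKET`, `NO_ORDER`, `Order`, and `decode_action` are hypothetical and only serve the illustration.

```python
from typing import NamedTuple, Optional, Sequence

# Hypothetical action-type codes (not part of the original code).
LIMIT, MARKET, NO_ORDER = 0, 1, 2

# Assumed replacement for the action space in the environment's __init__:
#   self.action_space = gym.spaces.Tuple((
#       gym.spaces.Discrete(3),  # which of the three actions to take
#       gym.spaces.Box(low=0.0, high=np.inf, shape=(2,), dtype=np.float32),  # amount, price
#   ))


class Order(NamedTuple):
    kind: str
    amount: Optional[float] = None
    price: Optional[float] = None


def decode_action(action_type: int, params: Sequence[float]) -> Order:
    """Turn (action_type, [amount, price]) into exactly one order.

    Parameters that the selected action type does not use are ignored.
    """
    amount, price = float(params[0]), float(params[1])
    if action_type == LIMIT:
        return Order("limit", amount, price)
    if action_type == MARKET:
        return Order("market", amount)  # price is ignored for market orders
    if action_type == NO_ORDER:
        return Order("none")  # both parameters are ignored
    raise ValueError(f"unknown action type: {action_type}")
```

With this layout, `step()` would call something like `decode_action(*action)` and act on the single resulting order, so the agent still outputs every component but only one action is ever executed.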