How do LazyFrames from OpenAI's baselines save memory?


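OpenAI's baselines uses the following code to return LazyFrames instead of concatenated numpy arrays in order to save memory. The idea is to exploit the fact that one numpy array can be held in several lists at the same time, because a list stores only references, not the objects themselves. However, the LazyFrames implementation goes on to store the concatenated numpy array in self._out. In that case, once every LazyFrames object has been forced at least once, each one permanently holds a concatenated numpy array, which does not seem to save any memory at all. So what is the point of LazyFrames? Or am I misunderstanding something?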

from collections import deque

import gym
import numpy as np
from gym import spaces

class FrameStack(gym.Wrapper):
    def __init__(self, env, k):
        """Stack k last frames.

        Returns lazy array, which is much more memory efficient.

        See Also
        --------
        baselines.common.atari_wrappers.LazyFrames
        """
        gym.Wrapper.__init__(self, env)
        self.k = k
        self.frames = deque([], maxlen=k)
        shp = env.observation_space.shape
        self.observation_space = spaces.Box(low=0, high=255, shape=(shp[:-1] + (shp[-1] * k,)), dtype=env.observation_space.dtype)

    def reset(self):
        ob = self.env.reset()
        for _ in range(self.k):
            self.frames.append(ob)
        return self._get_ob()

    def step(self, action):
        ob, reward, done, info = self.env.step(action)
        self.frames.append(ob)
        return self._get_ob(), reward, done, info

    def _get_ob(self):
        assert len(self.frames) == self.k
        return LazyFrames(list(self.frames))

class LazyFrames(object):
    def __init__(self, frames):
        """This object ensures that common frames between the observations are only stored once.
        It exists purely to optimize memory usage which can be huge for DQN's 1M frames replay
        buffers.

        This object should only be converted to numpy array before being passed to the model.

        You'd not believe how complex the previous solution was."""
        self._frames = frames
        self._out = None

    def _force(self):
        if self._out is None:
            self._out = np.concatenate(self._frames, axis=-1)
            self._frames = None
        return self._out

    def __array__(self, dtype=None):
        out = self._force()
        if dtype is not None:
            out = out.astype(dtype)
        return out

    def __len__(self):
        return len(self._force())

    def __getitem__(self, i):
        return self._force()[i]

    def count(self):
        frames = self._force()
        return frames.shape[frames.ndim - 1]

    def frame(self, i):
        return self._force()[..., i]
1 Answer

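I actually came here to find out how this saves any memory at all! But you mention that lists store references to the underlying data, whereas a concatenated numpy array holds its own copy of that data, and I think you are right about that.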

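To answer your question: you are right! When _force is called, it fills self._out with a numpy array, which increases memory use. But until _force is called (and every LazyFrames API function calls it), self._out is None. So the idea is not to call _force (and therefore not to call any LazyFrames method) until you actually need the underlying data. Hence the warning in its docstring: "This object should only be converted to numpy array before being passed to the model."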

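Note that when self._out is filled with the array, self._frames is cleared as well, so the object does not store duplicate information (which would defeat the whole purpose of storing only as much as is needed).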
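The behaviour described above can be checked with a small sketch. It uses a trimmed-down copy of the LazyFrames class from the question, with no gym dependency. The 84x84x1 uint8 frame shape, the plain list standing in for a replay buffer, and the extra copy=None parameter on __array__ (added only for compatibility with newer NumPy) are assumptions made for illustration; they are not taken from baselines.

```python
from collections import deque

import numpy as np


class LazyFrames:
    """Trimmed-down copy of the class from the question (illustration only)."""

    def __init__(self, frames):
        self._frames = frames
        self._out = None

    def _force(self):
        if self._out is None:
            self._out = np.concatenate(self._frames, axis=-1)
            self._frames = None  # drop the references once the copy exists
        return self._out

    def __array__(self, dtype=None, copy=None):  # copy=None: assumption, see lead-in
        out = self._force()
        if dtype is not None:
            out = out.astype(dtype)
        return out


k = 4
frames = deque(maxlen=k)
buffer = []  # hypothetical stand-in for a replay buffer

for t in range(k + 2):
    frames.append(np.full((84, 84, 1), t, dtype=np.uint8))
    if len(frames) == k:
        buffer.append(LazyFrames(list(frames)))

# Consecutive observations reference the very same frame objects.
first, second = buffer[0], buffer[1]
shared = sum(a is b for a, b in zip(first._frames[1:], second._frames[:-1]))

# Bytes held while nothing has been forced: each distinct frame counts once.
unique = {id(f): f for obs in buffer for f in obs._frames}
lazy_bytes = sum(f.nbytes for f in unique.values())

# Bytes an eager buffer would hold: one concatenated copy per observation.
eager_bytes = sum(np.concatenate(obs._frames, axis=-1).nbytes for obs in buffer)

# Converting one sampled observation forces only that entry.
batch = np.asarray(buffer[0])

print(shared, lazy_bytes, eager_bytes, batch.shape)
```

With only three stored observations the lazy buffer already holds half the bytes of the eager one, and the gap approaches a factor of k as the buffer grows. Forcing an entry replaces its frame references with a private copy, so the saving applies only to entries that have not been converted yet.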

Also, in the same file you will find ScaledFloatFrame, which carries this docstring:

    Scales the observations by 255 after converting to float.
    This will undo the memory optimization of LazyFrames,
    so don't use it with huge replay buffers.
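As a rough illustration of why scaling undoes the optimization, the sketch below assumes the wrapper does something along the lines of converting the observation to float32 and dividing by 255; this is an assumption about its behaviour, not a quote of its code. The 84x84x4 shape is likewise only an example.

```python
import numpy as np

# One stacked observation as stored by the frame-stacking wrapper (uint8).
stacked = np.zeros((84, 84, 4), dtype=np.uint8)

# Assumed scaling step: convert to float32, then divide by 255.
scaled = stacked.astype(np.float32) / 255.0

# The result is a new array that shares nothing with the original frames
# and takes four bytes per pixel instead of one.
ratio = scaled.nbytes // stacked.nbytes
independent = not np.shares_memory(scaled, stacked)

print(stacked.nbytes, scaled.nbytes, ratio, independent)
```

Every scaled observation is therefore a private, four times larger copy, so a replay buffer filled with them gains nothing from the frame sharing.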