Array operations in Python from a memory perspective

Question

I would like to understand the memory allocation involved in the following operations.

x_batch, images_path, ImageValidStatus = tf_resize_images(path_list, img_type=col_mode, im_size=IMAGE_SIZE)
x_batch = x_batch / 255
x_batch = 1.0 - x_batch
x_batch = x_batch.reshape(x_batch.shape[0], IMAGE_SIZE[0] * IMAGE_SIZE[1] * IMAGE_SIZE[2])

What I am interested in is x_batch. It is a multidimensional numpy array of shape (100x64x64x3), where 100 is the number of images and 64x64x3 is the size of each image.

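What is the maximum number of copies of the images held in memory at any one time? In other words, how exactly do (x_batch/255), (1-x_batch), and x_batch.reshape work from a memory point of view?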

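My main concern is that in some cases I try to process 500K images at once, and if these operations create extra copies of the images, it will be hard to fit everything in memory.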

Answer 1

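I see "tf" in your code, so I am not sure whether you are asking about tensors or arrays. Let's assume you mean arrays. In general, an array is written to memory once and then operated on. For example: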

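import numpy as np
data = np.empty((1000, 30, 30, 5))  # Allocates 1000*30*30*5*dtype_size bytes (plus a small header).
data.reshape((1000, 30, 150))       # Returns a view on the same buffer; nothing is copied. The result is discarded here, so data keeps its shape.
data += 1                           # In place: adds one to every entry without allocating a new array.
data = 1 - data                     # Allocates a new array for 1-data, then rebinds the name; the old buffer is freed once nothing references it.
x = data + 1                        # Allocates another array of the same size and writes the result into it.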
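
If you want to check for yourself which of these operations reuse the buffer and which allocate a new one, something like the following should work. It is only a small sketch: the array shape and the variable names are made up for illustration, and np.shares_memory is used to test whether two arrays overlap in memory.

```python
import numpy as np

# Hypothetical small array; np.zeros gives a contiguous buffer.
data = np.zeros((100, 30, 30, 5), dtype=np.float32)

# reshape on a contiguous array returns a view: new array object, same buffer.
view = data.reshape((100, 30, 150))
is_view = np.shares_memory(data, view)

# += writes into the existing buffer; the name still refers to the same object.
before = data
data += 1
same_object = data is before

# 1 - data builds a brand-new array, then the name "data" is rebound to it.
data = 1 - data
is_copy = not np.shares_memory(data, before)

# The out= argument asks the ufunc to write into an existing array instead.
np.subtract(1, before, out=before)

print(is_view, same_object, is_copy)
```

So during "data = 1 - data" there are briefly two full-size arrays alive: the old one and the result.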

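As long as you do not change the size of the array (which forces a reallocation), numpy operates on the data very quickly and efficiently. It may not be as good as tensorflow, but it is very, very fast. Operations written in place, such as += or a function called with an out= argument, can be done without using more memory, whereas an expression like 1-data builds its result in a new array first. Something like appending to an array can cause problems, because it makes Python rewrite the whole array in memory.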
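
Applied to the three steps in the question, an in-place version might look roughly like this. Treat it as a sketch under assumptions: I do not know what dtype tf_resize_images returns, so the code assumes uint8 pixel data and makes a single float32 copy first; normalize_in_place and the test array raw are names invented for this example.

```python
import numpy as np

def normalize_in_place(batch):
    """Scale to [0, 1], invert, and flatten each image using one float32 buffer."""
    # One copy: convert to float32 (assumes the input is integer data such as uint8).
    x = batch.astype(np.float32)
    # Divide and invert, writing the results back into the same buffer.
    np.divide(x, 255.0, out=x)
    np.subtract(1.0, x, out=x)
    # Flatten each image; for a contiguous array this is a view, not a copy.
    return x.reshape(x.shape[0], -1)

# Hypothetical stand-in for the image batch: 100 images of 64x64x3.
raw = np.full((100, 64, 64, 3), 255, dtype=np.uint8)
raw[0] = 0  # first image all black, the rest all white

flat = normalize_in_place(raw)
print(flat.shape, flat.dtype, flat.nbytes)

# Rough size of one float32 copy of 500K such images, in bytes.
bytes_500k = 500_000 * 64 * 64 * 3 * 4
print(bytes_500k)
```

Written this way, the peak is the original array plus one float32 array. For 500K images of 64x64x3 that float32 array alone is 24,576,000,000 bytes (about 24.6 GB), and it would be twice that in float64, so the dtype matters as much as the number of copies.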
