How to speed up the processing of nested for loops over a giant 3D numpy array?

Question

I created a very large 3D numpy array named tr_mat. Its shape is:

tr_mat.shape
(1024, 536, 21073)

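Information about the 3D numpy array: Before getting to the actual code, I want to clarify what I am trying to do. As tr_mat.shape shows, the 3D numpy array contains numeric values in 1024 rows and 536 columns. That is, each of the 21073 matrices holds 536 * 1024 = 548864 values.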

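Conceptual background of my task: Each of the 21073 2D numpy arrays within the 3D numpy array contains the grayscale pixel values of one image. The 3D numpy array tr_mat is already transposed, because I want to build a time series from the same pixel position across all 21073 matrices. Finally, I want to save each of the resulting 548864 time series individually in a .1D text file. (So I will end up saving 548864 .1D text files.)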
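To make the layout concrete, here is a minimal sketch on a scaled-down array. The small shape and the names frame_mat_small and tr_mat_small are assumptions for illustration only, not the real data. It shows that after transposing, indexing the first two axes yields the time series of one pixel position.

```python
import numpy as np

# Hypothetical scaled-down stand-in: 5 frames of 3 x 4 pixels.
n_frames, n_a, n_b = 5, 3, 4
frame_mat_small = np.arange(n_frames * n_a * n_b, dtype=float).reshape(n_frames, n_a, n_b)

# transpose() without arguments reverses the axes: (5, 3, 4) -> (4, 3, 5).
tr_mat_small = frame_mat_small.transpose()

# One pixel position across all frames is now a 1D array of length n_frames.
ts_pixel = tr_mat_small[2, 1]

# Number of time series (and thus output files) = product of the first two axes.
n_series = tr_mat_small.shape[0] * tr_mat_small.shape[1]
```

With the real shape (1024, 536, 21073), the same product gives 1024 * 536 = 548864 time series of 21073 values each.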

Relevant part of the code:

tr_mat = frame_mat.transpose()  # the transposed 3D numpy array
# Save
rangeh = range(0, 1024)
for row, row_n_l in zip(tr_mat, rangeh):  # row = pixel row of the 2D image
    for ts_pixel, row_n in zip(row, rangeh):  # ts_pixel = the pixel time series across the 3D array (across the single 2D arrays)
        # Save
        with open(f"/volumes/.../TS_Row{row_n_l}_Pixel{row_n}.1D", "w") as file:
            for i in ts_pixel:
                file.write(f"{i}\n")  # write each time-series value on its own line

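Question: Could you give me some tips on how to modify my code to speed it up? I wrapped tqdm around the first for loop to check how fast the nested loop runs, and it took about 20 minutes to reach roughly 120 of the 536 rows. It also seems to me that the loop gets slower and slower as the number of iterations grows.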

Reproducible code with randomly generated values is below. Please only change the output directory:

import numpy as np
tr_mat = np.random.random((1024, 536, 21073))

rangeh = range(0, 1024)
for row, row_n_l in zip(tr_mat, rangeh):
    for ts_pixel, row_n in zip(row, rangeh):
        # Save
        with open(f"/volumes/../TS_Row{row_n_l}_Pixel{row_n}.1D", "w") as file:  # Please adjust the output directory
            for i in ts_pixel: file.write(f"{i}\n")

Answer 1

I agree with stelioslogothetis's comment: reconsider what you are doing. Save in a different format with smaller files, perhaps with some compression as well.
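One possible reading of "a different format with some compression" is sketched below: write one compressed .npz archive per first-axis index instead of one text file per pixel. This is only an assumption about what such a layout could look like; the scaled-down array, the temporary output directory, the file naming and the key name "ts" are all hypothetical. How much compression actually helps depends on the data.

```python
import os
import tempfile
import numpy as np

# Hypothetical scaled-down array; the real tr_mat is (1024, 536, 21073).
rng = np.random.default_rng(0)
tr_mat_small = rng.random((4, 3, 50))

out_dir = tempfile.mkdtemp()  # placeholder for the real output directory

# One compressed archive per first-axis index instead of one text file per pixel.
for row_n_l, row in enumerate(tr_mat_small):
    path = os.path.join(out_dir, f"TS_Row{row_n_l}.npz")
    np.savez_compressed(path, ts=row)  # "ts" is an arbitrary key name

# Reading a single pixel time series back later.
with np.load(os.path.join(out_dir, "TS_Row2.npz")) as data:
    ts_pixel = data["ts"][1]
```

This cuts the number of files from one per pixel to one per first-axis index, at the cost of no longer having plain-text .1D files.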

If you insist on your plan, this code is noticeably faster (2-3 times):

save_path = f"/volumes/../TS_Row{row_n_l}_Pixel{row_n}.1D"
with open(save_path, "w") as file:
    big_string = '\n'.join([str(item) for item in ts_pixel])
    # you may want to add newline char "\n" at the end, the only difference
    file.write(big_string)
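For context, the snippet above can be placed into the full nested loop roughly as follows. This is a sketch on a scaled-down random array with a temporary output directory, both of which are assumptions for illustration. It builds each file's content as one string and writes it in a single call, and it uses enumerate in place of zip with a range.

```python
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)
tr_mat_small = rng.random((2, 3, 10))  # hypothetical scaled-down stand-in
out_dir = tempfile.mkdtemp()           # placeholder for the real output directory

for row_n_l, row in enumerate(tr_mat_small):
    for row_n, ts_pixel in enumerate(row):
        save_path = os.path.join(out_dir, f"TS_Row{row_n_l}_Pixel{row_n}.1D")
        with open(save_path, "w") as file:
            # tolist() converts the values to Python floats in one step,
            # then the whole series is joined and written with a single call.
            file.write("\n".join(map(str, ts_pixel.tolist())))
            file.write("\n")  # trailing newline, as in the per-value version

# Read one file back to check the contents.
loaded = np.loadtxt(os.path.join(out_dir, "TS_Row1_Pixel2.1D"))
```

The output stays a plain-text file with one value per line, so anything that reads the original .1D files should still work.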

If you want to save a lot of time, just use np.save, which is about 100 times faster than your code:

np.save(save_path, ts_pixel)
# possibly save some space by using float32 instead of float64 (double):
np.save(save_path, ts_pixel.astype(np.float32))
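Two details are worth knowing when trying this out: np.save writes a binary .npy file rather than text, and it appends ".npy" to the file name if the name does not already end with it. The sketch below illustrates both on a single hypothetical time series written to a temporary directory (the data and the path are assumptions).

```python
import os
import tempfile
import numpy as np

ts_pixel = np.random.default_rng(0).random(1000)  # hypothetical single time series
out_dir = tempfile.mkdtemp()                      # placeholder output directory
save_path = os.path.join(out_dir, "TS_Row0_Pixel0.1D")

# float32 halves the size on disk compared with float64, at reduced precision.
np.save(save_path, ts_pixel.astype(np.float32))

# np.save appended ".npy" because the name did not end with it.
written = save_path + ".npy"
restored = np.load(written)
```

Tools that expect a plain-text .1D file will not be able to read this output directly, so this option only fits if the downstream steps can load .npy files.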
