randPoints 是 10,000 组 (x,y,z) 坐标的列表 xx, yy, zz 是 meshgrid 的乘积,形状为 (201,201,201)
for j in range(len(randPoints)):
#prepare coordinate x,y,z
xc_small = randPoints[j][0]
yc_small = randPoints[j][1]
zc_small = randPoints[j][2]
#just unit conversion
xcm = xc_small*1e-6
ycm = yc_small*1e-6
zcm = zc_small*1e-6
#calculate f to make sphere of radius 0.5e-6
f = ((xx-xcm)**2+(yy-ycm)**2+(zz-zcm)**2 <= (0.5e-6)**2)
#add 1 sphere into cube n
n = n+f
此循环将继续向 n 添加球体 10,000 次。 对于 10,000 个球体,此 for 循环大约需要 10,000 秒。我想加快计算速度。 n 将是 201201201 体素表示的立方体,其中填充了 10,000 个体素表示的球体。
有关变量的更多详细信息:
x = np.arange(-50, 50, 0.5)
y = np.arange(-50, 50, 0.5)
z = np.arange(0, 2*50, 0.5)
xx, yy, zz = np.meshgrid(x, y, z)
n = np.ones((int(201),int(201),int(201)))
我觉得f的计算效率很低
我尝试了 1.vectorization 一次计算所有 for-loop。但问题是会话崩溃占用太多内存,或者我的矢量化代码不正确。 代码是:
#prepare all (x,y,z) into a vector for one-time computation
xc_small = np.array(randPoints_small)[:, 0]
yc_small = np.array(randPoints_small)[:, 1]
zc_small = np.array(randPoints_small)[:, 2]
#just unit conversion
xcm = xc_small*1e-6
ycm = yc_small*1e-6
zcm = zc_small*1e-6
#one-time computaion of f
f = ((xx[:, :, :, np.newaxis]-xcm)**2 + (yy[:, :, :, np.newaxis]-ycm)**2 + (zz[:, :, :, np.newaxis]-zcm)**2 <= (0.5e-6)**2)
n = np.sum(f, axis=-1)
我还尝试了 2.for 循环计算的并行化,但它并没有变得更快。