Efficiently mapping a function with multiple arguments along an axis of an ndarray


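I have two ndarrays: one for values and another for weights (derived from the errors on values). I am interested in getting the average and standard deviation of the values ndarray along axis=1. For clarity, here is a toy structure that does the job.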

import math
import numpy as np

values = [[0.25, 0.34, 0.28, 0.54], [0.23, 0.38, 0.29, 0.55], [0.21, 0.36, 0.31, 0.56]]
errors = [[0.02, 0.01, 0.03, 0.01], [0.01, 0.02, 0.03, 0.01], [0.04, 0.03, 0.01, 0.02]]

def invsqerr(x):
    # Inverse-variance weight: 1 / error**2
    return 1/x**2

invsqerr_arr = np.apply_along_axis(invsqerr, 1, errors)


def wavg_std(y_arr, invsqerr_arr):
    # Weighted mean and weighted standard deviation of one 1-D sequence
    average = np.average(y_arr, weights=invsqerr_arr)
    variance = np.average((y_arr-average)**2, weights=invsqerr_arr)
    return (average, math.sqrt(variance))


# One result per column: gather the k-th entry of every row
for k in range(len(values[0])):
    print(wavg_std([i[k] for i in values], [i[k] for i in invsqerr_arr]))

Output:

(0.23285714285714285, 0.009331389496316869)
(0.34897959183673471, 0.015681120581468193)
(0.30545454545454542, 0.009875254992000192)
(0.54666666666666663, 0.006666666666666672)

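In my case, len(values[0]) (see the for loop) is on the order of several million. For an array that large, a for loop does not seem like the right approach.

Perhaps what I am looking for is an efficient approach, based on np.apply_along_axis, that can handle multiple arguments.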

python performance apply
1 Answer

Here is one way to tackle this with Python 3's multiprocessing module, which spreads the per-column calls across worker processes.

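First, I transpose the ndarrays values and weights (weights here is the inverse-squared-error array, invsqerr_arr in the question), so that each row holds the entries of one original column: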

# weights is built from the question's invsqerr_arr (1/error**2)
values = np.stack(values).transpose()
weights = np.stack(invsqerr_arr).transpose()
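To make the transposition step concrete, here is a small sketch using the toy data from the question. The names cols_v and cols_w are made up for this illustration and are not part of the original answer. It shows that after transposing, iterating with zip yields one (values, weights) pair per original column, which is the shape of input that a per-column function such as wavg_std expects.

```python
import numpy as np

# Toy data copied from the question
values = [[0.25, 0.34, 0.28, 0.54], [0.23, 0.38, 0.29, 0.55], [0.21, 0.36, 0.31, 0.56]]
errors = [[0.02, 0.01, 0.03, 0.01], [0.01, 0.02, 0.03, 0.01], [0.04, 0.03, 0.01, 0.02]]

# Hypothetical names, chosen only for this sketch
cols_v = np.stack(values).transpose()                # shape (3, 4) -> (4, 3)
cols_w = (1.0 / np.stack(errors) ** 2).transpose()   # inverse-squared errors, same shape

print(cols_v.shape)  # (4, 3)

# Each zip item pairs the values and weights of one original column
first_v, first_w = next(zip(cols_v, cols_w))
print(first_v)       # [0.25 0.23 0.21]
```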

Then use starmap, which unpacks each (values, weights) pair into the two arguments of wavg_std:

import multiprocessing

if __name__ == '__main__':
    # wavg_std must be defined at module level so workers can import it
    with multiprocessing.Pool() as pool:
        results = pool.starmap(wavg_std, zip(values, weights))
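As a possible alternative that avoids both the Python-level loop and the cost of starting worker processes, here is a vectorized sketch. It relies on np.average accepting an axis argument together with a weights array of the same shape as the data. In NumPy terms, the per-column result printed in the question is a reduction over axis=0 of the original (untransposed) arrays. The names weights, average, variance and std are chosen here for illustration, and whether this is faster for several million columns is an assumption you should verify on your own data.

```python
import numpy as np

# Toy data copied from the question (untransposed)
values = np.array([[0.25, 0.34, 0.28, 0.54],
                   [0.23, 0.38, 0.29, 0.55],
                   [0.21, 0.36, 0.31, 0.56]])
errors = np.array([[0.02, 0.01, 0.03, 0.01],
                   [0.01, 0.02, 0.03, 0.01],
                   [0.04, 0.03, 0.01, 0.02]])

# Inverse-squared-error weights, computed element-wise in one step
weights = 1.0 / errors ** 2

# Weighted mean of every column at once (reduce over the rows, axis=0)
average = np.average(values, axis=0, weights=weights)

# Weighted variance around each column's own mean; `average` broadcasts over rows
variance = np.average((values - average) ** 2, axis=0, weights=weights)
std = np.sqrt(variance)

for avg_k, std_k in zip(average, std):
    print((avg_k, std_k))
```

On the toy data this should reproduce the four (average, standard deviation) pairs listed in the question, up to floating-point rounding.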