My data usually has the form data.shape = (N, a, b, c, ...) (where a, b, c, ... are not known in advance), and coefs.shape = (N,), which I need to broadcast along the first axis.
import numpy as np
N = 10
sub_array_shape = 3, 3
# Random data representing arbitrary data from external source
raw_data = np.stack([np.random.randn(*sub_array_shape) for i in range(N)])
corrections = np.random.randn(N)
Now I need to broadcast the two arrays together to apply the corrections to my data.
I have found two approaches that work:
corrected_data = (raw_data.T + corrections).T
corrected_data = raw_data + corrections.reshape(-1, *(1,) * (raw_data.ndim - 1))
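My main concern is the readability of the code, but I am also interested in whether there is any performance difference between the approaches.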
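Before comparing readability or speed, it is worth confirming that both approaches give the same result. The following is a minimal sketch of that check, not a benchmark; the seeded generator and the names via_transpose, via_reshape and same_result are introduced here only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
raw_data = rng.standard_normal((10, 3, 3))
corrections = rng.standard_normal(10)

# Approach 1: transpose so the N axis comes last, add, then transpose back.
via_transpose = (raw_data.T + corrections).T

# Approach 2: reshape corrections to (N, 1, 1, ...) so it lines up with axis 0.
via_reshape = raw_data + corrections.reshape(-1, *(1,) * (raw_data.ndim - 1))

# Both approaches should produce the same array.
same_result = np.allclose(via_transpose, via_reshape)
```

Any performance difference can be measured on your own data with the standard timeit module; no timings are claimed here.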
So there are 2 arrays:
In [41]: x = np.ones((3,2,4,5),int)
In [42]: y = np.arange(3)
The broadcasting adjustment syntax that stands out to me is:
In [43]: z=y[:,None,None,None]
In [45]: (x*z).shape
Out[45]: (3, 2, 4, 5)
But that does not adapt to a variable number of dimensions.
reshape is easier to do programmatically, as in your approach (2):
In [46]: z=y.reshape(-1,1,1,1); z.shape
Out[46]: (3, 1, 1, 1)
In [47]: (x*z).shape
Out[47]: (3, 2, 4, 5)
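Since the number of trailing dimensions is not known in advance, the reshape can be wrapped so the singleton axes are computed from the target's ndim. This is only a sketch; append_singleton_axes is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def append_singleton_axes(coefs, target_ndim):
    """Hypothetical helper: reshape (N,) to (N, 1, ..., 1) with target_ndim axes."""
    return coefs.reshape(coefs.shape + (1,) * (target_ndim - coefs.ndim))

y = np.arange(3)
shapes = []
for x_shape in [(3, 2), (3, 2, 4), (3, 2, 4, 5)]:
    x = np.ones(x_shape, int)
    z = append_singleton_axes(y, x.ndim)
    # Record the reshaped coefficient shape and the broadcast result shape.
    shapes.append((z.shape, (x * z).shape))
```

The same call works for any number of trailing dimensions, which the hard-coded y[:,None,None,None] form cannot do.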
I was going to suggest broadcast_to, but that requires those trailing dimensions to be present already:
In [49]: z = np.broadcast_to(y,x.shape); z.shape
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[49], line 1
----> 1 z = np.broadcast_to(y,x.shape); z.shape
File <__array_function__ internals>:200, in broadcast_to(*args, **kwargs)
File ~\miniconda3\lib\site-packages\numpy\lib\stride_tricks.py:413, in broadcast_to(array, shape, subok)
367 @array_function_dispatch(_broadcast_to_dispatcher, module='numpy')
368 def broadcast_to(array, shape, subok=False):
369 """Broadcast an array to a new shape.
370
371 Parameters
(...)
411 [1, 2, 3]])
412 """
--> 413 return _broadcast_to(array, shape, subok=subok, readonly=True)
File ~\miniconda3\lib\site-packages\numpy\lib\stride_tricks.py:349, in _broadcast_to(array, shape, subok, readonly)
346 raise ValueError('all elements of broadcast shape must be non-'
347 'negative')
348 extras = []
--> 349 it = np.nditer(
350 (array,), flags=['multi_index', 'refs_ok', 'zerosize_ok'] + extras,
351 op_flags=['readonly'], itershape=shape, order='C')
352 with it:
353 # never really has writebackifcopy semantics
354 broadcast = it.itviews[0]
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (3,) and requested shape (3,2,4,5)
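If the array already has the necessary trailing dimensions, broadcast_to can perform the second broadcasting step, which it does with 0 strides: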
In [50]: w = np.broadcast_to(z,x.shape); w.shape
Out[50]: (3, 2, 4, 5)
In [51]: w.strides
Out[51]: (4, 0, 0, 0)
expand_dims can also be used, but internally I believe it uses reshape. Its source code may be instructive:
In [52]: np.expand_dims(y,(1,2,3)).shape
Out[52]: (3, 1, 1, 1)
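For a variable number of dimensions, the axis tuple passed to expand_dims can be built from x.ndim rather than typed out. A small sketch, assuming a NumPy version in which expand_dims accepts a tuple of axes (1.18 or later); trailing_axes is a name introduced here.

```python
import numpy as np

x = np.ones((3, 2, 4, 5), int)
y = np.arange(3)

# Every axis of x after the first gets a matching singleton axis in y.
trailing_axes = tuple(range(1, x.ndim))  # (1, 2, 3) for this x
z = np.expand_dims(y, trailing_axes)

result_shape = (x * z).shape
```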
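broadcast_to can be used with the "reversed" shape to add leading dimensions, and the result can then be transposed back. But this is just your first idea: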
In [54]: np.broadcast_to(y, x.T.shape).shape
Out[54]: (5, 4, 2, 3)
In [55]: np.broadcast_to(y, x.T.shape).T.shape
Out[55]: (3, 2, 4, 5)
In [56]: np.broadcast_to(y, x.T.shape).T.strides
Out[56]: (4, 0, 0, 0)
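To see that the two routes end up in the same place, the reshape-then-broadcast result can be compared with the transpose route. This is an illustrative check only; w_reshape, w_transpose, same_values and same_strides are names introduced here.

```python
import numpy as np

x = np.ones((3, 2, 4, 5), int)
y = np.arange(3)

# Route A: add trailing singleton axes, then broadcast to x's shape.
w_reshape = np.broadcast_to(y.reshape(-1, *(1,) * (x.ndim - 1)), x.shape)

# Route B: broadcast against the reversed shape, then transpose back.
w_transpose = np.broadcast_to(y, x.T.shape).T

same_values = np.array_equal(w_reshape, w_transpose)
same_strides = w_reshape.strides == w_transpose.strides
```

Both are zero-stride views of y, so neither copies the data; the choice between them comes down to readability.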