NumPy: efficiently replace consecutive elements of a 2D bool array with their sum along an axis

Question    Votes: 3    Answers: 3

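I have a bool array (bool_arr), and I want to replace each run of consecutive nonzero values down the columns with its count (consecutive_count), which is also the maximum/last number of each consecutive group.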

bool_arr =            consecutive_count = 
[[1 1 1 1 0 1]        [[3 6 1 6 0 1]
 [1 1 0 1 1 0]         [3 6 0 6 5 0]
 [1 1 1 1 1 1]         [3 6 3 6 5 2]
 [0 1 1 1 1 1]         [0 6 3 6 5 2]
 [1 1 1 1 1 0]         [2 6 3 6 5 0]
 [1 1 0 1 1 1]]        [2 6 0 6 5 1]]

I wrote my own function that computes the cumulative sum of consecutive nonzero elements down each column:

consecutive_cumsum = 
[[1 1 1 1 0 1]
 [2 2 0 2 1 0]
 [3 3 1 3 2 1]
 [0 4 2 4 3 2]
 [1 5 3 5 4 0]
 [2 6 0 6 5 1]]
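
The question does not show the function that produces consecutive_cumsum. As a minimal sketch of one possible vectorized way to get it (the name consecutive_cumsum_sketch and the reset trick below are assumptions for illustration, not the asker's code): take the plain cumulative sum and subtract the running total recorded at the most recent zero.

```python
import numpy as np

def consecutive_cumsum_sketch(a, axis=0):
    """Hypothetical helper: cumulative count of consecutive nonzeros along `axis`."""
    ones = (np.asarray(a) != 0).astype(np.int64)
    total = ones.cumsum(axis=axis)
    # At every zero, remember the running total; elsewhere record 0.
    at_zero = np.where(ones == 0, total, 0)
    # total never decreases, so a running maximum carries the most
    # recent reset value forward along the axis.
    reset = np.maximum.accumulate(at_zero, axis=axis)
    return total - reset

bool_arr = np.array([[1, 1, 1, 1, 0, 1],
                     [1, 1, 0, 1, 1, 0],
                     [1, 1, 1, 1, 1, 1],
                     [0, 1, 1, 1, 1, 1],
                     [1, 1, 1, 1, 1, 0],
                     [1, 1, 0, 1, 1, 1]])

print(consecutive_cumsum_sketch(bool_arr, axis=0))
```

For the bool_arr above this should reproduce the consecutive_cumsum array shown in the question.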

I currently use the following to get consecutive_count:

import numpy as np

bool_arr = np.array([[1,1,1,1,0,1],
                     [1,1,0,1,1,0],
                     [1,1,1,1,1,1],
                     [0,1,1,1,1,1],
                     [1,1,1,1,1,0],
                     [1,1,0,1,1,1]])

consecutive_cumsum = np.array([[1,1,1,1,0,1],
                               [2,2,0,2,1,0],
                               [3,3,1,3,2,1],
                               [0,4,2,4,3,2],
                               [1,5,3,5,4,0],
                               [2,6,0,6,5,1]])

consecutive_count = consecutive_cumsum.copy()
for x in range(consecutive_count.shape[1]):
    maximum = 0
    # walk each column bottom-up so the last value of a run is seen first
    for y in range(consecutive_count.shape[0]-1, -1, -1):
        if consecutive_cumsum[y,x] > 0:
            if consecutive_cumsum[y,x] < maximum:
                consecutive_count[y,x] = maximum
            else:
                maximum = consecutive_cumsum[y,x]
        else:
            maximum = 0  # a zero ends the current run

print(consecutive_count)

It works fine, but I am iterating over every element to replace it with the maximum between zeros.

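Is there a way to vectorize this with numpy instead of looping over all elements? As a bonus, I would like to be able to specify the axis (rows vs. columns) along which it operates.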

python numpy cumsum
3 Answers
3
votes

The fairly new prepend and append keywords of np.diff (added in NumPy 1.16.0) make this easy:

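# +1 where a run of ones starts, -1 one row past where it ends
bnd = np.diff(bool_arr, axis=0, prepend=0, append=0)
# transpose so the boundaries come out ordered column by column
x, y = np.where(bnd.T)
# scale each start/end pair by the length of its run
bnd.T[x, y] *= (y[1::2]-y[::2]).repeat(2)
bnd[:-1].cumsum(axis=0)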
# array([[3, 6, 1, 6, 0, 1],
#        [3, 6, 0, 6, 5, 0],
#        [3, 6, 3, 6, 5, 2],
#        [0, 6, 3, 6, 5, 2],
#        [2, 6, 3, 6, 5, 0],
#        [2, 6, 0, 6, 5, 1]])
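
To see why this works, it may help to trace the same steps on a single column. The following is only an illustrative 1D walk-through (the variable names col, pos and lengths are made up for this sketch), using the first column of bool_arr; it needs a NumPy version that supports prepend/append in np.diff.

```python
import numpy as np

col = np.array([1, 1, 1, 0, 1, 1])

# Pad with a zero on both sides and difference:
# +1 marks the start of a run, -1 marks one past its end.
bnd = np.diff(col, prepend=0, append=0)   # [ 1  0  0 -1  1  0 -1]

# Boundaries alternate start, end, start, end, ...
pos = np.flatnonzero(bnd)                 # [0 3 4 6]
lengths = pos[1::2] - pos[::2]            # [3 2]

# Scale each +1/-1 pair by the length of its run.
bnd[pos] *= lengths.repeat(2)             # [ 3  0  0 -3  2  0 -2]

# The cumulative sum holds the run length inside each run and
# drops back to 0 after it; the last padded slot is discarded.
result = bnd[:-1].cumsum()
print(result)                             # [3 3 3 0 2 2]
```

The 2D version does the same thing for all columns at once; transposing before np.where just makes the start/end pairs of one column sit next to each other.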

With a selectable axis:

def count_ones(a, axis=-1):
    a = a.swapaxes(-1, axis)
    bnd = np.diff(a, axis=-1, prepend=0, append=0)
    *idx, last = np.where(bnd)
    bnd[(*idx, last)] *= (last[1::2]-last[::2]).repeat(2)
    return bnd[..., :-1].cumsum(axis=-1).swapaxes(-1, axis)

Update: a version that works with general (not just 0/1) entries:

def sum_stretches(a, axis=-1):
    a = a.swapaxes(-1, axis)
    dtype = np.result_type(a, 'i1')
    bnd = np.diff((a!=0).astype(dtype), axis=-1, prepend=0, append=0)
    *idx, last = np.where(bnd)
    A = np.concatenate([np.zeros((*a.shape[:-1], 1), a.dtype), a.cumsum(axis=-1)], -1)[(*idx, last)]
    bnd[(*idx, last)] *= (A[1::2]-A[::2]).repeat(2)
    return bnd[..., :-1].cumsum(axis=-1).swapaxes(-1, axis)
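
The only new ingredient compared with the 0/1 case is that the scale factor is the sum of each stretch instead of its length, taken from a zero-prefixed cumulative sum. Here is a small 1D sketch of that idea (the sample values and the names prefix and sums are assumptions chosen for illustration):

```python
import numpy as np

a = np.array([2, 3, 0, 7, 0, 0, 1, 1])

# Boundaries of the nonzero stretches, as in the 0/1 case.
mask = (a != 0).astype(np.int64)
bnd = np.diff(mask, prepend=0, append=0)
pos = np.flatnonzero(bnd)                      # [0 2 3 4 6 8]

# prefix[k] is the sum of a[:k], so the sum of a[start:end]
# is prefix[end] - prefix[start].
prefix = np.concatenate([[0], a.cumsum()])
sums = prefix[pos[1::2]] - prefix[pos[::2]]    # [5 7 2]

bnd[pos] *= sums.repeat(2)
result = bnd[:-1].cumsum()
print(result)                                  # [5 5 0 7 0 0 2 2]
```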

1
vote

Using itertools.groupby:

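import itertools

# b is assumed to be a copy of bool_arr from the question; it is overwritten in place
b = bool_arr.copy()
for i in range(b.shape[1]):
    counts = []
    for k, v in itertools.groupby(b[:,i]):
        g = list(v)
        counts.extend([sum(g)] * len(g))
    b[:,i] = counts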

Output:

array([[3, 6, 1, 6, 0, 1],
       [3, 6, 0, 6, 5, 0],
       [3, 6, 3, 6, 5, 2],
       [0, 6, 3, 6, 5, 2],
       [2, 6, 3, 6, 5, 0],
       [2, 6, 0, 6, 5, 1]])
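
The question also asked for a selectable axis. One way the groupby idea could be extended, sketched here under the assumption that np.apply_along_axis is acceptable (the helper names run_sums_1d and run_sums are hypothetical), is to wrap the per-column logic in a 1D function. Grouping with key=bool instead of by value means any nonzero entries count as one stretch. This is still a Python-level loop, so it is not expected to be fast on large arrays.

```python
import itertools
import numpy as np

def run_sums_1d(v):
    """Replace each stretch of nonzeros in a 1D array with the stretch's sum."""
    out = []
    for _, grp in itertools.groupby(v, key=bool):
        g = list(grp)
        out.extend([sum(g)] * len(g))   # zeros form groups whose sum is 0
    return np.array(out)

def run_sums(a, axis=0):
    """Apply run_sums_1d to every 1D slice of `a` taken along `axis`."""
    return np.apply_along_axis(run_sums_1d, axis, a)

bool_arr = np.array([[1, 1, 1, 1, 0, 1],
                     [1, 1, 0, 1, 1, 0],
                     [1, 1, 1, 1, 1, 1],
                     [0, 1, 1, 1, 1, 1],
                     [1, 1, 1, 1, 1, 0],
                     [1, 1, 0, 1, 1, 1]])

print(run_sums(bool_arr, axis=0))   # down the columns, as in the question
print(run_sums(bool_arr, axis=1))   # along the rows
```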

0
votes

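Building on paulpanzer's answer, for the poor souls (like me) whose NumPy is too old for the prepend/append keywords of np.diff (added in 1.16.0):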

def sum_stretches(a, axis=-1):
    a = a.swapaxes(-1, axis)
    padding = [(0, 0)] * (a.ndim - 1) + [(1, 1)]  # pad only the last axis
    padded = np.pad((a!=0), padding, 'constant', constant_values=0).astype('int32')
    bnd = np.diff(padded, axis=-1)
    *idx, last = np.where(bnd)
    A = np.concatenate([np.zeros((*a.shape[:-1], 1), 'int32'), a.cumsum(axis=-1)], -1)[(*idx, last)]
    bnd[(*idx, last)] *= (A[1::2]-A[::2]).repeat(2)
    return bnd[..., :-1].cumsum(axis=-1).swapaxes(-1, axis)
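
The substitution this version relies on can be checked in isolation: padding one zero on each side of the last axis with np.pad and then calling plain np.diff should give the same boundary markers as the prepend=0, append=0 keywords. A small sketch on made-up data (the array a and the name pad_width are just for illustration):

```python
import numpy as np

a = np.array([[1, 1, 0],
              [0, 1, 1]])

# One zero before and after along the last axis, nothing on the other axes.
pad_width = [(0, 0)] * (a.ndim - 1) + [(1, 1)]
padded = np.pad(a, pad_width, mode='constant', constant_values=0)
# padded is [[0 1 1 0 0]
#            [0 0 1 1 0]]

via_pad = np.diff(padded, axis=-1)
print(via_pad)
# [[ 1  0 -1  0]
#  [ 0  1  0 -1]]
```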