NumPy dynamic slicing per row

Question  Votes: 2  Answers: 1

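How can I dynamically slice each row of an array, given per-row start and end indices, without using a for loop? I can do it with the loop shown below, but it is far too slow when x.shape[0] is greater than 1 million.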

import numpy as np

x = np.arange(0, 100)
x = x.reshape(20, 5)
# Per-row start (inclusive) and end (exclusive) column indices
s_idx = np.random.randint(0, 3, x.shape[0])
e_idx = np.random.randint(3, 6, x.shape[0])

print(s_idx)
>>> array([2, 1, 2, ..., 1, 0, 2])  

print(e_idx)
>>> array([3, 4, 5, ..., 3, 3, 3]) 

print(x)
>>> array([[ 0,  1,  2,  3,  4],
[ 5,  6,  7,  8,  9], 
[10, 11, 12, 13, 14], 
...,   
[85, 86, 87, 88, 89], 
[90, 91, 92, 93, 94], 
[95, 96, 97, 98, 99]])   

x_indexed = []
for row, start, end in zip(x, s_idx, e_idx):
    x_indexed.append(row[start:end])  # slice each row with its own bounds

print(x_indexed)
>>> [array([2]),
     array([6, 7, 8]),
     array([12, 13, 14]),
     array([15, 16, 17]),
     array([20, 21, 22, 23]),
     array([26, 27, 28, 29]),
     array([30, 31, 32, 33]),
     array([35, 36, 37, 38, 39]),
     array([40, 41, 42]),
     array([46, 47, 48]),
     array([52, 53, 54]),
     array([56, 57]),
     array([62, 63, 64]),
     array([67]),
     array([70, 71, 72, 73]),
     array([77]),
     array([80, 81, 82, 83, 84]),
     array([86, 87]),
     array([90, 91, 92]),
     array([97])]
python numpy dynamic slice
1 Answer
Votes: 1

You can use masked arrays:

import numpy as np

np.random.seed(100)

x = np.arange(0, 100)
x = x.reshape(20, 5)
s_idx = np.random.randint(0, 3, x.shape[0])
e_idx = np.random.randint(3, 6, x.shape[0])

# This is optional, reduce x to the minimum possible block
first_col, last_col = s_idx.min(), e_idx.max()
x = x[:, first_col:last_col]
s_idx -= first_col
e_idx -= first_col

col_idx = np.arange(x.shape[1])
# Mask elements out of range
mask = (col_idx < s_idx[:, np.newaxis]) | (col_idx >= e_idx[:, np.newaxis])
x_masked = np.ma.array(x, mask=mask)
print(x_masked)

Output:

[[0 1 2 3 --]
 [5 6 7 8 9]
 [10 11 12 13 14]
 [-- -- 17 -- --]
 [-- -- 22 -- --]
 [25 26 27 28 --]
 [-- -- 32 33 --]
 [-- 36 37 38 --]
 [-- -- 42 -- --]
 [-- -- 47 -- --]
 [-- -- 52 53 --]
 [-- -- 57 58 --]
 [-- 61 62 63 --]
 [65 66 67 68 69]
 [70 71 72 -- --]
 [75 76 77 78 79]
 [80 81 82 83 --]
 [-- -- 87 88 --]
 [90 91 92 93 94]
 [-- 96 97 98 99]]
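
As a rough sketch of why the masked array is useful on its own, the example below runs a few row-wise reductions that only see the unmasked (in-slice) elements. The small 4x5 array and the fixed `s_idx` / `e_idx` values are assumptions chosen so the results can be checked by hand; they are not the random data used above.

```python
import numpy as np

# Small hand-checkable example (assumed values, not the random data above)
x = np.arange(20).reshape(4, 5)
s_idx = np.array([0, 1, 2, 0])  # start column per row (inclusive)
e_idx = np.array([3, 4, 5, 5])  # end column per row (exclusive)

# Broadcast a row of column indices against per-row bounds:
# True marks elements OUTSIDE the slice, which the masked array hides
col_idx = np.arange(x.shape[1])
mask = (col_idx < s_idx[:, np.newaxis]) | (col_idx >= e_idx[:, np.newaxis])
x_masked = np.ma.array(x, mask=mask)

# Reductions skip masked elements, so each one acts on x[i, s_idx[i]:e_idx[i]]
row_sums = x_masked.sum(axis=1)
row_means = x_masked.mean(axis=1)
row_counts = x_masked.count(axis=1)  # number of unmasked elements per row

print(row_sums)
print(row_means)
print(row_counts)
```

If the per-row results are all that is needed, this avoids building a Python list of a million small arrays in the first place.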

You can perform most NumPy operations on masked arrays, but if you still need a list of arrays, you can do the following:

list_arrays = [row[~m] for row, m in zip(x, x_masked.mask)]
print(list_arrays)

Output:

[array([0, 1, 2, 3]),
 array([5, 6, 7, 8, 9]),
 array([10, 11, 12, 13, 14]),
 array([17]),
 array([22]),
 array([25, 26, 27, 28]),
 array([32, 33]),
 array([36, 37, 38]),
 array([42]),
 array([47]),
 array([52, 53]),
 array([57, 58]),
 array([61, 62, 63]),
 array([65, 66, 67, 68, 69]),
 array([70, 71, 72]),
 array([75, 76, 77, 78, 79]),
 array([80, 81, 82, 83]),
 array([87, 88]),
 array([90, 91, 92, 93, 94]),
 array([96, 97, 98, 99])]

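In that case, though, you obviously do not need to build the intermediate masked array; you can simply iterate over the rows of x and mask.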
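
A minimal sketch of building the list without the intermediate masked array, using a plain boolean array directly. The first option iterates over the rows of `x` and the boolean array. The second is a swapped-in alternative that is not part of the answer: it selects all kept elements in one flat boolean-indexing step and cuts the result with `np.split`. The names `keep`, `flat`, `lengths` and `split_arrays`, and the small fixed inputs, are assumptions for illustration.

```python
import numpy as np

# Small hand-checkable example (assumed values)
x = np.arange(20).reshape(4, 5)
s_idx = np.array([0, 1, 2, 0])  # start column per row (inclusive)
e_idx = np.array([3, 4, 5, 5])  # end column per row (exclusive)

# True marks elements INSIDE each row's slice (the inverse of the mask above)
col_idx = np.arange(x.shape[1])
keep = (col_idx >= s_idx[:, np.newaxis]) & (col_idx < e_idx[:, np.newaxis])

# Option 1: iterate over the rows of x and the boolean array
list_arrays = [row[k] for row, k in zip(x, keep)]

# Option 2: boolean indexing on the 2-D array returns the kept elements
# as one 1-D array in row-major order, which is then cut at the
# cumulative slice lengths
flat = x[keep]
lengths = keep.sum(axis=1)
split_arrays = np.split(flat, np.cumsum(lengths)[:-1])

print(list_arrays)
print(split_arrays)
```

Both options still produce one Python object per row, so for very large inputs it may be worth keeping `flat` and `lengths` as the final representation instead of a list.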
