Python/NumPy array indexing with None

Question

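I'm currently learning Python/NumPy, but I don't understand at all what None means when it is used as an array index. I want to compute the Euclidean distances between two 2-D datasets x and y. The output should be a 2-D array containing the Euclidean distances between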

x[0] and y[0], x[0] and y[1], x[0] and y[2],..., 
x[1] and y[0], x[1] and y[1], x[1] and y[2], and so on.

The solution is:

np.sqrt(np.sum((x[:, None] - y[None])**2, -1))

But I don't understand the difference between

x[:, None]

and

y[None]

How do they reshape the arrays? I just can't picture it in my head. Can anyone help?

Answer 1

Let's use a simple example:

x = np.array([10, 20, 30])
y = np.array([1, 2])

x.shape
# (3,)

y.shape
# (2,)

Using

x[:, None]

appends an extra dimension of length 1 after the first axis:

x2 = x[:, None]

x2.shape
# (3, 1)

x2
# array([[10],
#        [20],
#        [30]])

Using

y[None]

prepends an extra dimension of length 1:

y2 = y[None]

y2.shape
# (1, 2)

y2
# array([[1, 2]])

Now when you perform an operation on the two, both arrays are broadcast to a common shape, here (3, 2):

x[:, None] - y[None]

# array([[ 9,  8],
#        [19, 18],
#        [29, 28]])

This is equivalent to:

array([[10, 10],         array([[1, 2],
       [20, 20],    -           [1, 2],
       [30, 30]])               [1, 2]])
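
To connect this back to the 2-D datasets in the question, here is a minimal sketch of the same idea with one more axis. The sample points in x and y are made up for illustration, and the intermediate names a, b, diff and dist are not from the original post.

```python
import numpy as np

# Hypothetical sample data: 3 points and 2 points, each with 2 coordinates.
x = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])  # shape (3, 2)
y = np.array([[0.0, 0.0], [3.0, 0.0]])              # shape (2, 2)

a = x[:, None]  # new axis in the middle: shape (3, 1, 2)
b = y[None]     # new axis in front:      shape (1, 2, 2)

# Broadcasting stretches the length-1 axes, so diff[i, j] == x[i] - y[j].
diff = a - b    # shape (3, 2, 2)

# Summing the squares over the last axis collapses the coordinates.
dist = np.sqrt(np.sum(diff**2, -1))  # shape (3, 2)

print(a.shape, b.shape, diff.shape, dist.shape)
print(dist[1, 0])  # distance between x[1] = (3, 4) and y[0] = (0, 0): 5.0
```

So dist[i, j] holds the distance between x[i] and y[j], which is the layout the question asks for.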

Answer 2

Maybe you should try

.shape

and print it like this:

import numpy as np
x = np.random.random(size=10)

print(x.shape)
print(x[None].shape)
print(x[:,None].shape)
print(x[:,None,None].shape)

The output should be:

>>(10,)
>>(1, 10)
>>(10, 1)
>>(10, 1, 1)

So in general,

None

adds a new dimension of length 1 at the position where it appears in the index. Also,

print(x[None].shape)

gives the same result as

print(x[None,:].shape)
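
As a small supplement, the sketch below compares None with a few other spellings that, as far as I know, produce the same shapes: np.newaxis (which is just an alias for None), np.expand_dims and reshape. The variable names row, col and so on are only for illustration.

```python
import numpy as np

x = np.arange(10)  # shape (10,)

# np.newaxis is literally the same object as None.
print(np.newaxis is None)  # True

# Adding a leading axis: three equivalent spellings.
row = x[None]
row_newaxis = x[np.newaxis]
row_expand = np.expand_dims(x, axis=0)
print(row.shape, row_newaxis.shape, row_expand.shape)  # (1, 10) three times

# Adding a trailing axis: three equivalent spellings.
col = x[:, None]
col_expand = np.expand_dims(x, axis=1)
col_reshape = x.reshape(-1, 1)
print(col.shape, col_expand.shape, col_reshape.shape)  # (10, 1) three times
```

Which spelling to use is mostly a matter of taste. np.newaxis is a bit more self-explanatory to readers who haven't seen None used as an index.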