Getting the norm of a large sparse matrix

Question

I have a very large (500,000 x 500,000) but sparse matrix. I want to find its norm using Python. I tried:

%timeit numpy.linalg.norm(a, axis=1)

where a is the matrix.

I also tried this:

b = numpy.asarray(a, numpy.float32)
numpy.linalg.norm(b, axis=1)

But Google Colab always crashes.

I also tried the following:

import numpy as np
a =  np.random.rand(1000000,100)
print(np.linalg.norm(a, axis =1).shape)

def g(d, out=None):
    bs = 2000
    if out is None:
        r = np.empty(d.shape[0])
    else:
        r = out
    for i in range(0, d.shape[0], bs):
        u = min(i + bs, d.shape[0])
        r[i:u] = np.linalg.norm(d[i:u], axis=1)
    return r


print((g(a) == np.linalg.norm(a, axis =1)).all())
print("blocked")
%timeit -n 10 g(a)
print("normal")
%timeit -n 10 np.linalg.norm(a, axis =1)

This comes from an earlier Stack Overflow question, but the kernel still crashes. Is there any way to do this?

Answer 1

If I make a modest-sized sparse matrix:

In [109]: M = sparse.random(10,100,.2, 'csr'); M
Out[109]: 
<10x100 sparse matrix of type '<class 'numpy.float64'>'
    with 200 stored elements in Compressed Sparse Row format>

I can't use np.linalg.norm on it:

In [110]: np.linalg.norm(M,axis=1)
---------------------------------------------------------------------------
AxisError                                 Traceback (most recent call last)
Cell In[110], line 1
----> 1 np.linalg.norm(M,axis=1)

File <__array_function__ internals>:200, in norm(*args, **kwargs)

File ~\miniconda3\lib\site-packages\numpy\linalg\linalg.py:2542, in norm(x, ord, axis, keepdims)
   2539 elif ord is None or ord == 2:
   2540     # special case for speedup
   2541     s = (x.conj() * x).real
-> 2542     return sqrt(add.reduce(s, axis=axis, keepdims=keepdims))
   2543 # None of the str-type keywords for ord ('fro', 'nuc')
   2544 # are valid for vectors
   2545 elif isinstance(ord, str):

AxisError: axis 1 is out of bounds for array of dimension 0
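
The "dimension 0" in the error message hints at what went wrong. The following is a minimal sketch, assuming SciPy is installed (the name wrapped is only illustrative): NumPy does not convert a SciPy sparse matrix into a numeric array, so np.asarray wraps the whole object in a 0-dimensional object array, which has no axis 1 to reduce over.

```python
import numpy as np
from scipy import sparse

# Small sparse matrix similar to the one above.
M = sparse.random(10, 100, density=0.2, format="csr")

# np.asarray does not densify a sparse matrix; it wraps the object itself
# in a 0-dimensional array of dtype object.
wrapped = np.asarray(M)
print(wrapped.ndim, wrapped.dtype)
```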

But I can do it on its dense equivalent. M.A will produce a memory error when the dimensions of M are large.

In [111]: np.linalg.norm(M.A,axis=1)
Out[111]: 
array([2.08644827, 2.61130439, 2.69501798, 2.33149559, 2.53580021,
       3.35499087, 2.14149997, 1.45864564, 2.56890956, 2.76315379])
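
As a rough check of the memory problem at the scale mentioned in the question, the dense equivalent of a 500,000 x 500,000 matrix can be sized with simple arithmetic. This is only an estimate of the raw element storage, ignoring any temporaries that the norm computation would add.

```python
# Rough size of the dense equivalent of a 500,000 x 500,000 matrix.
n_rows = n_cols = 500_000
n_elements = n_rows * n_cols  # 2.5e11 entries

for dtype_name, bytes_per_item in [("float64", 8), ("float32", 4)]:
    total_bytes = n_elements * bytes_per_item
    print(f"{dtype_name}: {total_bytes / 1e12:.1f} TB")
```

Even as float32 this is about 1 TB, which is consistent with Colab crashing as soon as the matrix is converted to a dense array.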

There is a norm that works with sparse M:

In [112]: sparse.linalg.norm(M,axis=1)
Out[112]: 
array([2.08644827, 2.61130439, 2.69501798, 2.33149559, 2.53580021,
       3.35499087, 2.14149997, 1.45864564, 2.56890956, 2.76315379])

For this L2 norm I can do the calculation directly on the sparse matrix:

In [113]: M.power(2).sum(axis=1)
Out[113]: 
matrix([[ 4.35326638],
        [ 6.81891059],
        [ 7.2631219 ],
        [ 5.43587169],
        [ 6.43028271],
        [11.25596371],
        [ 4.58602212],
        [ 2.1276471 ],
        [ 6.59929632],
        [ 7.63501886]])

This sum produces a (10,1) dense matrix. Using A1 to make a 1d array, I get the same numbers:

In [114]: M.power(2).sum(axis=1).A1**.5
Out[114]: 
array([2.08644827, 2.61130439, 2.69501798, 2.33149559, 2.53580021,
       3.35499087, 2.14149997, 1.45864564, 2.56890956, 2.76315379])

Either way, a temporary array/matrix large enough to hold the squared values is needed. The sum then reduces the dimension, and so on.

If I had made a sparse_array instead (sparse is slowly moving in that direction), I would not need the A1 step.
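
To tie this back to the size in the question, here is a hedged sketch at 500,000 x 500,000. The random stand-in data (about 2.5 million stored values) and the helper name row_norms_csr are assumptions made for illustration, not part of the original question or of SciPy. The helper works on the CSR data array directly, so its temporaries scale with the number of stored values rather than with the full matrix shape.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import norm as sparse_norm


def row_norms_csr(M):
    """Row-wise L2 norms of a CSR matrix, using only its stored values.

    Hypothetical helper for illustration; assumes real-valued data.
    """
    M = sparse.csr_matrix(M)  # cheap when M is already in CSR format
    # indptr[i]:indptr[i+1] delimits the stored values of row i.
    counts = np.diff(M.indptr)
    row_of_value = np.repeat(np.arange(M.shape[0]), counts)
    squares = np.abs(M.data) ** 2
    # Sum the squares per row; rows without stored values get 0.
    sums = np.bincount(row_of_value, weights=squares, minlength=M.shape[0])
    return np.sqrt(sums)


# Stand-in data at the question's scale (assumed, not the asker's matrix).
n = 500_000
nnz = 2_500_000
rng = np.random.default_rng(0)
rows = rng.integers(0, n, size=nnz)
cols = rng.integers(0, n, size=nnz)
vals = rng.random(nnz)
M = sparse.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()

norms = sparse_norm(M, axis=1)  # never builds the dense matrix
print(norms.shape)
print(np.allclose(norms, row_norms_csr(M)))
```

In most cases scipy.sparse.linalg.norm is all that is needed; the manual version is only there to show that nothing in the computation requires a dense n x n array.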
