使用Python获得线性方程组的解，解中包含尽可能多的零

Question

我正在尝试用一个非常稀疏、相对较大的矩阵（大小约为

10 x 10000

）来求解 Python 中的线性方程组。使用 Numpy，可以使用最小二乘法轻松获得数值解。然而，它几乎没有零，这意味着结果是数千个向量的线性组合。

我也尝试过使用SciPy，这样我就可以尝试修改参数以最大化零的数量，但我根本不了解它，并且不确定是否有办法用这么多变量来做到这一点立刻。

基本上，我的问题是：有没有办法（1）向 Numpy 指定我希望解决方案中有尽可能多的零（而不是通常的范数最小化）；或者 (2) 直接在 SciPy 中解决大型系统？

预先感谢您的帮助！

Answer 1

这可以被认为是一个分配问题，其中解矩阵的每个元素要么被分配（非零），要么未被分配（零）。在这种规模下，密度为 1%、维度为 m=10、n=10_000 的稀疏矩阵可以快速收敛到具有 p*n 非零值的解。如果它更加稀疏，优化器可以对此进行改进。

import numpy as np
import scipy.sparse as sp
from scipy.optimize import milp, Bounds, LinearConstraint


def fake_data(
    rand: np.random.Generator,
    m: int, n: int, p: int,
    density: float = 0.1,
) -> tuple[sp.csc_array, np.ndarray]:
    A = sp.lil_array((m, n))
    n_A = int(density * m * n)
    A[
        rand.integers(low=0, high=m, size=n_A),
        rand.integers(low=0, high=n, size=n_A),
    ] = rand.uniform(low=-1e-3, high=1e-3, size=n_A)

    x_ideal = rand.uniform(low=-1e-3, high=1e-3, size=(n, p))

    b = A @ x_ideal
    return A.tocsc(), b


def solve(
    A: sp.csc_array,
    b: np.ndarray,
    density_objective: bool = False,
) -> np.ndarray:
    m, n = A.shape
    m, p = b.shape

    # Variables: x (p*n), continuous linear solution; a (p*n), binary assignments
    b_flat = b.T.ravel()
    product_constraint = LinearConstraint(
        A=sp.hstack((
            sp.kron(A=sp.eye_array(p), B=A),
            sp.csc_array((m*p, n*p)),
        ), format='csc'),
        lb=b_flat, ub=b_flat,
    )

    # x <= a, -x <= a  -> -x + a >= 0, x + a >= 0
    assign_constraint = LinearConstraint(
        A=sp.block_array([
            [-sp.eye(p*n), sp.eye(p*n)],
            [sp.eye(p*n), sp.eye(p*n)],
        ]),
        lb=0,
    )

    sparsity_constraint = LinearConstraint(
        A=sp.hstack((
            sp.csc_matrix((1, p*n)),
            np.ones((1, p*n)),
        )),
        ub=m*p,
    )

    result = milp(
        c=np.hstack((np.zeros(n*p), np.ones(n*p)))  # assignments (i.e. number of non-zeros) minimised
        if density_objective
        else np.zeros(2*p*n),  # nothing minimised
        integrality=np.hstack((
            np.zeros(n*p, dtype=np.uint8),   # x is continuous
            np.ones(n*p, dtype=np.uint8),    # assignments are binary
        )),
        bounds=Bounds(
            lb=np.hstack((
                np.full(p*n, fill_value=-np.inf),  # solution non-bounded
                np.zeros(p*n),                     # assignments binary
            )),
            ub=np.hstack((
                np.full(p*n, fill_value=np.inf),
                np.ones(p*n),
            )),
        ),
        constraints=(product_constraint, assign_constraint, sparsity_constraint),
    )
    if not result.success:
        raise ValueError(result.message)
    return result.x


def main() -> None:
    rand = np.random.default_rng(seed=0)
    m = 10
    n = 10_000
    p = 2
    A, b = fake_data(rand, m=m, n=n, p=p, density=0.01)
    result = solve(A, b)

    x = result[:p*n].reshape((p, n)).T
    assign = result[p*n:].reshape((p, n)).T
    error = A@x - b
    n_nonzero = round(assign.sum(), 6)
    print('Nonzeros:', n_nonzero, '~', np.count_nonzero(np.round(x, 9)))
    print(f'Error: {error.min():.1e} - {error.max():.1e}')


if __name__ == '__main__':
    main()

Nonzeros: 20.0 ~ 20
Error: -2.5e-19 - 9.3e-20

使用Python获得线性方程组的解，解中包含尽可能多的零

问题描述投票：0回答：1

1个回答

最新问题

使用Python获得线性方程组的解，解中包含尽可能多的零

问题描述 投票：0回答：1

1个回答

最新问题

问题描述投票：0回答：1