对 2D 数组的重叠矩形切片求和

问题描述 投票:0回答:1

我需要计算多个总和,每个总和位于二维数组的轴定向矩形切片上。许多切片会重叠,因此总和中会有许多共享项。为了减少所需的计算,如何自动分割总和,以便单独计算共享项的总和,然后将其包含在它们出现的总和中?

对于我的应用程序,算法的速度并不是很关键,因为确定分割将在 GPU 上计算之前在编译时完成(使用 JAX 框架)。二维数组中的值在编译时未知。

我已经解决了问题并打算立即回复。

algorithm sum slice rectangles compile-time
1个回答
0
投票

这是一个 Python 函数,它将给定矩形覆盖的区域分割为子矩形,每个子矩形与多个输入矩形完全重叠。该算法首先列出矩形角坐标中出现的所有 x 和 y 坐标,并对每个列表进行排序。然后它可以更紧凑地使用这些列表的索引来对每个 x 和 y 进行编码。该方法的灵感来自于“这个答案”。在由 x 和 y 编码索引的 2D 数组中,该算法使用 64 位二进制表示形式列出哪些区域被哪些输入矩形覆盖。 (使用累积和有效地填充 2D 数组。)如果输入矩形超过 64 个,则使用多个此类数组。然后扫描数组以查找子矩形的左上角。每个子矩形都会先水平放大,然后垂直放大,直到它不再只包含相同的值。 该算法不一定提供最小分割数。

包含打印语句以更轻松地遵循算法。它们可以安全地移除。

import numpy as np # Split axis oriented input rectangles and their overlaps to subrectangles that are each either fully covered or not at all covered by an input rectangle # Input should be a list of tuples ((start_y, start_x), (end_y, end_x)) where the values must be of sortable type # All returned subrectangles are covered by at least one input rectangle # Returns a list of subrectangle tuples ((start_y, start_x), (end_y, end_x), indexes)) where indexes is a tuple of indexes to the input list indicating the input rectangles that cover the subrectangle def get_subrectangles(rectangles): # Find x and y values that appear in the rectangle coordinates ys = sorted(set([rectangle[0][0] for rectangle in rectangles] + [rectangle[1][0] for rectangle in rectangles])) xs = sorted(set([rectangle[0][1] for rectangle in rectangles] + [rectangle[1][1] for rectangle in rectangles])) y_dict = {} x_dict = {} for index, x in enumerate(xs): x_dict[x] = index for index, y in enumerate(ys): y_dict[y] = index print("\nys:") print(ys) print("xs:") print(xs) print("y_dict:") print(y_dict) print("x_dict:") print(x_dict) # Use binary representation to mark areas covered by each rectangle membership = np.zeros([len(ys), len(xs), ((len(rectangles) - 1) >> 6) + 1], dtype=np.int64) for index, rectangle in enumerate(rectangles): signature_int64 = 1 << np.int64(index & 0b111111) signature_table = index >> 6 with np.errstate(over='ignore'): print(f"membership[{y_dict[rectangle[0][0]]}, {x_dict[rectangle[0][1]]}, {signature_table}] += {signature_int64}") print(f"membership[{y_dict[rectangle[1][0]]}, {x_dict[rectangle[0][1]]}, {signature_table}] -= {signature_int64}") print(f"membership[{y_dict[rectangle[1][0]]}, {x_dict[rectangle[1][1]]}, {signature_table}] += {signature_int64}") print(f"membership[{y_dict[rectangle[0][0]]}, {x_dict[rectangle[1][1]]}, {signature_table}] -= {signature_int64}") membership[y_dict[rectangle[0][0]], x_dict[rectangle[0][1]], signature_table] += signature_int64 membership[y_dict[rectangle[1][0]], x_dict[rectangle[0][1]], signature_table] -= signature_int64 membership[y_dict[rectangle[1][0]], x_dict[rectangle[1][1]], signature_table] += signature_int64 membership[y_dict[rectangle[0][0]], x_dict[rectangle[1][1]], signature_table] -= signature_int64 print("2D cumsum") membership = membership.cumsum(axis=0).cumsum(axis=1) # Find subrectangles print("\nmembership: (table axis moved first for convenience)") print(np.moveaxis(membership, -1, 0)) subrectangles = [] for start_y in range(membership.shape[0] - 1): for start_x in range(membership.shape[1] - 1): # Use top left as the membership signature that the rest of the rectangle must match signature = membership[start_y, start_x, :] if np.any(signature != 0): # First enlarge the rectangle horizontally end_x = start_x + 1 while np.all(membership[start_y, end_x, :] == signature): end_x += 1 # Then enlarge the rectangle vertically end_y = start_y + 1 while np.all(membership[end_y, start_x:end_x, :] == signature): end_y += 1 indexes = [] for index in range(len(rectangles)): signature_int64 = 1 << np.int64(index & 0b111111) signature_table = index >> 6 if signature_int64 & signature[signature_table]: indexes.append(index) if len(indexes): subrectangles.append(((ys[start_y], xs[start_x]), (ys[end_y], xs[end_x]), tuple(indexes))) membership[start_y:end_y, start_x:end_x, :] = 0 print("membership: (table axis moved first for convenience)") print(np.moveaxis(membership, -1, 0)) return subrectangles

例如,两个矩形的输入列表,每个元素都是表示左上角和右下角坐标的元组:

[((1, 0), (6, 9)), ((3, 8), (9, 10))]

产生以下子矩形列表:

[((1, 0), (3, 9), (0,)), ((3, 0), (6, 8), (0,)), ((3, 8), (6, 9), (0, 1)), ((3, 9), (9, 10), (1,)), ((6, 8), (9, 9), (1,))]

每个元组中的第三个元组给出了子矩形完全重叠的原始矩形的索引。

然后可以使用它来计算总和,如下所示。包括结果验证:

# Split multiple sums over slices of data to slices with shared terms # Start (included) and end (excluded) coordinates for rectangles, over which sums should be calculated from random import randint, seed def randStartEnd(N): a = randint(0, N) b = randint(0, N) if a > b: a, b = b, a if a == b: if a == 0: b = 1 elif b == N: a = N-1 else: b = a + 1 return a, b # Configuration M = 2 # number of rectangles N = 10 # width and height seed(46) np.random.seed(47) data = np.random.randint(0, 10, (N, N)) print("\ndata:") print(data) rectangles = [] sums_validation = np.zeros((M,)) sums = np.zeros((M,)) orig_work = 0 work = 0 for i in range(M): start_y, end_y = randStartEnd(N) start_x, end_x = randStartEnd(N) rectangles.append(((start_y, start_x), (end_y, end_x))) sums_validation[i] = np.sum(data[start_y:end_y, start_x:end_x]) orig_work += (end_y - start_y)*(end_x - start_x) print("rectangles:") print(rectangles) subrectangles = get_subrectangles(rectangles) print("\nsubrectangles:") print(subrectangles) print("Contributions of subrectangle sums to rectangle sums:") for subrectangle in subrectangles: start_y = subrectangle[0][0] start_x = subrectangle[0][1] end_y = subrectangle[1][0] end_x = subrectangle[1][1] work += (end_y - start_y)*(end_x - start_x) for index in subrectangle[-1]: print(f"sums[{index}] += np.sum(data[{start_y}:{end_y}, {start_x}:{end_x}])") sums[index] += np.sum(data[start_y:end_y, start_x:end_x]) print(f"Calculated sums over {100*work/orig_work} % of terms of the original sums") print("sums:") print(sums) print("sums_validation:") print(sums_validation) print("diff:") print(sums_validation - sums)

打印输出示例:

data: [[7 6 7 8 8 3 0 7 0 7] [7 1 7 2 2 1 7 4 8 9] [2 9 1 5 0 9 2 0 2 1] [4 3 4 5 9 9 6 6 1 5] [3 9 8 9 4 3 5 6 4 3] [1 4 4 2 2 0 9 0 1 8] [4 0 8 9 6 6 9 5 2 2] [9 2 8 8 5 8 0 4 4 6] [7 4 8 1 7 8 9 8 9 6] [1 7 6 0 1 9 2 2 7 8]] rectangles: [((1, 0), (6, 9)), ((3, 8), (9, 10))] ys: [1, 3, 6, 9] xs: [0, 8, 9, 10] y_dict: {1: 0, 3: 1, 6: 2, 9: 3} x_dict: {0: 0, 8: 1, 9: 2, 10: 3} membership[0, 0, 0] += 1 membership[2, 0, 0] -= 1 membership[2, 2, 0] += 1 membership[0, 2, 0] -= 1 membership[1, 1, 0] += 2 membership[3, 1, 0] -= 2 membership[3, 3, 0] += 2 membership[1, 3, 0] -= 2 2D cumsum membership: (table axis moved first for convenience) [[[1 1 0 0] [1 3 2 0] [0 2 2 0] [0 0 0 0]]] membership: (table axis moved first for convenience) [[[0 0 0 0] [1 3 2 0] [0 2 2 0] [0 0 0 0]]] membership: (table axis moved first for convenience) [[[0 0 0 0] [0 3 2 0] [0 2 2 0] [0 0 0 0]]] membership: (table axis moved first for convenience) [[[0 0 0 0] [0 0 2 0] [0 2 2 0] [0 0 0 0]]] membership: (table axis moved first for convenience) [[[0 0 0 0] [0 0 0 0] [0 2 0 0] [0 0 0 0]]] membership: (table axis moved first for convenience) [[[0 0 0 0] [0 0 0 0] [0 0 0 0] [0 0 0 0]]] subrectangles: [((1, 0), (3, 9), (0,)), ((3, 0), (6, 8), (0,)), ((3, 8), (6, 9), (0, 1)), ((3, 9), (9, 10), (1,)), ((6, 8), (9, 9), (1,))] Contributions of subrectangle sums to rectangle sums: sums[0] += np.sum(data[1:3, 0:9]) sums[0] += np.sum(data[3:6, 0:8]) sums[0] += np.sum(data[3:6, 8:9]) sums[1] += np.sum(data[3:6, 8:9]) sums[1] += np.sum(data[3:9, 9:10]) sums[1] += np.sum(data[6:9, 8:9]) Calculated sums over 94.73684210526316 % of terms of the original sums sums: [190. 51.] sums_validation: [190. 51.] diff: [0. 0.]

© www.soinside.com 2019 - 2024. All rights reserved.