我们都知道矩阵转置是多么有用,编写顺序使用的通用算法也没有问题。但是,我在为多线程目的做同样的事情时遇到了一些麻烦,而且只有4 x 4的这个小案例才能正常工作。
我的方法是将double [] []结构的相等部分(在本例中为2 x 2)分配给四个线程中的每一个。在这种情况下,这意味着起始位置为0,0和0,2和2,0和2,2。这是通过“kol”和“rad”传递的。
但是,我不能让它在更大的矩阵上工作,所以任何帮助都会受到赞赏。我发现这个问题的最接近答案是:How to to parallelize the matrix transpose?
这也是我将double [] []结构分为四个部分的灵感。我的(工作)4 x 4代码可以在下面找到,那么如何修改它以适用于四个线程?
干杯!
public double[][] transponerMatrise(double[][] matrise, int rad, int
kol, int id)
{
if((id != 2))
{
for (int i = rad; i < n/2 + rad; i++)
{
for (int j = kol+1; j < n/2 + kol; j++)
{
System.out.println("Traad " + id + " bytter " + i + "," + j + " med " + j + "," + i);
System.out.println("Value is " + matrise[i][j] + ", " + matrise[j][i]);
element = matrise[i][j];
matrise[i][j] = matrise[j][i];
matrise[j][i] = element;
}
}
}
else
{
for (int i = rad; i < n/2 + rad-1; i++)
{
for (int j = kol; j < n/2 + kol; j++)
{
System.out.println("Traad " + id + " bytter " + i + "," + j + " med " + j + "," + i);
System.out.println("Value is " + matrise[i][j] + ", " + matrise[j][i]);
element = matrise[i][j];
matrise[i][j] = matrise[j][i];
matrise[j][i] = element;
}
}
}
return matrise;
}
PS:我知道代码工作正常,因为我有一个针对工作顺序变体的检查方法。
O(N^2)
成本函数的先验失败的斗争如果我可以把你的注意力转向一种更聪明的方法,只需要几乎O(1)
(恒定)成本,这个技巧将帮助你开始以更有前途的方式开展工作。
可以尝试在高性能调整(缓存行友好的“原始” - 矩阵元素的存储之上添加一个薄抽象层。这个抽象层将有助于访问“原始” - 存储(使用索引轴,索引 - 大步和切片技巧 - 就像HPC FORTRAN库激发了众所周知的numpy
和它的大步技巧一样)这样的方式.T
方法将不会做任何昂贵的事情(类似推土机N^2
内存位置,在那里交换)但只是简单在抽象层中交换轴 - 参数,负责间接映射器,对于任何大的matrise[N,N]; where N = 10, 100, 1000, 10000, 100000, 1000000, 10000000+
仍然有几个[ns]
需要几十纳秒。
没有什么比这更快或更复杂的矩阵操作更快,FORTRAN和numpy
,性能抛光,方法都是这种观察的证明。
也许使用线程将行交换为col可能是一个简单的想法,但因此您需要两倍于矩阵的内存,这可能是巨大矩阵的问题。如果你有一个6核CPU我认为使用100线程没有太大的好处所以我的线程池非常小。正如@ user3666197提到它仍然是一个昂贵的解决方案 - 但是并列;-)
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class MatrixTransposition {
public static void main(final String[] args) throws InterruptedException {
final MatrixTransposition transposition = new MatrixTransposition();
final int[][] source = transposition.create(32);
final int[][] transposed = transposition.solve(source);
System.out.println("Compare source and transpositon = " + transposition.compare(source, transposed));
final int[][] result = transposition.solve(transposed);
System.out.println("Compare source and double transpositon = " + transposition.compare(source, result));
transposition.print(source);
transposition.print(transposed);
}
public boolean compare(final int[][] a, final int[][] b) {
for (int r = 0; r < a.length; r++) {
for (int c = 0; c < a[0].length; c++) {
if (a[r][c] != b[r][c]) return false;
}
}
return true;
}
public int[][] create(final int size) {
final int[][] result = new int[size][size];
for (int r = 0; r < size; r++) {
for (int c = 0; c < size; c++) {
result[r][c] = r * size + c;
}
}
return result;
}
public void print(final int[][] input) {
final int size = input.length;
final int maxNr = size * size;
final int digits = new String(maxNr + "").length();
final String cellFormat = "%0" + digits + "d ";
for (int r = 0; r < input.length; r++) {
final int[] row = input[r];
for (final int c : row) {
System.out.print(String.format(cellFormat, c));
}
System.out.println("");
}
System.out.println("");
}
public int[][] solve(final int[][] input) throws InterruptedException {
final int width = input.length;
final int height = input[0].length;
final int[][] result = new int[width][height];
final CountDownLatch latch = new CountDownLatch(width);
for (int r = 0; r < width; r++) {
final int row = r;
threadPool.execute(() -> {
solvePart(result, input, row);
latch.countDown();
});
}
latch.await();
return result;
}
private void solvePart(final int[][] result, final int[][] input, final int r) {
System.out.println("Solve row " + String.format("%02d", r) + " in thread " + Thread.currentThread().getName());
final int[] row = input[r];
for (int c = 0; c < row.length; c++) {
result[c][r] = row[c];
}
}
private final ExecutorService threadPool = Executors.newFixedThreadPool(6);
}
基于user3666197的方法,你可以做类似的事情:
public class Matrix {
private class Index {
Index(final int row, final int col) {
super();
this.row = row;
this.col = col;
}
int col;
int row;
}
public Matrix(final int rows, final int cols) {
this.rows = rows;
this.cols = cols;
data = new int[rows][cols];
}
public int get(final int row, final int col) {
return get(getIndex(row, col));
}
public void set(final int row, final int col, final int value) {
set(getIndex(row, col), value);
}
public void transpose() {
transpositioned = !transpositioned;
}
private int get(final Index index) {
return data[index.row][index.col];
}
private Index getIndex(final int row, final int col) {
return transpositioned ? new Index(col, row) : new Index(row, col);
}
private void set(final Index index, final int value) {
data[index.row][index.col] = value;
}
private final int cols;
private final int[][] data;
private final int rows;
private boolean transpositioned;
}