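After implementing this matrix-by-vector multiplication with a Parallel.For loop, I need to sum the results using Kahan summation to reduce numerical error. Is that possible?
Here is my implementation, with the Kahan sum in the Local Finally delegate: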
    double[] qTotal = new double[order];
    double[] c = new double[order];
    Parallel.For(0,
        order,
        () => new double[order], // LocalInit
        (i, loopState, q) => // Main Body
        {
            int indexFirst = I[i];
            q[i] += A[indexFirst] * d[J[indexFirst]];
            for (int j = indexFirst + 1; j < I[i + 1]; j++)
            {
                int col = J[j];
                double a = A[j];
                q[i] += a * d[col];
                q[col] += a * d[i];
            }
            return q;
        },
        q => // Local Finally
        {
            lock (qTotal)
            {
                // Kahan Sum
                for (int i = 0; i < q.Length; i++)
                {
                    double y = q[i] - c[i]; // So far, so good: c is zero.
                    double t = qTotal[i] + y; // Alas, sum is big, y small, so low-order digits of y are lost.
                    c[i] = (t - qTotal[i]) - y; // (t - sum) recovers the high-order part of y; subtracting y recovers -(low part of y)
                    qTotal[i] = t; // Algebraically, c should always be zero. Beware overly-aggressive optimizing compilers!
                    // Next time around, the lost low part will be added to y in a fresh attempt.
                }
            }
        });
It doesn't seem to work as expected: the results differ slightly from the non-parallel version. Am I doing something wrong?
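If you don't expect many thread collisions during the accumulation (you are doing a matrix multiplication, so there probably won't be), you can try using Interlocked instead of lock. Interlocked is faster and saves some memory. Because each update modifies two numbers (the running sum and its compensation term), you need a gate to guard them. Interlocked does not support bool, so the gate has to be an int.
Write a function like this: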
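    // Requires: using System.Threading;
    private static void ThreadSafeKahanSum(ref double Value, ref double C, ref int gate, double addend)
    {
        // Spin until this thread flips the gate from 0 (open) to 1 (held).
        while (Interlocked.CompareExchange(ref gate, 1, 0) != 0)
        {
            //You should not get inside of here very often.
            //You can keep a collision tally for research if you like:
            //Interlocked.Increment(ref kahanSumCollisionCount);
            //If you are getting a lot of collisions, then maybe find another way.
            //Or just leave it blank.
        }
        double y = addend - C;
        double t = Value + y;
        C = (t - Value) - y;
        Value = t;
        // Release the gate. Volatile.Write keeps the stores above from being
        // reordered after the release, which a plain "gate = 0" does not guarantee.
        Volatile.Write(ref gate, 0);
    }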
Now rewrite your code like this:
    double[] qTotal = new double[order];
    double[] c = new double[order];
    int[] gates = new int[order];
    Parallel.For(0, order,
        (i, loopState) => // Main Body
        {
            int indexFirst = I[i];
            ThreadSafeKahanSum(ref qTotal[i], ref c[i], ref gates[i], A[indexFirst] * d[J[indexFirst]]);
            for (int j = indexFirst + 1; j < I[i + 1]; j++)
            {
                int col = J[j];
                double a = A[j];
                ThreadSafeKahanSum(ref qTotal[i], ref c[i], ref gates[i], a * d[col]);
                ThreadSafeKahanSum(ref qTotal[col], ref c[col], ref gates[col], a * d[i]);
            }
        });
This way, every single addition goes through the Kahan sum, and the update is thread-safe.
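To try the gate idea in isolation, here is a minimal, self-contained sketch. It is only an illustration under assumptions: the names `KahanDemo`, `NaiveSum` and `ParallelKahanSum` are made up for this example, the data is just the constant 0.1 added many times, and the gate is released with `Volatile.Write` rather than a plain store. It compares a plain sequential sum against a gated Kahan sum driven from `Parallel.For`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class KahanDemo
{
    // Gate-protected Kahan update, repeated here so the sketch compiles on its own.
    private static void ThreadSafeKahanSum(ref double value, ref double c, ref int gate, double addend)
    {
        // Spin until the gate goes from 0 (open) to 1 (held by this thread).
        while (Interlocked.CompareExchange(ref gate, 1, 0) != 0) { }
        double y = addend - c;
        double t = value + y;
        c = (t - value) - y;
        value = t;
        Volatile.Write(ref gate, 0); // release the gate
    }

    // Plain sequential accumulation, for comparison.
    public static double NaiveSum(int n, double addend)
    {
        double sum = 0.0;
        for (int k = 0; k < n; k++) sum += addend;
        return sum;
    }

    // Adds 'addend' n times from many threads into a single gated Kahan accumulator.
    public static double ParallelKahanSum(int n, double addend)
    {
        double[] total = new double[1];
        double[] comp = new double[1];
        int[] gate = new int[1];
        Parallel.For(0, n, k => ThreadSafeKahanSum(ref total[0], ref comp[0], ref gate[0], addend));
        return total[0];
    }

    public static void Main()
    {
        const int n = 1000000;
        double exact = n / 10.0; // mathematically, 0.1 added n times
        Console.WriteLine("naive error: " + Math.Abs(NaiveSum(n, 0.1) - exact));
        Console.WriteLine("kahan error: " + Math.Abs(ParallelKahanSum(n, 0.1) - exact));
    }
}
```

All threads here hit one gate, which is the worst case for collisions. In the matrix code there is one gate per row, so contention should be much lower.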
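If anyone has a better approach, please let me know. I have been facing this same problem with Kahan sums for years, and this is the best thread-safe way of doing a Kahan sum that I have come up with.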