When cufftPlanMany

Question

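Are there any other causes of CUFFT_INTERNAL_ERROR?

I run a 2D cuFFT on each set of inputs; every set has the same input size but a different batch size.
The input array is 360 (rows) x 90 (columns), and the batch size is usually 10 (sometimes up to 100).
But I get "CUFFT_INTERNAL_ERROR" at a particular set (set 640 in my case).

There is nothing special about the input of any particular set. However, with the same input data, the error always occurs at the same set.

I was previously using CUDA 12.0. Suspecting a CUDA version problem, I reinstalled with 12.4. However, only the set at which the error occurs changed (640 -> 721); on every retry, the error now occurs at set 721.

I stopped in debug mode at the point where the error occurs and checked the whole code for memory leaks.
However, I was able to confirm that memory usage did not keep increasing before the stop, as shown in the figure below.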

// Environment
RTX 3090, Visual Studio 2022, Windows 10, CUDA 12.4

Below is the cuFFT part of the code I am using.

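#define CHECK_CUFFT(cond) check_cufft(cond, __LINE__)
// Returns TRUE on success so callers can test the result; reports and asserts on failure.
inline int check_cufft(cufftResult err, const int line)
{
    if (CUFFT_SUCCESS != err) {
        // Report the caller's line (the `line` parameter), not __LINE__ of this helper.
        fprintf(stderr, "CUFFT error at line %d\nerror %d: %s\nterminating!\n",
                line, (int)err, _cudaGetErrorEnum(err));
        cudaDeviceReset();
        assert(0);
        return FALSE;
    }
    return TRUE;
}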
...

// In the main code, inside the for loop:
int nFFTbatch = 10; //(Random between 1~100)
int nEstCompen[2] = { 360 , 90 };
int nEstCompenNum = 90 * 360;
cufftHandle planPreCompen;
cufftRes = cufftPlanMany(&planPreCompen, 2, nEstCompen, nEstCompen, 1, nEstCompenNum, nEstCompen, 1, nEstCompenNum, CUFFT_C2C, nFFTbatch );
cudaStatus = CHECK_CUFFT(cufftRes); if (cudaStatus != TRUE) return FALSE; // CUFFT_INTERNAL_ERROR is reported here
cufftRes = cufftExecC2C(planPreCompen, mCompFFT, mCompFFT, CUFFT_FORWARD);
cudaStatus = CHECK_CUFFT(cufftRes); if (cudaStatus != TRUE) return FALSE;
cufftRes = cufftDestroy(planPreCompen);
cudaStatus = CHECK_CUFFT(cufftRes); if (cudaStatus != TRUE) return FALSE;
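
The plan parameters above describe a fully contiguous layout: each transform covers 360 x 90 complex elements, and consecutive transforms are 360 * 90 elements apart. The buffer passed to cufftExecC2C therefore has to hold nFFTbatch * 360 * 90 cufftComplex values. The following is only a minimal sketch of that arithmetic, not the code from the question. The helper names `batchedElementCount` and `runBatchedFft` are hypothetical. The sketch passes NULL for inembed/onembed, which cuFFT documents as selecting the default contiguous layout, and it assumes the program is linked against the cuFFT library.

```cuda
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

// Hypothetical helper: number of complex elements that a contiguous
// batched rows x cols transform reads and writes.
inline size_t batchedElementCount(int rows, int cols, int batch)
{
    return static_cast<size_t>(rows) * static_cast<size_t>(cols) *
           static_cast<size_t>(batch);
}

// Hypothetical helper: runs an in-place batched 2D C2C forward FFT on a
// zero-filled buffer. Returns true only if every step succeeded.
inline bool runBatchedFft(int rows, int cols, int batch)
{
    const size_t bytes =
        batchedElementCount(rows, cols, batch) * sizeof(cufftComplex);
    cufftComplex* d_data = nullptr;
    if (cudaMalloc((void**)&d_data, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc of %zu bytes failed\n", bytes);
        return false;
    }
    cudaMemset(d_data, 0, bytes);

    int n[2] = { rows, cols };
    cufftHandle plan;
    // NULL inembed/onembed selects the default contiguous layout; the
    // stride and distance arguments are then ignored by cuFFT.
    cufftResult res = cufftPlanMany(&plan, 2, n,
                                    NULL, 1, rows * cols,
                                    NULL, 1, rows * cols,
                                    CUFFT_C2C, batch);
    if (res != CUFFT_SUCCESS) {
        fprintf(stderr, "cufftPlanMany failed: %d\n", (int)res);
        cudaFree(d_data);
        return false;
    }

    res = cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);
    // Synchronize so that execution errors are observed before cleanup.
    const bool ok = (res == CUFFT_SUCCESS) &&
                    (cudaDeviceSynchronize() == cudaSuccess);
    cufftDestroy(plan);
    cudaFree(d_data);
    return ok;
}
```

With the sizes from the question, `batchedElementCount(360, 90, 10)` is 324000 elements, and 3240000 elements for a batch of 100.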

Answer 1

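@Robert Crovella
I found the cause. It was simply caused by thrust::sort, which sits just before the cuFFT call and is part of the most recently added code.
But I am still puzzled, because I was checking for errors with "cudaGetLastError()".
I will update this once I have analyzed what the problem is.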

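#define CHECK_CUFFT(cond) check_cufft(cond, __LINE__)
// Returns TRUE on success so callers can test the result; reports and asserts on failure.
inline int check_cufft(cufftResult err, const int line)
{
    if (CUFFT_SUCCESS != err) {
        // Report the caller's line (the `line` parameter), not __LINE__ of this helper.
        fprintf(stderr, "CUFFT error at line %d\nerror %d: %s\nterminating!\n",
                line, (int)err, _cudaGetErrorEnum(err));
        cudaDeviceReset();
        assert(0);
        return FALSE;
    }
    return TRUE;
}
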
struct CompPriority {
    double Mag;
    double Class;
    int Index; // Additional field for indexing
    int OriginIdx;
};

    CompPriority* d_compArray;
    cudaStatus = CHECK_CUDA(cudaMalloc((void**)&d_compArray, nMICD_Num * sizeof(CompPriority)), cu_dir); if (cudaStatus != TRUE) return FALSE;
    // Get the compensation results into a Thrust vector
    getCompenPriorInfo<<<nbsetPerDetect, threadsPerBlock, 0, streams[nGpuIdx][nThreadIdx]>>>(resMicdFull_d, d_compArray, nMICD_Num, 1);
    cudaStatus = CHECK_CUDA(cudaGetLastError(), cu_dir); if (cudaStatus != TRUE) return FALSE;

    thrust::device_vector<CompPriority> compVector(d_compArray, d_compArray + nMICD_Num);
    thrust::sort(thrust::device, compVector.begin(), compVector.end(), CompCategoryAndMagComparator());
    // Assign indices based on MATLAB logic
    assignIndices<<<(nMICD_Num + 255) / 256, 256>>>(thrust::raw_pointer_cast(compVector.data()), nMICD_Num, MAX_COMPENTOTAL);
    cudaStatus = CHECK_CUDA(cudaGetLastError(), cu_dir); if (cudaStatus != TRUE) return FALSE;
    setCompenPriority<<<nbsetPerDetect, threadsPerBlock, 0, streams[nGpuIdx][nThreadIdx]>>>(resMicdFull_d, thrust::raw_pointer_cast(compVector.data()), nMICD_Num, 1);
    cudaStatus = CHECK_CUDA(cudaGetLastError(), cu_dir); if (cudaStatus != TRUE) return FALSE;
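
On the open point above (why the "cudaGetLastError()" checks did not flag anything), one possible explanation is offered here as an assumption, not as a diagnosis of this code. Kernel launches are asynchronous, so a cudaGetLastError() call placed immediately after a launch reports only launch-time errors. A fault that happens while the kernel is running is returned by a later synchronizing call, and it can leave the context in an error state in which an unrelated library call such as cufftPlanMany fails. The sketch below shows one way to check both kinds of error. The names `gridSizeFor`, `checkKernel`, `fillIndices` and `fillAndCheck` are hypothetical and do not appear in the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: blocks needed to cover n elements, the same
// rounding-up arithmetic as (nMICD_Num + 255) / 256 in the code above.
inline int gridSizeFor(int n, int blockSize)
{
    return (n + blockSize - 1) / blockSize;
}

// Hypothetical helper: reports launch-time and execution-time errors separately.
inline bool checkKernel(const char* label)
{
    // Launch-time problems, such as an invalid launch configuration.
    cudaError_t launchErr = cudaGetLastError();
    if (launchErr != cudaSuccess) {
        fprintf(stderr, "%s: launch failed: %s\n", label,
                cudaGetErrorString(launchErr));
        return false;
    }
    // Waits for the device; faults raised while the kernel ran are returned here.
    cudaError_t execErr = cudaDeviceSynchronize();
    if (execErr != cudaSuccess) {
        fprintf(stderr, "%s: execution failed: %s\n", label,
                cudaGetErrorString(execErr));
        return false;
    }
    return true;
}

// Illustrative kernel only; it stands in for the kernels in the answer.
__global__ void fillIndices(int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

// Hypothetical usage: launch, then check before any cuFFT call is made.
inline bool fillAndCheck(int* d_out, int n)
{
    fillIndices<<<gridSizeFor(n, 256), 256>>>(d_out, n);
    return checkKernel("fillIndices");
}
```

Synchronizing after every launch costs performance, so a check like this is best treated as a debugging aid for narrowing down which step first reports a failure.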
