OpenCL - kernel is not created, returns error code -45


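I am using the OpenCL library to write a kernel in OpenCL C and the host code in C++. Unfortunately, despite spending a long time on this problem, I keep getting the following error: OpenCL error: clCreateKernel (-45). Why? To rule out a hardware problem, I tested a sample project: https://gist.github.com/jarutis/a64eaa38c1caaf7bc3d28cea64bb8359 It works perfectly; I only needed to change one thing in its code.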

const char* kernel2Greedy = R"(
__kernel void parallelGreedy2(__global long* denominators, int numDenominators, double numberToExchange, __global char* result) {
    for (int i = 0; i < numDenominators - 1; i++) {
        for (int j = i + 1; j < numDenominators; j++) {
            if (denominators[i] < denominators[j]) {
                long temp = denominators[i];
                denominators[i] = denominators[j];
                denominators[j] = temp;
            }
        }
    }

    result[0] = '\0';

    for (int i = 0; i < numDenominators && numberToExchange > 0; i++) {
        if (numberToExchange >= denominators[i]) {
            int biggestNumberToExchangeInLoop = (int)(numberToExchange / denominators[i]);
            char temp[100];
            snprintf(temp, sizeof(temp), "%ld cash x%d\n", denominators[i], biggestNumberToExchangeInLoop);

            int offset = 0;
            while (result[offset] != '\0') {
                offset++;
            }

            for (int j = 0; temp[j] != '\0'; j++) {
                result[offset++] = temp[j];
            }
            result[offset] = '\0';

            numberToExchange = round(100 * (numberToExchange - (biggestNumberToExchangeInLoop * denominators[i]))) / 100.0;
        }
    }
}

)";
int main() {
    std::cout << "Type a number of instances you want to create:" << std::endl;
    int numberOfInstances2;
    std::cin >> numberOfInstances2;

    std::vector<long> denominators;
    std::cout << "How many numbers you would like to add?" << std::endl;
    int numberToAdd;
    std::cin >> numberToAdd;
    for (int i = 0; i < numberToAdd; i++) {
        std::cout << "Add next number:" << std::endl;
        long nextNumber;
        std::cin >> nextNumber;
        denominators.push_back(nextNumber);
    }

    // Setup OpenCL
    try {
        // device containers
        cl::Buffer bufferDenominators;
        cl::Buffer bufferResult;
        cl::Buffer bufferNumberToExchange;
        cl::Buffer bufferNumberOfDenominators;

        // create a context
        cl::Context context(CL_DEVICE_TYPE_GPU);
        // create a program
        cl::Program program(context, kernel2Greedy);

        // get the command queue
        cl::CommandQueue queue(context);

        // create the kernel
        cl::Kernel kernel(program, "parallelGreedy2");

        // copy the data to the device
        bufferDenominators = cl::Buffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(long) * denominators.size(), denominators.data());
        bufferResult = cl::Buffer(context, CL_MEM_WRITE_ONLY, sizeof(char) * 1000 * numberOfInstances2);
        bufferNumberToExchange = cl::Buffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(double), &numberToAdd);
        bufferNumberOfDenominators = cl::Buffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(int), &numberToAdd);

        // set kernel arguments
        kernel.setArg(0, bufferDenominators);
        kernel.setArg(1, bufferNumberOfDenominators);
        kernel.setArg(2, bufferNumberToExchange);
        kernel.setArg(3, bufferResult);

        // run the kernel
        queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(numberOfInstances2), cl::NullRange);

        queue.finish();

        // copy the data back
        std::vector<char> result(numberOfInstances2 * 1000);
        queue.enqueueReadBuffer(bufferResult, CL_TRUE, 0, sizeof(char) * 1000 * numberOfInstances2, result.data());

        // Print results
        for (int i = 0; i < numberOfInstances2; ++i) {
            std::cout << "Result for instance " << i << ":" << std::endl;
            std::cout << std::string(result.begin() + (i * 1000), result.begin() + ((i + 1) * 1000)) << std::endl;
        }
    } catch (cl::Error& e) {
        std::cerr << "OpenCL error: " << e.what() << " (" << e.err() << ")" << std::endl;
        return 1;
    }

    return 0;
}

I still get OpenCL error: clCreateKernel (-45).

The algorithm itself works correctly (I checked).

What am I missing?

parallel-processing opencl
1 answer

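Error -45 (CL_INVALID_PROGRAM_EXECUTABLE) means that the OpenCL C code has not been compiled into an executable. The program is never built: before the cl::Kernel is constructed, the call

program.build({ device }, "");

is missing, along with the code that selects the device (GPU):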

vector<cl::Device> cl_devices;
vector<cl::Platform> cl_platforms; // get all platforms (drivers)
cl::Platform::get(&cl_platforms);
for(int i=0; i<(int)cl_platforms.size(); i++) {
    vector<cl::Device> cl_devices_available;
    cl_platforms[i].getDevices(CL_DEVICE_TYPE_ALL, &cl_devices_available);
    for(int j=0; j<(int)cl_devices_available.size(); j++) {
        cl_devices.push_back(cl_devices_available[j]);
    }
}
cl::Device device = cl_devices[0]; // select device here

In addition, the context must be created for the specific device you selected, not for the CL_DEVICE_TYPE_GPU device type:

cl::Context context(device);
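
Numeric codes such as -45 are hard to read in a log, so a small lookup can help while debugging. The sketch below is plain C++ with no OpenCL dependency. The function name clErrorName is hypothetical and not part of any OpenCL API. The numeric values are the ones defined in the Khronos CL/cl.h header, and only a handful of codes related to program and kernel creation are covered.

```cpp
#include <string>

// Maps a few common OpenCL status codes to their symbolic names.
// clErrorName is a hypothetical helper, not an OpenCL API function.
// The values match the definitions in the Khronos CL/cl.h header.
inline std::string clErrorName(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -1:  return "CL_DEVICE_NOT_FOUND";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -44: return "CL_INVALID_PROGRAM";
        case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
        case -46: return "CL_INVALID_KERNEL_NAME";
        case -48: return "CL_INVALID_KERNEL";
        default:  return "unknown OpenCL error (" + std::to_string(code) + ")";
    }
}
```

In the catch block from the question, printing clErrorName(e.err()) next to e.what() would show CL_INVALID_PROGRAM_EXECUTABLE instead of the bare number.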


Do yourself a favor and use this lightweight OpenCL-Wrapper, which eliminates all of this code overhead and makes OpenCL development much simpler.
