MPI blocks execution during send when different workloads are associated with the processors


I'm having some problems with an MPI code (written by me to test another program, in which different workloads are associated with different processors). The problem is that when I use a number of processors different from 1 or arraySize (4 in this case), the program blocks during MPI_Send. Specifically, when I run

mpirun -np 2 MPItest
the program blocks during the call. I'm not using a debugger for now; I just want to understand why it works with 1 and 4 processors but not with 2 (2 spots of the array per processor). The code is the following:

#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    int rank, size;
    const int arraySize = 4;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // every processor has a different workload (1 or more spots of the array to send to the other processors)
    // every processor sends its designated spots to every other processor


    int* sendbuf = new int[arraySize](); // value-initialized, since the print below reads every element
    int* recvbuf = new int[arraySize];

    // this rank's slice of the array; note that rank == size never holds
    // (ranks run from 0 to size-1), so this assumes arraySize is divisible by size
    int istart = arraySize/size * rank;
    int istop = (rank == size) ? arraySize : istart + arraySize/size;

    for (int i = istart; i < istop; i++) {
        sendbuf[i] = i;
    }

    std::cout << "Rank " << rank << " sendbuf :" << std::endl;
    // print sendbuf before receiving the values owned by the other ranks
    for (int i = 0; i < arraySize; i++) {
        std::cout << sendbuf[i] << ", ";
    }
    std::cout << std::endl;

    // sending designated spots of sendbuf to other processors
    for(int i = istart; i < istop; i++){
        for(int j = 0; j < size; j++){
            MPI_Send(&sendbuf[i], 1, MPI_INT, j, i, MPI_COMM_WORLD);
        }
    }

    // receiving the full array
    for(int i = 0; i < arraySize ; i++){
        int recvRank = i/(arraySize/size);
        MPI_Recv(&recvbuf[i], 1, MPI_INT, recvRank, i, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }


    // print the recvbuf after receiving its other values
    std::cout << "Rank " << rank << " recvbuf :" << std::endl;
    for (int i = 0; i < arraySize; i++) {
        std::cout << recvbuf[i] << ", ";
    }
    std::cout << std::endl;

    delete[] sendbuf;
    delete[] recvbuf;

    MPI_Finalize();
    return 0;
}

I use tags to distinguish the different spots of the array (maybe that is the problem?).

I tried with different numbers of processors: the program works with 1 processor and with 4, it crashes with 3, and it blocks with 2. I also tried MPI_Isend, but it doesn't work either (the flag is 0). The modified code with MPI_Isend is the following:

#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    int rank, size;
    const int arraySize = 4;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // every processor has a different workload (1 or more spots of the array to send to the other processors)
    // every processor sends its designated spots to every other processor


    int* sendbuf = new int[arraySize](); // value-initialized, since the print below reads every element
    int* recvbuf = new int[arraySize];

    int istart = arraySize/size * rank;
    int istop = (rank == size) ? arraySize : istart + arraySize/size;

    for (int i = istart; i < istop; i++) {
        sendbuf[i] = i;
    }

    std::cout << "Rank " << rank << " sendbuf :" << std::endl;
    // print sendbuf before receiving the values owned by the other ranks
    for (int i = 0; i < arraySize; i++) {
        std::cout << sendbuf[i] << ", ";
    }
    std::cout << std::endl;

    // sending designated spots of sendbuf to other processors
    for(int i = istart; i < istop; i++){
        for(int j = 0; j < size; j++){
            MPI_Request request;
            //MPI_Send(&sendbuf[i], 1, MPI_INT, j, i, MPI_COMM_WORLD);
            MPI_Isend(&sendbuf[i], 1, MPI_INT, j, i, MPI_COMM_WORLD, &request);
            // check whether the send has completed
            int flag = 0;
            MPI_Test(&request, &flag, MPI_STATUS_IGNORE);
            const int numberOfRetries = 10;
            if(flag == 0){ // operation not completed
                std::cerr << "Error in sending, waiting" << std::endl;
                for(int k = 0; k < numberOfRetries; k++){
                    MPI_Test(&request, &flag, MPI_STATUS_IGNORE);
                    if(flag == 1){
                        break;
                    }
                }
                if(flag == 0){
                    std::cerr << "Error in sending, aborting" << std::endl;
                    MPI_Abort(MPI_COMM_WORLD, 1);
                }
                
            }
        }
    }

    // receiving the full array
    for(int i = 0; i < arraySize ; i++){
        int recvRank = i/(arraySize/size);
        MPI_Recv(&recvbuf[i], 1, MPI_INT, recvRank, i, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }


    // print the recvbuf after receiving its other values
    std::cout << "Rank " << rank << " recvbuf :" << std::endl;
    for (int i = 0; i < arraySize; i++) {
        std::cout << recvbuf[i] << ", ";
    }
    std::cout << std::endl;

  
    //MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    delete[] sendbuf;
    delete[] recvbuf;

    MPI_Finalize();
    return 0;
}

With this code, -np 4 doesn't work either.
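
For reference, two things worth noting here. First, MPI_Send is allowed to block until a matching receive has been posted; buffering of small messages is an optimization of the implementation, not a guarantee of the standard. Second, MPI_Test returning flag == 0 is not an error: it only means the operation has not completed yet, and the idiomatic way to finish it is MPI_Wait or MPI_Waitall rather than a bounded retry loop followed by MPI_Abort. Below is a minimal sketch of the usual deadlock-free pattern for this kind of exchange: post all the nonblocking receives first, then the nonblocking sends, then complete everything with a single MPI_Waitall. This is an illustrative rewrite, not the code from the question, and it assumes arraySize is divisible by size:

#include <mpi.h>
#include <iostream>
#include <vector>

int main(int argc, char** argv) {
    int rank, size;
    const int arraySize = 4;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int chunk = arraySize / size; // assumes arraySize % size == 0
    const int istart = chunk * rank;

    std::vector<int> sendbuf(arraySize, -1), recvbuf(arraySize, -1);
    for (int i = istart; i < istart + chunk; i++) sendbuf[i] = i;

    // one request per receive plus one per send
    std::vector<MPI_Request> reqs(arraySize + chunk * size);
    int r = 0;

    // post every receive first, so no send can wait forever for a match
    for (int i = 0; i < arraySize; i++) {
        MPI_Irecv(&recvbuf[i], 1, MPI_INT, i / chunk, i, MPI_COMM_WORLD, &reqs[r++]);
    }

    // then post the sends of this rank's spots to every rank (itself included)
    for (int i = istart; i < istart + chunk; i++) {
        for (int j = 0; j < size; j++) {
            MPI_Isend(&sendbuf[i], 1, MPI_INT, j, i, MPI_COMM_WORLD, &reqs[r++]);
        }
    }

    // complete all sends and receives in one call instead of polling MPI_Test
    MPI_Waitall(r, reqs.data(), MPI_STATUSES_IGNORE);

    std::cout << "Rank " << rank << " recvbuf:";
    for (int i = 0; i < arraySize; i++) std::cout << " " << recvbuf[i];
    std::cout << std::endl;

    MPI_Finalize();
    return 0;
}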

c++ mpi workload
1 Answer

Since I haven't received any answer to this question, I want to add some insight into my problem, to help anyone who finds themselves in the same situation.

I tested another code to see whether the OpenMPI implementation on my laptop worked correctly, because too many things were going wrong that were not wrong with respect to the standard; even code examples from the internet failed to run on my laptop. I tested the following, very simple code, which sends part of an array between two processes:

#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    int rank, size;
    const int arraySize = 5;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // initialize sendbuf
    int* sendbuf = new int[arraySize](); // value-initialized so the first print is well-defined
    for(int iteration = 0; iteration < 3; iteration++){

        if(rank){
            std::cout << "Rank " << rank << " sendbuf :" << std::endl;
            for (int i = 0; i < arraySize; i++) {
                std::cout << sendbuf[i] << ", ";
            }
            std::cout << std::endl;
        }

        // first process sends its first three elements to the second process
        if(rank == 0){
            for(int i = 0; i < 3; i++){
                sendbuf[i] = i;
            }
            MPI_Send(&sendbuf[0], 3, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else {
            for(int i = 3; i < 5; i++){
                sendbuf[i] = i;
            }
        }

        // second process receives the first three elements from the first
        if(rank){
            MPI_Recv(&sendbuf[0], 3, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        // print the full array
        if(rank){
            std::cout << "Rank " << rank << " sendbuf after:" << std::endl;
            for (int i = 0; i < arraySize; i++) {
                std::cout << sendbuf[i] << ", ";
            }
            std::cout << std::endl;
        }

        // reset the buffer between iterations
        for(int i = 0; i < arraySize; i++){
            sendbuf[i] = -1;
        }
        
    }

    delete[] sendbuf;

    MPI_Finalize();
    return 0;
}

I wanted to see whether a single send and a single receive inside a loop would work on my laptop, and to my surprise (after two days of trying) the problem turned out to be my laptop and its OpenMPI installation. To rule out my hardware, I tested this code on a cluster I have access to, where the MPI implementation is known to work: the code runs on the cluster but not on my laptop.

To sum up, this is the hardware I have:

  • Kernel: 6.6.1-arch1-1
  • Architecture: x86_64
  • Bits: 64
  • Compiler: gcc
  • Model: Lenovo Legion 7 16IAX7
  • CPU: 12th Gen Intel(R) Core(TM) i7-12800HX
  • OpenMPI version: 4.1.5-5

This is not a solution, but it answers my question of why the code doesn't work.
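
A closing note on the commented-out MPI_Alltoall in the question: the communication pattern here, where every rank contributes its own contiguous block and every rank ends up with the full array, is exactly what MPI_Allgather does, so the hand-written send/receive loops can be replaced by a single collective call. A minimal sketch, again under the assumption that arraySize is divisible by size:

#include <mpi.h>
#include <iostream>
#include <vector>

int main(int argc, char** argv) {
    int rank, size;
    const int arraySize = 4;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int chunk = arraySize / size; // assumes arraySize % size == 0

    // each rank fills only its own contiguous block
    std::vector<int> myblock(chunk);
    for (int i = 0; i < chunk; i++) myblock[i] = chunk * rank + i;

    // gather every rank's block into the full array on every rank
    std::vector<int> full(arraySize);
    MPI_Allgather(myblock.data(), chunk, MPI_INT,
                  full.data(), chunk, MPI_INT, MPI_COMM_WORLD);

    std::cout << "Rank " << rank << " full array:";
    for (int i = 0; i < arraySize; i++) std::cout << " " << full[i];
    std::cout << std::endl;

    MPI_Finalize();
    return 0;
}

(MPI_Alltoall would exchange a different block with every rank, while MPI_Allgather sends the same block to everyone, which matches the intent here.)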
