MPI_ERR_TRUNCATE: On Broadcast

Question (votes: 5, answers: 1)

I have an int that I intend to broadcast from the root (rank == FIELD, where FIELD = 0).

int winner;

if (rank == FIELD) {
    winner = something;
}

MPI_Barrier(MPI_COMM_WORLD);
MPI_Bcast(&winner, 1, MPI_INT, FIELD, MPI_COMM_WORLD);
MPI_Barrier(MPI_COMM_WORLD);
if (rank != FIELD) {
    cout << rank << " informed that winner is " << winner << endl;
}

But it seems I get:

[JM:6892] *** An error occurred in MPI_Bcast
[JM:6892] *** on communicator MPI_COMM_WORLD
[JM:6892] *** MPI_ERR_TRUNCATE: message truncated
[JM:6892] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort

I found that I can increase the buffer size in the Bcast:

MPI_Bcast(&winner, NUMPROCS, MPI_INT, FIELD, MPI_COMM_WORLD);

NUMPROCS is the number of running processes. (Actually, it seems I only need to set it to 2.) Then it runs, but gives unexpected output:

1 informed that winner is 103
2 informed that winner is 103
3 informed that winner is 103
5 informed that winner is 103
4 informed that winner is 103

When I cout winner on the root, it should be -1.

c++ mpi broadcast openmpi
1 Answer (votes: 10)

You have an error at the beginning of your code:

if (rank == FIELD) {
   // randomly place ball, then broadcast to players
   ballPos[0] = rand() % 128;
   ballPos[1] = rand() % 64;
   MPI_Bcast(ballPos, 2, MPI_INT, FIELD, MPI_COMM_WORLD);
}

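This is a very common mistake. MPI_Bcast is a collective operation and must be called by all processes in the communicator in order to complete. In your case, this broadcast is not called by all processes in MPI_COMM_WORLD (only by the root), so it interferes with the next broadcast operation, possibly the one inside the loop. The second broadcast actually receives the message sent by the first one (two int elements) into a buffer that holds only one int, hence the truncation error.

In Open MPI, each broadcast internally uses the same message tag values, so different broadcasts can interfere with each other if they are not issued in sequence. This is compliant with the (old) MPI standard: in MPI-2.2 one cannot have more than one outstanding collective operation (in MPI-3.0 one can have several outstanding non-blocking collective operations).

You should rewrite the code as:

if (rank == FIELD) {
   // randomly place ball, then broadcast to players
   ballPos[0] = rand() % 128;
   ballPos[1] = rand() % 64;
}
MPI_Bcast(ballPos, 2, MPI_INT, FIELD, MPI_COMM_WORLD);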
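The interference described in this answer can be sketched with a toy model in plain C++. This is only an illustration under stated assumptions, not Open MPI's implementation: `ToyComm`, `bcast_root`, `bcast_recv`, and the payload values 17 and 42 are all hypothetical, and the single per-rank FIFO merely mimics "every broadcast uses the same tag, so messages are matched in arrival order".

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <stdexcept>
#include <vector>

// Toy model (hypothetical, NOT the MPI API): each rank has one FIFO of
// pending broadcast payloads, matched strictly in arrival order.
struct ToyComm {
    int size;
    std::vector<std::deque<std::vector<int>>> inbox;  // inbox[rank]

    explicit ToyComm(int n) : size(n), inbox(n) { assert(n > 0); }

    // Root side of a broadcast: queue a copy of the payload for every other rank.
    void bcast_root(int root, const std::vector<int>& payload) {
        for (int r = 0; r < size; ++r)
            if (r != root) inbox[r].push_back(payload);
    }

    // Non-root side: take the oldest pending message; fail if it is larger
    // than the receive buffer (the analogue of MPI_ERR_TRUNCATE).
    std::vector<int> bcast_recv(int rank, std::size_t count) {
        if (inbox[rank].empty()) throw std::logic_error("no pending broadcast");
        std::vector<int> msg = inbox[rank].front();
        inbox[rank].pop_front();
        if (msg.size() > count) throw std::length_error("message truncated");
        return msg;
    }
};

// Buggy pattern: only the root issues the 2-int broadcast, so rank 1's
// first receive (1 int, for winner) is matched against the 2-int message.
bool buggy_run_truncates() {
    ToyComm comm(2);
    comm.bcast_root(0, {17, 42});  // ballPos, sent by root only
    comm.bcast_root(0, {-1});      // winner, root side
    try {
        comm.bcast_recv(1, 1);     // rank 1 expects a single int
    } catch (const std::length_error&) {
        return true;
    }
    return false;
}

// Workaround from the question: enlarging the count to 2 avoids the error,
// but the receiver reads the stale first message instead of winner.
int workaround_run_winner() {
    ToyComm comm(2);
    comm.bcast_root(0, {17, 42});
    comm.bcast_root(0, {-1});
    return comm.bcast_recv(1, 2).at(0);
}

// Fixed pattern: every rank takes part in both broadcasts, in the same order.
int fixed_run_winner() {
    ToyComm comm(2);
    comm.bcast_root(0, {17, 42});
    comm.bcast_recv(1, 2);         // rank 1 also joins the ballPos broadcast
    comm.bcast_root(0, {-1});
    return comm.bcast_recv(1, 1).at(0);
}
```

In this model, the buggy run hits the truncation, the enlarged-count workaround returns the stale first element (17 here) rather than -1, and the fixed run returns -1. The stale read would be consistent with the unexpected 103 in the question, although the source does not confirm where that value came from.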
