mpiexec fails because MPI initialization aborts

Question (votes: 11, answers: 3)

I am trying to install MPICH2 on a 64-bit machine running Ubuntu 11.04 (Natty Narwhal). I used

sudo apt-get install mpich2

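First, I was surprised to find that mpd was not installed. Searching on Google, I learned that Hydra is the new default process manager. So I tried to run my MPI code and got the following error.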

> -------------------------------------------------------------------------------------------
> [ip-10-99-75-58:02212] [[INVALID],INVALID] ORTE_ERROR_LOG: A
> system-required executable either could not be found or was not
> executable by this user in file
> ../../../../../../orte/mca/ess/singleton/ess_singleton_module.c at
> line 357 [ip-10-99-75-58:02212] [[INVALID],INVALID] ORTE_ERROR_LOG: A
> system-required executable either could not be found or was not
> executable by this user in file
> ../../../../../../orte/mca/ess/singleton/ess_singleton_module.c at
> line 230 [ip-10-99-75-58:02212] [[INVALID],INVALID] ORTE_ERROR_LOG: A
> system-required executable either could not be found or was not
> executable by this user in file ../../../orte/runtime/orte_init.c at
> line 132
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process
> is likely to abort.  There are many reasons that a parallel process
> can fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   orte_ess_set_name failed   --> Returned value A system-required
> executable either could not be found or was not executable by this
> user (-127) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process
> is likely to abort.  There are many reasons that a parallel process
> can fail during MPI_INIT; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   ompi_mpi_init: orte_init failed   --> Returned "A system-required
> executable either could not be found or was not executable by this
> user" (-127) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** The MPI_Init() function was called before MPI_INIT was invoked.
> *** This is disallowed by the MPI standard.
> *** Your MPI job will now abort.
> -------------------------------------------------------------------------------------------

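First, this looks to me like an Open MPI error. But I installed MPICH2, not Open MPI.

Second, I am stuck on this problem because all the help I can find seems to be aimed at Open MPI users. Am I missing something?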

mpich
3 Answers
13
votes

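I had the same problem on Ubuntu 12.04. The cause was that both open-mpi and mpich2 were installed on my machine. When I compiled my program with mpicc, it was linked against open-mpi rather than mpich2. To fix this, compile the program with "mpicc.mpich2" and then run it with "mpiexec.mpich2".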
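As a rough sketch of the approach above, a script can prefer the MPICH2-suffixed tool names whenever they are on PATH, since the plain mpicc and mpiexec may resolve to Open MPI when both implementations are installed. The helper name pick_mpi_tool and the source file hello.c are hypothetical; only the .mpich2 suffix comes from the answer.

```shell
#!/bin/sh
# Hypothetical helper: print the MPICH2-suffixed tool name if it is on PATH,
# otherwise fall back to the plain name (which may resolve to Open MPI).
pick_mpi_tool() {
    tool_name="$1"
    if command -v "${tool_name}.mpich2" >/dev/null 2>&1; then
        printf '%s\n' "${tool_name}.mpich2"
    else
        printf '%s\n' "${tool_name}"
    fi
}

MPICC=$(pick_mpi_tool mpicc)
MPIEXEC=$(pick_mpi_tool mpiexec)
echo "compiler wrapper: ${MPICC}"
echo "launcher:         ${MPIEXEC}"

# Build and run only when both tools and the (hypothetical) source file exist.
if command -v "${MPICC}" >/dev/null 2>&1 \
    && command -v "${MPIEXEC}" >/dev/null 2>&1 \
    && [ -f hello.c ]; then
    "${MPICC}" -o hello hello.c && "${MPIEXEC}" -n 2 ./hello
fi
```

Using the same suffix for both the compiler wrapper and the launcher matters: a binary linked against one implementation will generally not start correctly under the other implementation's mpiexec.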


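2
votes

Indeed, these error messages all come from Open MPI. For some reason, you appear to also have a (misconfigured) copy of Open MPI installed somewhere. You can check which file is actually executed when you type mpiexec by running which mpiexec. I believe you can then compare that path against the output of: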

dpkg --listfiles mpich2

(or something similar) to determine where the MPICH2 package was installed.
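The comparison described above can be sketched as follows, assuming a Debian or Ubuntu system with dpkg and a readlink that supports -f (GNU coreutils does). The function names resolve_tool and owned_by_mpich2 are hypothetical, and the package may carry a different name on other releases.

```shell
#!/bin/sh
# Print the real file behind a command name, following any symlinks
# (the plain mpiexec is often a symlink to an implementation-specific binary).
resolve_tool() {
    tool_path=$(command -v "$1") || return 1
    readlink -f "${tool_path}"
}

# Succeed if the given path is listed among the files of the mpich2 package.
owned_by_mpich2() {
    dpkg --listfiles mpich2 2>/dev/null | grep -Fxq "$1"
}

if real=$(resolve_tool mpiexec); then
    echo "mpiexec resolves to: ${real}"
    if owned_by_mpich2 "${real}"; then
        echo "this file belongs to the mpich2 package"
    else
        echo "this file is NOT listed by 'dpkg --listfiles mpich2'"
    fi
else
    echo "mpiexec is not on PATH"
fi
```

If the resolved path is not listed by the package, the mpiexec being run most likely belongs to another MPI installation, which would match the Open MPI messages in the question.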


0
votes

I ran into this and then found the cause. Somewhere on the system, during startup, LD_PRELOAD was being set to point at libmpi.so from OpenMPI.

Example:

export LD_PRELOAD=<some_directory>/openmpi/1.4.4/lib/libmpi.so

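As a result, MPICH2 failed. Simply run "unset LD_PRELOAD" before running MPICH2 and the problem goes away.

Note that OpenMPI sometimes needs LD_PRELOAD set to its libmpi.so in order to work properly, so unsetting it may break your OpenMPI. If you need to use OpenMPI again, remember to set it back.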
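One way to sidestep that trade-off is to drop LD_PRELOAD for a single command rather than for the whole shell. The sketch below assumes an env that supports the -u option (GNU coreutils does); the function names and the ./a.out binary are hypothetical.

```shell
#!/bin/sh
# Succeed if the given LD_PRELOAD value mentions an MPI library.
preload_mentions_mpi() {
    case "$1" in
        *libmpi*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Run a command with LD_PRELOAD removed from its environment only;
# the calling shell (and any later OpenMPI run) keeps its setting.
run_without_preload() {
    env -u LD_PRELOAD "$@"
}

if preload_mentions_mpi "${LD_PRELOAD:-}"; then
    echo "warning: LD_PRELOAD points at an MPI library: ${LD_PRELOAD}"
fi

# Launch only if the MPICH2 launcher and a (hypothetical) binary are present.
if command -v mpiexec.mpich2 >/dev/null 2>&1 && [ -x ./a.out ]; then
    run_without_preload mpiexec.mpich2 -n 2 ./a.out
fi
```

Because the variable is removed only for the child process, nothing has to be restored afterwards.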


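0
votes

Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!

Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.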
