mpiexec takes forever to start (possibly an MPIPoolExecutor problem)?

I am trying to run the following command:

mpiexec -n 1 python scratch.py

where scratch.py is the simple example provided here:

from mpi4py.futures import MPIPoolExecutor

def square(i):
    # One-time, per-worker setup guarded by a module-level flag.
    global initialized
    try:
        initialized
    except NameError:
        initialized = False
    if not initialized:
        print("expensive initialization")
        import time
        time.sleep(2)
        initialized = True
    return i**2

if __name__ == '__main__':
    # Ask for a pool of 2 workers and map square over 0..6.
    with MPIPoolExecutor(2) as ex:
        for result in ex.map(square, range(7)):
            print(result)

It hangs indefinitely.

ps aux | grep mpi gives:

username 16944 44.6  0.0 377280 14476 ?        Rl   09:46   0:03 python -R -m mpi4py.futures.server

So I know the problem lies inside MPIPoolExecutor. I also wondered whether a firewall restriction was involved, but systemctl status firewalld gives:

firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

What exactly does "python -R -m mpi4py.futures.server" do, and why does it take so long?

I have:

  • Python 3, version 3.8.0
  • MPICH, version 4.2.0

running on Scientific Linux 7.4.

CPU details:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 158
Model name:            Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz
Stepping:              9
CPU MHz:               3300.000
CPU max MHz:           3500.0000
CPU min MHz:           800.0000
BogoMIPS:              6000.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K
NUMA node0 CPU(s):     0-3

Finally, running the command python3 -R -m mpi4py.futures.server on its own produces the following error:

 Traceback (most recent call last):
  File "/usr/local/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/server.py", line 14, in <module>
    main()
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/server.py", line 10, in main
    _lib.server_main()
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/_lib.py", line 1068, in server_main
    server_main_service()
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/_lib.py", line 1057, in server_main_service
    comm = server_accept(service, info)
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/_lib.py", line 1006, in server_accept
    MPI.Publish_name(service, port, info)
  File "mpi4py/MPI/Comm.pyx", line 2755, in mpi4py.MPI.Publish_name
mpi4py.MPI.Exception: MPI_ERR_INTERN: internal error
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
  Proc: [[32891,0],0]
  Errorcode: 1
mpi openmpi mpi4py mpich mpiexec

1 Answer

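Following the suggestion in the comments under my question, I double-checked and found that mpi4py was indeed using a different MPI library from the one mpiexec is linked against. So I uninstalled mpi4py and reinstalled it against the MPICH library, following this link, and now MPIPoolExecutor runs without any problems. Thanks, everyone!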
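
For anyone hitting the same hang, a check along the following lines may help confirm this kind of mismatch before reinstalling anything. It is only a rough sketch: the helper names (mpi_family, launcher_banner, library_banner) are made up for illustration, and the substring matching on version banners is a heuristic assumption, not an official way to identify an MPI implementation. It compares what MPI.Get_library_version() reports for the library mpi4py was built against with what mpiexec --version prints. Incidentally, the "MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD" banner in the traceback in the question looks like Open MPI's abort message rather than MPICH's, which would be consistent with mpi4py having been linked against Open MPI.

```python
import shutil
import subprocess


def mpi_family(text):
    """Guess the MPI implementation from a version banner.

    Heuristic assumption: Open MPI banners mention "Open MPI" or "OpenRTE",
    while MPICH banners mention "MPICH" or its "HYDRA" launcher.
    """
    lowered = text.lower()
    if "open mpi" in lowered or "openrte" in lowered or "open-rte" in lowered:
        return "Open MPI"
    if "mpich" in lowered or "hydra" in lowered:
        return "MPICH"
    return "unknown"


def launcher_banner(launcher="mpiexec"):
    """Return the output of `<launcher> --version`, or '' if it is not on PATH."""
    if shutil.which(launcher) is None:
        return ""
    proc = subprocess.run([launcher, "--version"],
                          capture_output=True, text=True)
    return proc.stdout + proc.stderr


def library_banner():
    """Return the version string of the MPI library mpi4py is linked against."""
    try:
        from mpi4py import MPI  # importing this initializes MPI
    except ImportError:
        return ""
    return MPI.Get_library_version()


if __name__ == "__main__":
    lib = mpi_family(library_banner())
    launcher = mpi_family(launcher_banner())
    print("mpi4py is linked against:", lib)
    print("mpiexec belongs to:", launcher)
    if "unknown" in (lib, launcher):
        print("Could not identify one side; inspect the banners by hand.")
    elif lib != launcher:
        print("Mismatch: mpi4py and mpiexec come from different MPI builds.")
    else:
        print("Both sides appear to come from the same MPI implementation.")
```

Run it with plain python (not under mpiexec). If it reports a mismatch, one commonly documented fix is to rebuild mpi4py against the intended compiler wrapper, for example something like env MPICC=/path/to/mpicc python -m pip install --no-cache-dir --no-binary=mpi4py mpi4py, where /path/to/mpicc is a placeholder for the mpicc that belongs to the same installation as mpiexec.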
