I am trying to run the following command:
mpiexec -n 1 python scratch.py
where scratch.py is the simple example provided here:
from mpi4py.futures import MPIPoolExecutor


def square(i):
    # Run the expensive setup only once per worker process.
    global initialized
    try:
        initialized
    except NameError:
        initialized = False
    if not initialized:
        print("expensive initialization")
        import time
        time.sleep(2)
        initialized = True
    return i**2


if __name__ == '__main__':
    with MPIPoolExecutor(2) as ex:
        for result in ex.map(square, range(7)):
            print(result)
It hangs indefinitely.
ps aux | grep mpi gives:
username 16944 44.6 0.0 377280 14476 ? Rl 09:46 0:03 python -R -m mpi4py.futures.server
So I know the problem is somewhere inside MPIPoolExecutor. I also wondered whether a firewall restriction could be involved, but systemctl status firewalld gives:
firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
What exactly does "python -R -m mpi4py.futures.server" do, and why does it take so long?
My setup:
I am using Scientific Linux 7.4.
CPU details:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz
Stepping: 9
CPU MHz: 3300.000
CPU max MHz: 3500.0000
CPU min MHz: 800.0000
BogoMIPS: 6000.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-3
Finally, running python3 -R -m mpi4py.futures.server on its own produces the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/server.py", line 14, in <module>
    main()
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/server.py", line 10, in main
    _lib.server_main()
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/_lib.py", line 1068, in server_main
    server_main_service()
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/_lib.py", line 1057, in server_main_service
    comm = server_accept(service, info)
  File "/home/people/hkaya/.local/lib/python3.8/site-packages/mpi4py/futures/_lib.py", line 1006, in server_accept
    MPI.Publish_name(service, port, info)
  File "mpi4py/MPI/Comm.pyx", line 2755, in mpi4py.MPI.Publish_name
mpi4py.MPI.Exception: MPI_ERR_INTERN: internal error
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
Proc: [[32891,0],0]
Errorcode: 1
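Following the suggestion in the comments under my question, I double-checked and found that mpi4py was indeed linked against a different MPI library from the one mpiexec belongs to. I therefore uninstalled mpi4py and reinstalled it against the MPICH library following this link, and now MPIPoolExecutor runs without any problems. Thanks, everyone!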
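
For anyone hitting the same hang, a rough way to check for this kind of mismatch is sketched below. It compares the library string reported by MPI.Get_library_version() with the output of mpiexec --version. The helpers mpi_family, mpiexec_version and report are names made up for this sketch, and the marker strings are a heuristic assumption about what Open MPI and MPICH usually print, not an official list, so an "unknown" result only means the guess failed.

```python
import shutil
import subprocess

# Heuristic markers (an assumption, not an official list): substrings that
# Open MPI and MPICH typically include in their version banners.
_MARKERS = {
    "Open MPI": ("open mpi", "openrte", "open-mpi"),
    "MPICH": ("mpich", "hydra"),
}


def mpi_family(text):
    """Guess which MPI implementation a version string refers to."""
    lowered = text.lower()
    for family, markers in _MARKERS.items():
        if any(marker in lowered for marker in markers):
            return family
    return "unknown"


def mpiexec_version():
    """Return the output of `mpiexec --version`, or '' if it cannot be run."""
    path = shutil.which("mpiexec")
    if path is None:
        return ""
    try:
        proc = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return ""
    return proc.stdout + proc.stderr


def report():
    """Print which MPI family mpi4py and mpiexec each appear to belong to."""
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not importable in this interpreter")
        return
    linked = mpi_family(MPI.Get_library_version())
    launcher = mpi_family(mpiexec_version())
    print("mpi4py is linked against:", linked)
    print("mpiexec on PATH belongs to:", launcher)
    if "unknown" not in (linked, launcher) and linked != launcher:
        print("Mismatch: mpi4py and mpiexec come from different MPI libraries")


if __name__ == "__main__":
    report()
```

Run it with plain python (no mpiexec needed). If the two lines disagree, rebuilding mpi4py against the library that provides mpiexec, as I did above, is the likely fix. mpi4py.get_config() can also be printed to see which compiler wrapper mpi4py was built with.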