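I have a very simple multiprocessing function that runs a dummy workload. I run it several times with different chunk sizes for imap_unordered and different process counts for multiprocessing.Pool.
The problem is that the output shows that, no matter how many processes or what chunksize I pass, imap_unordered hands the entire list to just one process.
The expected output is that the list is split into chunks, each chunk is assigned to a process, and I see the processes receiving lists of different sizes.
Am I missing something here?
I have the following code and output: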
import multiprocessing

def run_task(iterable):
    # Task to be executed, simulating dummy work
    work_simul = 0
    for number in iterable:
        work_simul += number * number
    return (len(iterable))

def test_run(proc, chunksize):
    # Runs the function "run_task" with multiprocessing
    # Defining our dummy iterable of length 5000
    iterable = [i**i for i in range(5000)]
    original_size = len(iterable)  # Size of the iterable for comparison
    results = {}
    with multiprocessing.Pool(processes=proc) as p:
        for process, r_value in \
                enumerate(p.imap_unordered(run_task, (iterable,),
                                           chunksize=chunksize)):
            # Add our process number and its return value into results
            # so that we can compare performance here.
            results[process + 1] = r_value
    print(
        f"""Original size: {original_size}
Total process # {proc}\nChunksize # {chunksize}""")
    for key in results.keys():
        print(f"Process # {key}: has list length {results[key]}\n\n")

if __name__ == "__main__":
    test_run(1, 10)
    test_run(5, 10)
    test_run(10, 10)
    test_run(1, 100)
    test_run(5, 100)
    test_run(10, 100)
Output:
Original size: 5000
Total process # 1
Chunksize # 10
Process # 1: has list length 5000
Original size: 5000
Total process # 5
Chunksize # 10
Process # 1: has list length 5000
Original size: 5000
Total process # 10
Chunksize # 10
Process # 1: has list length 5000
Original size: 5000
Total process # 1
Chunksize # 100
Process # 1: has list length 5000
Original size: 5000
Total process # 5
Chunksize # 100
Process # 1: has list length 5000
Original size: 5000
Total process # 10
Chunksize # 100
Process # 1: has list length 5000
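For comparison, imap_unordered calls the function once per item of the iterable it is given, and chunksize only controls how many of those items are batched together when they are sent to a worker. The tuple (iterable,) contains a single item, so there is only one call regardless of the pool size. Below is a minimal sketch of the behaviour described as expected above, assuming the list is split by hand before it reaches the pool. The helper split_into_chunks, the wrapper chunked_run, and the n_items parameter are hypothetical names introduced for illustration, and run_task is changed to also return the worker's process ID.

```python
import multiprocessing
import os


def split_into_chunks(items, size):
    """Hypothetical helper: yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_task(chunk):
    # Same dummy work as in the question, applied to one sub-list.
    work_simul = 0
    for number in chunk:
        work_simul += number * number
    # Return the worker's PID so we can see which process handled the chunk.
    return os.getpid(), len(chunk)


def chunked_run(proc, size, n_items=1000):
    # Assumption: a small list of plain integers stands in for the real data.
    items = list(range(n_items))
    with multiprocessing.Pool(processes=proc) as p:
        # Each sub-list is one item of the iterable, so each is one task.
        return list(p.imap_unordered(run_task, split_into_chunks(items, size)))


if __name__ == "__main__":
    for pid, length in chunked_run(proc=4, size=100):
        print(f"worker pid {pid} received a list of length {length}")
```

With this layout the number of tasks equals the number of sub-lists, and the PIDs printed may or may not all differ, since the pool is free to give several tasks to the same worker.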