Is there a way to make a long for loop run faster?

Question details  Votes: 0  Answers: 1

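I am building a simple Monte Carlo simulator that takes a 3×4 probability matrix and the number of iterations to simulate. The output is a table containing all the results. Each row of the table is a list of the form [iter no, random no, result1, result2]. The logic is simple: generate a random number and compare it against the cumulative probabilities. With 1,000,000 iterations it takes 16 seconds or more, and I am trying to find out whether a better time is possible. So far I have tried: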

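  • Using memoization, which I think makes no sense here because the returned data depends on a random number.
  • Using a numpy array and the cumsum() method to get the cumulative probabilities.
  • The best data structures I could think of are a list of tuples for the possible outcomes and a 2D list to store the probabilities.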
from random import random
import numpy as np

matrix = [
            [0.2, 0.2, 0.05, 0.1],
            [0.7, 0.5, 0.6, 0.4],
            [0.2, 0.25, 0.2, 0.04],
        ]
estados = [
    ("buena", "buena"),
    ("buena", "regular"),
    ("buena", "critica"),
    ("buena", "alta"),
    ("regular", "buena"),
    ("regular", "regular"),
    ("regular", "critica"),
    ("regular", "alta"),
    ("critica", "buena"),
    ("critica", "regular"),
    ("critica", "critica"),
    ("critica", "alta"),
]
estado_siguiente = {"buena": 0, "regular": 1, "critica": 2, "alta": 3}

def get_result(rnd, probabilities, sig_estado=None):
    probabilities= (
        probabilities[0]
        if (sig_estado == "" or sig_estado == "alta")
        else probabilities[estado_siguiente[sig_estado] + 1]
    )

    for i in range(len(probabilities[0])):
        if rnd < probabilities[0][i]:
            return probabilities[1][i]
    # Fall back to the last outcome if rnd is not below any cumulative bound.
    return probabilities[1][-1]

def start_simulation(probabilities, iter):
    vector_estado = [0, 0, "", "", 0, ""]
    tabla_final = []

    # There are 4 accumulators: one for each of the 3 rows, plus one for the
    # whole table. A new result1 depends on the previous result2, so sometimes
    # only the probabilities of a single row are needed.
    to_np_arr = np.array(probabilities)
    all_acc = (np.cumsum(to_np_arr), estados[:])
    buena_acc = (all_acc[0][0:4], estados[0:4])
    regular_acc = (all_acc[0][4:8], estados[4:8])
    critico_acc = (all_acc[0][8:], estados[8:])

    
    for _ in range(iter):
        rnd1 = random()
        estado1, estado2 = get_result(
            rnd1,
            probabilities=[all_acc, buena_acc, regular_acc, critico_acc],
            sig_estado=vector_estado[3],
        )
        
        # Append the current row to the result table
        tabla_final.append(vector_estado)

        # Replace the old row with a new one for the next iteration
        vector_estado = [
            vector_estado[0] + 1,
            "%.3f" % (rnd1),
            estado1,
            estado2,
            # NOTE: rnd2 and condicion_alta are not defined anywhere in this excerpt.
            "%.3f" % (rnd2) if rnd2 else "-",
            condicion_alta,
        ]
    return tabla_final

python list numpy loops
1 answer
0 votes

If you have a multi-core processor, you can use a parallel processing library (such as Python's multiprocessing) to distribute the work across several cores.
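
A rough sketch of how that could look, assuming the iterations can be split into chunks that do not depend on each other. That is a real assumption here: in the question each row depends on the previous result2, so every chunk below would be its own independent chain rather than one long one. The names WEIGHTS, simulate_chunk, split_iterations and run_parallel are hypothetical and only illustrate the pattern; they are not part of the original code.

```python
import random
from bisect import bisect_right
from itertools import accumulate
from multiprocessing import Pool

# Hypothetical stand-in for a single row of the probability matrix.
WEIGHTS = [0.2, 0.5, 0.2, 0.1]
OUTCOMES = ["buena", "regular", "critica", "alta"]
CUMULATIVE = list(accumulate(WEIGHTS))


def simulate_chunk(args):
    """Run n_iters independent draws with a private, seeded generator."""
    seed, n_iters = args
    rng = random.Random(seed)
    counts = dict.fromkeys(OUTCOMES, 0)
    for _ in range(n_iters):
        # Index of the first cumulative bound strictly greater than the draw.
        idx = bisect_right(CUMULATIVE, rng.random())
        # Clamp in case floating-point rounding leaves the last bound below 1.
        counts[OUTCOMES[min(idx, len(OUTCOMES) - 1)]] += 1
    return counts


def split_iterations(total, parts):
    """Split total into `parts` chunk sizes that differ by at most one."""
    base, extra = divmod(total, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def run_parallel(total_iters, workers=4):
    """Distribute the draws over `workers` processes and merge the tallies."""
    sizes = split_iterations(total_iters, workers)
    jobs = [(seed, size) for seed, size in enumerate(sizes)]
    with Pool(processes=workers) as pool:
        partials = pool.map(simulate_chunk, jobs)
    merged = dict.fromkeys(OUTCOMES, 0)
    for part in partials:
        for key, value in part.items():
            merged[key] += value
    return merged
```

Call run_parallel(1_000_000) from under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that start workers by spawning a fresh interpreter. Returning tallies instead of a million-row table keeps the data sent back between processes small. If the full table is required, that transfer cost can cancel out much of the gain.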

You can also try a profiling tool to identify the bottlenecks in your code. Python's cProfile module can help you see where most of the time is spent.
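
For instance, a minimal profiling sketch could look like the following. The function format_rows is a hypothetical toy loop that only imitates the per-iteration work in the question (drawing a number and formatting it into a row), and profile_call is an illustrative helper rather than anything from the original code.

```python
import cProfile
import io
import pstats
from random import random


def format_rows(n):
    """Toy loop that mimics building one formatted row per iteration."""
    rows = []
    for i in range(n):
        rnd = random()
        rows.append([i, "%.3f" % rnd])
    return rows


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the 10 entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


rows, report = profile_call(format_rows, 100_000)
print(report)
```

The report lists call counts and time per function, which should show whether the loop is dominated by random number generation, string formatting, list appends or the lookup function itself.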
