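I have been trying to optimize my code.
I compared four possible ways of reading the value in a single cell of a list of lists (or of a list whose inner lists are replaced by arrays).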
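import time
import numpy as np

M = 1000
my_list = [[] for i in range(M)]
for i in range(M):
    for j in range(M):
        my_list[i].append(0)
my_numpy_list = [np.full(M, 1) for i in range(M)]

# Case 1: index the outer list and the inner list on every access
time1 = time.time()
for j in range(1000):
    for i in range(10000):
        my_list[0][0]
print("1 ", time.time() - time1)

# Case 2: fetch the inner list once, then index it
time1 = time.time()
for j in range(1000):
    test_list = my_list[0]
    for i in range(10000):
        test_list[0]
print("2 ", time.time() - time1)

# Case 3: index the outer list and the numpy array on every access
# (time1 is not reset here, so this value also includes case 2)
for j in range(1000):
    for i in range(10000):
        my_numpy_list[0][0]
print("3 ", time.time() - time1)

# Case 4: fetch the numpy array once, then index it
# (time1 is still not reset, so this value includes cases 2 and 3)
for j in range(1000):
    my_numpy_test_list = my_numpy_list[0]
    for i in range(10000):
        my_numpy_test_list[0]
print("4 ", time.time() - time1)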
On my computer, it prints the following times:
1 0.9008669853210449
2 0.7616724967956543
3 2.9174351692199707
4 4.883266925811768
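The question is: why does it take longer to access a value in a numpy array? And if it really is slower, should I convert the arrays to lists to access the data faster? I am especially surprised that the case where the array is first taken out of the list (case 4) is the slowest. I expected the times to rank as:
4 < 2 < 3 < 1?
Cheers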
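Because the goal of numpy is not to make access to individual elements faster. Instead, the goal of numpy is to let you write vectorized code and avoid loops.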
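One likely reason single-element access costs more: indexing a list returns the `int` object already stored in it, while indexing an ndarray has to build a new NumPy scalar object each time. Note also that, as posted, the question's script never resets `time1` before cases 3 and 4, so those two printed values are cumulative; the per-case differences are roughly 2.16 s and 1.97 s. The sketch below is only an illustration with names of my own choosing (`arr`, `row`). It also shows `tolist()`, which is the usual way to get a plain list if you really do need element-by-element access.

```python
import numpy as np

M = 1000
arr = np.full(M, 1)      # one NumPy row, as in the question
row = arr.tolist()       # same values as a plain Python list of ints

# Indexing a list hands back the int object that is already stored there.
list_item = row[0]
# Indexing an ndarray creates a new NumPy scalar object on each access.
array_item = arr[0]

print(type(list_item))   # built-in int
print(type(array_item))  # a NumPy integer type; the exact width depends on the platform
```

Whether converting pays off depends on what the rest of the code does: if the work can be written as whole-array operations, keeping the data in numpy is usually the better choice.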
Let's modify your example so that the code adds 1 to the elements of the list / np.array:
import time
import numpy as np

M = 1000
my_list = [[] for i in range(M)]
for i in range(M):
    for j in range(M):
        my_list[i].append(0)
my_numpy_array = np.array([np.full(M, 1) for i in range(M)])

# List case: add 1 inside an explicit Python loop
time1 = time.time()
for j in range(1000):
    test_list = my_list[0]
    for i in range(10000):
        test_list[0] + 1
print("list case addition", time.time() - time1)

# Numpy case: add 1 to the whole array in one vectorized operation
time2 = time.time()
my_numpy_list = my_numpy_array + 1
print("numpy case addition", time.time() - time2)
The output is:
list case addition 0.7961978912353516
numpy case addition 0.0031096935272216797
That is roughly 250 times faster.
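One caveat on that comparison: the list loop above performs 1000 × 10000 scalar additions on a single cell, while the numpy line adds 1 to 1000 × 1000 elements. A like-for-like version, where both sides add 1 to every cell of an M × M grid, could look like the sketch below. The function names `add_one_list` and `add_one_numpy` are my own, and `timeit` is swapped in for `time.time()` so that each case gets its own timer. The ratio you measure will vary with the machine and the numpy version.

```python
import timeit
import numpy as np

M = 1000
my_list = [[0] * M for _ in range(M)]
my_numpy_array = np.zeros((M, M), dtype=np.int64)

def add_one_list():
    # Pure-Python version: visits every cell in an explicit loop.
    return [[value + 1 for value in row] for row in my_list]

def add_one_numpy():
    # Vectorized version: one call, the loop runs inside numpy.
    return my_numpy_array + 1

# Check that both versions compute the same grid before timing them.
list_result = add_one_list()
numpy_result = add_one_numpy()
same = np.array_equal(np.array(list_result), numpy_result)
print("results match:", same)

# Time each version separately over a few repetitions.
list_seconds = timeit.timeit(add_one_list, number=5)
numpy_seconds = timeit.timeit(add_one_numpy, number=5)
print("list case addition ", list_seconds)
print("numpy case addition", numpy_seconds)
```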