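I tried the following code, but performance differs greatly between the cases. I have heard that top-level code is not suitable for numerical computation, but performance also seems to depend on whether a top-level variable (here, N) appears in the range of the for loop. Is it always better to avoid such top-level variables?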
N = 10000000
# case1 (slow)
x = 0.0
@time for k = 1:N
x += float( k )
end
# case2 (slow)
@time let
y = 0.0
for j = 1:N
y += float( j )
end
end
# case3 (very fast)
@time let
n::Int64
n = N
z = 0.0
for m = 1:n
z += float( m )
end
end
# case 4 (slow)
function func1()
c = 0.0
for i = 1:N
c += float( i )
end
end
# case 5 (fast)
function func2( n )
c = 0.0
for i = 1:n
c += float( i )
end
end
# case 6 (fast)
function func3()
n::Int
n = N
c = 0.0
for i = 1:n
c += float( i )
end
end
# case 7 (slow)
function func4()
n = N # n = int( N ) is also slow
c = 0.0
for i = 1:n
c += float( i )
end
end
@time func1()
@time func2( N )
@time func3()
@time func4()
The results obtained with Julia 0.3.7 (on Linux x86_64), listed in order for cases 1 through 7, are
elapsed time: 2.595440598 seconds (959985496 bytes allocated, 10.70% gc time)
elapsed time: 2.469471127 seconds (959983688 bytes allocated, 11.49% gc time)
elapsed time: 1.608e-6 seconds (16 bytes allocated)
elapsed time: 2.535243279 seconds (960021976 bytes allocated, 11.21% gc time)
elapsed time: 0.002601149 seconds (75592 bytes allocated)
elapsed time: 0.003471583 seconds (84456 bytes allocated)
elapsed time: 2.480343146 seconds (960020752 bytes allocated, 11.48% gc time)
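The literal answer to "Is it always better to avoid such top-level variables?" is of course "No, it depends", but one useful remark is that declaring the global variable as a constant
const N = 10000000
makes case 2 as fast as case 3.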
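The effect of const can be sketched with two otherwise identical functions. This is only an illustration, assuming a recent Julia 1.x rather than the 0.3.7 used in the question; the names N_plain, N_const, sum_plain and sum_const are hypothetical, and the loop bound is smaller than in the question to keep the run short.

```julia
# Hypothetical names, chosen for illustration only.
N_plain = 1_000_000          # non-constant global: may be rebound to any type later
const N_const = 1_000_000    # constant global: the compiler can rely on it being an Int

function sum_plain()
    c = 0.0
    for i = 1:N_plain        # the type of this range is not known at compile time
        c += float(i)
    end
    return c
end

function sum_const()
    c = 0.0
    for i = 1:N_const        # the range is known to be a UnitRange{Int}
        c += float(i)
    end
    return c
end

# Call each function once first so that compilation is not included in the timing.
sum_plain(); sum_const()
@time sum_plain()
@time sum_const()
```

Both functions return the same sum; only the timing and allocation figures reported by @time should differ, and the exact numbers depend on the machine and Julia version.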
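Edit: I should add that the problem with case 2 is that the top-level N makes the type of the range 1:N, and consequently of the iteration variable j, unstable, even though the accumulator variable y is local. A more flexible remedy for this problem is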
let
y = 0.0
for j = 1:(N::Int)
y += float( j )
end
y
end
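The same type assertion works inside a function, and passing the global as an argument (as in case 5 of the question) is another way to give the loop a concrete type. The sketch below assumes a recent Julia 1.x; sum_asserted and sum_to are hypothetical names.

```julia
N = 10_000_000               # non-constant global, as in the question

# Type assertion at the point of use: throws a TypeError if N is not an Int
# when the function runs, and otherwise lets the loop be compiled for Int.
function sum_asserted()
    c = 0.0
    for i = 1:(N::Int)
        c += float(i)
    end
    return c
end

# Passing the value as an argument: the function is compiled for the
# concrete type of n, so the global is only read at the call site.
function sum_to(n)
    c = 0.0
    for i = 1:n
        c += float(i)
    end
    return c
end

@time sum_asserted()
@time sum_to(N)
```

The assertion keeps N free to be reassigned to another Int value later, which const does not allow cleanly; the argument-passing version additionally works for any integer type.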