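In addition to the raw output of a cell, I would also like to see the time spent executing it.
To that end, I tried %%timeit -r1 -n1, but it does not expose the variables defined within the cell.
%%time works only for cells that contain a single statement.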
In[1]: %%time
1
CPU times: user 4 µs, sys: 0 ns, total: 4 µs
Wall time: 5.96 µs
Out[1]: 1
In[2]: %%time
# Notice there is no out result in this case.
x = 1
x
CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 5.96 µs
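What is the best way to do this?
I have been using Execute Time from Nbextensions for quite some time now. It works great.
Use a cell magic and this project on GitHub by Phillip Cloud:
Load it by putting the following at the top of your notebook, or put it in your config file if you want it loaded by default: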
%install_ext https://raw.github.com/cpcloud/ipython-autotime/master/autotime.py
%load_ext autotime
Once loaded, the output of every subsequent cell execution will include the time (in minutes and seconds) it took to run.
You can use the timeit magic function for that.
%timeit CODE_LINE
Or on a whole cell:
%%timeit SOME_CELL_CODE
See more IPython magic functions at https://nbviewer.jupyter.org/github/ipython/ipython/blob/1.x/examples/notebooks/Cell%20Magics.ipynb
When in doubt about what something means, run:
?%timeit
or ??timeit
to get the details:
Usage, in line mode:
  %timeit [-n<N> -r<R> [-t|-c] -q -p<P> -o] statement
or in cell mode:
  %%timeit [-n<N> -r<R> [-t|-c] -q -p<P> -o] setup_code
  code
  code...

Time execution of a Python statement or expression using the timeit
module.  This function can be used both as a line and cell magic:

- In line mode you can time a single-line statement (though multiple
  ones can be chained with using semicolons).

- In cell mode, the statement in the first line is used as setup code
  (executed but not timed) and the body of the cell is timed.  The cell
  body has access to any variables created in the setup code.
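The setup/body split described in the help text mirrors the standard-library timeit module. The following is a minimal sketch in plain Python, outside a notebook. The names `ns`, `setup_code`, and `body` are illustrative assumptions, not part of IPython. It also hints at why names assigned in a timed body are not visible afterwards: the standard-library timeit compiles the statement into a function, so assignments stay local to it, and IPython's magic most likely behaves the same way for a similar reason.

```python
import timeit

# Stand-in for the notebook's user namespace (assumption for this sketch).
ns = {}

setup_code = "data = list(range(1000))"  # executed but not timed
body = "total = sum(data)"               # the timed statement

timer = timeit.Timer(stmt=body, setup=setup_code, globals=ns)

# One repeat of one loop, comparable to the -r1 -n1 flags.
durations = timer.repeat(repeat=1, number=1)
print("elapsed: {:.6f} s".format(durations[0]))

# The variable assigned inside the timed statement did not leak out.
print("total" in ns)
```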
The only way I found to overcome this problem is to execute the last statement with print.
Do not forget that cell magics start with %% and line magics start with %.
%%time
clf = tree.DecisionTreeRegressor().fit(X_train, y_train)
res = clf.predict(X_test)
print(res)
%time and %timeit are now part of IPython's built-in magic commands.
An easier way is to use the ExecuteTime plugin from the jupyter_contrib_nbextensions package.
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
jupyter nbextension enable execute_time/ExecuteTime
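I simply added %%time at the beginning of the cell and got the time. You can do the same on a Jupyter Spark cluster or in a virtual environment: just add %%time at the top of the cell and you will get the output. On a Spark cluster using Jupyter, I added it to the top of the cell and got output like this: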
[1] %%time
import pandas as pd
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
import numpy as np
.... code ....
Output :-
CPU times: user 59.8 s, sys: 4.97 s, total: 1min 4s
Wall time: 1min 18s
This is not very pretty, but it needs no extra software:
class timeit():
    # Importing inside the class body makes datetime a class attribute.
    from datetime import datetime

    def __enter__(self):
        self.tic = self.datetime.now()

    def __exit__(self, *args, **kwargs):
        print('runtime: {}'.format(self.datetime.now() - self.tic))
Then you can run it like this:
with timeit():
    # your code, e.g.,
    print(sum(range(int(1e7))))

% 49999995000000
% runtime: 0:00:00.338492
import time
start = time.time()
"the code you want to test stays here"
end = time.time()
print(end - start)
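A variation on the same start/end idea, sketched under the assumption that you can wrap the code to be measured in a function. It uses time.perf_counter, which is intended for measuring short intervals and is not affected by system clock adjustments the way time.time can be. The helper name `timed` is hypothetical.

```python
import time


def timed(func, *args, **kwargs):
    """Hypothetical helper: call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# The result stays available for later cells, along with the timing.
result, elapsed = timed(sum, range(10_000))
print(result)
print("runtime: {:.6f} s".format(elapsed))
```

Because the helper returns the result instead of discarding it, the measured value can still be used afterwards, which is the thing %%timeit does not give you.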
You might also want to look at Python's profiling magic command %prun, which gives output along these lines:
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total
Then
%prun sum_of_lists(1000000)
will return
         14 function calls in 0.714 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    0.599    0.120    0.599    0.120 <ipython-input-19>:4(<listcomp>)
        5    0.064    0.013    0.064    0.013 {built-in method sum}
        1    0.036    0.036    0.699    0.699 <ipython-input-19>:1(sum_of_lists)
        1    0.014    0.014    0.714    0.714 <string>:1(<module>)
        1    0.000    0.000    0.714    0.714 {built-in method exec}
I find it useful when working with large chunks of code.
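Outside a notebook, a similar report can be produced with the standard-library cProfile and pstats modules. This is a rough sketch rather than what %prun does internally; it reuses the sum_of_lists function from above with a smaller N so that it runs quickly, and the variable names `profiler`, `buffer`, and `report` are illustrative.

```python
import cProfile
import io
import pstats


def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total


# Collect profiling data only while the function of interest runs.
profiler = cProfile.Profile()
profiler.enable()
value = sum_of_lists(10_000)
profiler.disable()

# Render the statistics into a string, sorted by internal time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(5)
report = buffer.getvalue()
print(report)
```

The exact rows depend on the Python version; for example, newer interpreters may not list the list comprehension as a separate entry.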