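I want to track the run time and memory usage of a process (a bioinformatics tool) that I execute from a Python script. I run the process on a Unix cluster and save the monitored parameters in report_file.txt. To measure the elapsed time I use the resource module, and I use the psutil library to monitor memory usage.
My main goal is to compare the performance of different tools, so I don't want to limit memory or time in any way.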
import sys
import os
import subprocess, resource
import psutil
import time
def get_memory_info():
    return {
        "total_memory": psutil.virtual_memory().total / (1024.0 ** 3),
        "available_memory": psutil.virtual_memory().available / (1024.0 ** 3),
        "used_memory": psutil.virtual_memory().used / (1024.0 ** 3),
        "memory_percentage": psutil.virtual_memory().percent
    }
# Open file to capture process parameters
outrepfp = open(tbl_rep_file, "w");
### Start measuring the process parameters
SLICE_IN_SECONDS = 1
# Start measuring time
usage_start = resource.getrusage(resource.RUSAGE_CHILDREN)
# Create the line for process execution
cmd = '{0} {1} --tblout {2} {3}'.format(bioinformatics_tool, setups, resultdir, inputs)
# Execute the process
r = subprocess.Popen(cmd.split(), stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, encoding='utf-8')
# End measuring time
usage_end = resource.getrusage(resource.RUSAGE_CHILDREN) # end measuring resources
# Save memory measures
resultTable = []
while r.poll() is None:
    resultTable.append(get_memory_info())
    time.sleep(SLICE_IN_SECONDS)
# In case the process fails
if r.returncode: sys.exit('FAILED: {}\n{}'.format(cmd, r.stderr))
# Extract used memory
memory = [m['used_memory'] for m in resultTable]
# Count the elapsed time
cpu_time_user = usage_end.ru_utime - usage_start.ru_utime
cpu_time_system = usage_end.ru_stime - usage_start.ru_stime
# Write measurement to report_file.txt
outrepfp.write('{0} {1} {2} {3}\n'.format(bioinformatics_tool, cpu_time_user, cpu_time_system, memory))
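For a given process, I get the following in my report_file.txt:
bioinformatics_tool 0.0 0.0 [48.16242980957031, 47.76295852661133]
Could you help me understand why the elapsed time is reported as 0, even though memory usage was monitored for 2 seconds and two values were captured?
Previously, I implemented a time-capture mechanism that reported a run time of about 4 seconds for the same process, which seems inconsistent with my current memory-usage measurement.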
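If the goal is to measure the run time of the process started by subprocess.Popen, then usage_end = resource.getrusage(resource.RUSAGE_CHILDREN) should probably come after the loop that polls for process termination. Popen returns as soon as the child has been started, and RUSAGE_CHILDREN only accounts for children that have already terminated and been waited for, so taking the second reading right after Popen gives the same values as the first one and the difference is 0.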
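As a minimal sketch of that reordering (not the original script): the helper name measure_child, the shorter polling interval, and the stand-in command are assumptions made for illustration, and the bioinformatics tool is replaced by a short CPU-bound Python one-liner so the example can run anywhere on a Unix system. A wall-clock timer is included as well, since the user and system CPU times from getrusage are not the same thing as elapsed time.

```python
import resource
import subprocess
import sys
import time


def measure_child(cmd, slice_seconds=0.1):
    """Run cmd (a list of arguments) and return timing data for the child."""
    # First reading: CPU time of all children that have been waited for so far.
    usage_start = resource.getrusage(resource.RUSAGE_CHILDREN)
    wall_start = time.perf_counter()

    # Popen returns immediately; the child is still running at this point.
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

    # Poll until the child has exited. Periodic memory sampling would go here.
    polls = 0
    while proc.poll() is None:
        polls += 1
        time.sleep(slice_seconds)

    # Second reading, taken only after the child has terminated and been reaped,
    # so its CPU time is now included in RUSAGE_CHILDREN.
    wall_end = time.perf_counter()
    usage_end = resource.getrusage(resource.RUSAGE_CHILDREN)

    return {
        "returncode": proc.returncode,
        "polls": polls,
        "wall_seconds": wall_end - wall_start,
        "cpu_user_seconds": usage_end.ru_utime - usage_start.ru_utime,
        "cpu_system_seconds": usage_end.ru_stime - usage_start.ru_stime,
    }


# Hypothetical stand-in for the real tool: a short CPU-bound Python one-liner.
demo_cmd = [sys.executable, "-c", "sum(i * i for i in range(3_000_000))"]
result = measure_child(demo_cmd)
print(result)
```

With the second getrusage call placed after the loop, the CPU times should come out non-zero for any child that does real work. The wall-clock value is rounded up to the polling interval, so a smaller slice_seconds gives a tighter estimate at the cost of more polling.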