Nsight Compute cannot profile WaveGlow (a PyTorch application)

Question (votes: 0, answers: 1)

I am trying to profile https://github.com/NVIDIA/waveglow with this command:

nv-nsight-cu-cli --export ./nsight_output ~/.virtualenvs/waveglow/bin/python3 inference.py -f <(ls mel_spectrograms/*.pt) -w waveglow_256channels.pt -o . --is_fp16 -s 0.6

The Python command comes from the instructions at https://github.com/NVIDIA/waveglow#generate-audio-with-our-pre-existing-model. It works under Nsight Systems, but not under Nsight Compute.

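Profiling never finishes and keeps printing the log below, so I pressed Ctrl+C. Also, it only profiles one kernel, but my application has more kernels (I checked with Nsight Systems).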

...
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 286: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 287: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 288: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 289: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 290: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 291: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 292: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 293: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 294: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 295: 0%....50%....100% - 48 passes
==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 296: 0%....50%...^C
==PROF== Received signal, trying to shutdown target application
 - 43 passes
==ERROR== Failed to profile kernel "weight_norm_fwd_first_dim_ker..." in process
==ERROR== An error occurred while trying to profile.
==ERROR== An error occurred while trying to profile
==PROF== Report: nsight_compute_result.nsight-cuprof-report

OS: CentOS Linux 7. Nsight Compute: 2019.3.1 (Build 26317742). GPU: Tesla V100-PCIE-32GB.

How can I solve this problem?

pytorch nsight
1 Answer
1 vote

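I don't think anything is going wrong here; the tool is behaving as expected. It did not profile only one kernel. According to your log output, it had already reached kernel launch 296 when you interrupted it, and all of those launches appear to come from the same kernel function. The log also shows each launch being replayed in 48 passes to collect the requested metrics, which is why the run looks as if it never finishes.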
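To check this against a longer log, the `==PROF== Profiling` lines can be counted per kernel name. The following is only a minimal sketch: the regular expression is written against the line format shown in the question's log, and the helper name `summarize_profiling_log` is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# ==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 286: 0%....50%....100% - 48 passes
# The pattern is an assumption based on the log format shown in the question.
PROF_LINE = re.compile(r'^==PROF== Profiling "(?P<name>[^"]+)" - (?P<id>\d+):')


def summarize_profiling_log(lines):
    """Return (launch count per kernel name, launch ID on the last matching line)."""
    per_kernel = Counter()
    last_id = None
    for line in lines:
        match = PROF_LINE.match(line)
        if match:
            per_kernel[match.group("name")] += 1
            last_id = int(match.group("id"))
    return per_kernel, last_id


# Three lines copied from the log in the question.
sample = [
    '==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 295: 0%....50%....100% - 48 passes',
    '==PROF== Profiling "weight_norm_fwd_first_dim_ker..." - 296: 0%....50%...^C',
    '==PROF== Received signal, trying to shutdown target application',
]
counts, last_id = summarize_profiling_log(sample)
for name, count in counts.items():
    print(f"{name}: {count} launch(es)")
print("last launch ID seen:", last_id)
```

If the console output was saved to a file, passing the open file object to `summarize_profiling_log` works the same way, since it only iterates over lines.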

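You can use the --launch-count or --kernel-regex options to control how many kernel launches, or which kernels, are profiled. You can also use --metrics and --section to control which metrics are collected for each kernel, since collecting fewer metrics reduces the tool's overhead.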
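As a rough illustration of combining these options, the sketch below assembles a restricted `nv-nsight-cu-cli` command line and prints it. The helper `build_profile_command` and the values passed to it (a limit of 10 launches, the pattern `weight_norm_fwd`, the `SpeedOfLight` section, and a `mel_files.txt` list standing in for the shell process substitution in the question) are assumptions chosen for the example, not recommended settings. Flag names follow the 2019.3 release named in the question; check the help output of your own version before relying on them.

```python
import shlex


def build_profile_command(target, launch_count=None, kernel_regex=None,
                          sections=(), export="./nsight_output"):
    """Assemble an nv-nsight-cu-cli command line as a list of arguments.

    Hypothetical helper for illustration; it only builds the list and
    does not run the profiler.
    """
    cmd = ["nv-nsight-cu-cli", "--export", export]
    if launch_count is not None:
        # Stop after this many profiled kernel launches.
        cmd += ["--launch-count", str(launch_count)]
    if kernel_regex is not None:
        # Only profile kernels whose name matches this pattern.
        cmd += ["--kernel-regex", kernel_regex]
    for section in sections:
        # Collect only the named sections instead of the full default set.
        cmd += ["--section", section]
    return cmd + list(target)


# mel_files.txt is a hypothetical file list prepared beforehand; it replaces
# the <(ls mel_spectrograms/*.pt) process substitution used in the question.
target = ["python3", "inference.py", "-f", "mel_files.txt",
          "-w", "waveglow_256channels.pt", "-o", ".", "--is_fp16", "-s", "0.6"]

cmd = build_profile_command(target, launch_count=10,
                            kernel_regex="weight_norm_fwd",
                            sections=["SpeedOfLight"])
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run(cmd)` on a machine where the profiler is installed.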

See https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#command-line-options for the other available command-line options.
