TensorBoard does not show the evaluation results for the last checkpoint

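I trained several object detection models on custom data for 4K steps using the TensorFlow Object Detection API, and evaluated them during training. All checkpoints were evaluated, and I could see the results in the console.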

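However, I cannot see the evaluation results for the last two checkpoints in TensorBoard. It shows the evaluation results up to step 3K and nothing after that. I can see that the evaluation completed, both in the console and in the output folder.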

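There are no error messages in the console when TensorBoard starts. The training results are fully loaded into TensorBoard; the only thing missing is the final evaluation results.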

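I tried evaluating the latest checkpoint again, but nothing changed. At the end of the evaluation I get a message saying that the metrics were written to the summary...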

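Training checkpoints are saved every 10 minutes, and one evaluation takes 12 minutes. Even so, I would expect the evaluation results for the latest checkpoint to be there.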
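
To make the timing argument concrete, here is a minimal sketch of a polling evaluator. It is a simplified model and an assumption, not the Object Detection API's actual evaluation loop: whenever the evaluator is free it picks the newest checkpoint it has not evaluated yet, and it never goes back to older ones. The function name `evaluated_checkpoints` is hypothetical.

```python
def evaluated_checkpoints(checkpoint_times, eval_minutes):
    """Return the checkpoint times a polling evaluator would process.

    Simplified model (an assumption): each time the evaluator is free, it
    evaluates the newest checkpoint that already exists and that it has not
    evaluated yet. Older checkpoints that were skipped are never revisited.
    """
    evaluated = []
    now = 0.0
    remaining = sorted(checkpoint_times)
    while remaining:
        # Checkpoints already written at time `now` and not yet evaluated.
        available = [t for t in remaining if t <= now]
        if available:
            newest = available[-1]
            evaluated.append(newest)
            # Drop the evaluated checkpoint and everything older than it.
            remaining = [t for t in remaining if t > newest]
            now += eval_minutes
        else:
            # Nothing new yet: wait until the next checkpoint is written.
            now = remaining[0]
    return evaluated


# Seven checkpoints written 10 minutes apart, 12 minutes per evaluation.
print(evaluated_checkpoints(range(0, 70, 10), 12))  # [0, 10, 20, 30, 40, 60]
```

Under this model an intermediate checkpoint can be skipped when evaluation is slower than checkpointing, but the last checkpoint is still evaluated once training stops, which matches the expectation stated above.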

When I download the CSV file from TensorBoard, the evaluations of the last two checkpoints are missing there as well.

What could be the reason?

I0311 16:57:21.281645 MainThread program.py:165] Not bringing up TensorBoard, but inspecting event files.
I0311 16:57:21.281645 140028330256128 program.py:165] Not bringing up TensorBoard, but inspecting event files.
======================================================================
Processing event files... (this can take a few minutes)
======================================================================

Found event files in:
./CN_flow1_95/eval
./CN_flow1_95/train

These tags are in ./CN_flow1_95/eval:
audio -
histograms -
images
   image-0
   image-1
   image-2
   image-3
   image-4
   image-5
   image-6
   image-7
   image-8
   image-9
scalars
   Losses/Loss/BoxClassifierLoss/classification_loss
   Losses/Loss/BoxClassifierLoss/localization_loss
   Losses/Loss/RPNLoss/localization_loss
   Losses/Loss/RPNLoss/objectness_loss
   PascalBoxes_PerformanceByCategory/AP@0.5IOU/b'cyclist'
   PascalBoxes_PerformanceByCategory/AP@0.5IOU/b'motorcyclist'
   PascalBoxes_PerformanceByCategory/AP@0.5IOU/b'pedestrian'
   PascalBoxes_Precision/mAP@0.5IOU
tensor -
======================================================================

Event statistics for ./CN_flow1_95/eval:
audio -
graph
   first_step           0
   last_step            0
   max_step             0
   min_step             0
   num_steps            1
   outoforder_steps     []
histograms -
images
   first_step           0
   last_step            4112
   max_step             4112
   min_step             0
   num_steps            7
   outoforder_steps     []
scalars
   first_step           0
   last_step            4112
   max_step             4112
   min_step             0
   num_steps            7
   outoforder_steps     []
sessionlog:checkpoint -
sessionlog:start -
sessionlog:stop -
tensor -
======================================================================

These tags are in ./CN_flow1_95/train:
audio -
histograms
   ModelVars/...
images -
scalars
   Losses/TotalLoss
   Losses/clone_0/Loss/BoxClassifierLoss/classification_loss
   Losses/clone_0/Loss/BoxClassifierLoss/localization_loss
   Losses/clone_0/Loss/RPNLoss/localization_loss
   Losses/clone_0/Loss/RPNLoss/objectness_loss
   Losses/clone_1/Loss/BoxClassifierLoss/classification_loss
   Losses/clone_1/Loss/BoxClassifierLoss/localization_loss
   Losses/clone_1/Loss/RPNLoss/localization_loss
   Losses/clone_1/Loss/RPNLoss/objectness_loss
   Losses/clone_2/Loss/BoxClassifierLoss/classification_loss
   Losses/clone_2/Loss/BoxClassifierLoss/localization_loss
   Losses/clone_2/Loss/RPNLoss/localization_loss
   Losses/clone_2/Loss/RPNLoss/objectness_loss
   batch/fraction_of_150_full
   clone_0/Losses/clone_0//clone_loss
   global_step/sec
   queue/prefetch_queue/fraction_of_5_full
tensor -
======================================================================

Event statistics for ./CN_flow1_95/train:
audio -
graph
   first_step           0
   last_step            0
   max_step             0
   min_step             0
   num_steps            1
   outoforder_steps     []
histograms
   first_step           0
   last_step            4110
   max_step             4110
   min_step             0
   num_steps            28
   outoforder_steps     []
images -
scalars
   first_step           0
   last_step            4110
   max_step             4110
   min_step             0
   num_steps            54
   outoforder_steps     []
sessionlog:checkpoint
   first_step           1
   last_step            4111
   max_step             4111
   min_step             1
   num_steps            7
   outoforder_steps     []
sessionlog:start
   outoforder_steps     []
   steps                [0, 4110]
sessionlog:stop
   outoforder_steps     []
   steps                [0, 0]
tensor -
======================================================================
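
The inspector output above only reports aggregate step statistics. A way to see the individual steps stored for each scalar tag is to load the directory with TensorBoard's `EventAccumulator`, as in the sketch below. It assumes the summaries were written as TF1-style scalar values (as the `scalars` section above suggests); the helper names `scalar_steps` and `steps_missing` are hypothetical, and the list of steps visible in the UI has to be supplied by hand.

```python
def scalar_steps(logdir):
    """Map each scalar tag in `logdir` to the sorted list of steps stored for it."""
    # Imported lazily so steps_missing() can be used without TensorBoard installed.
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    # A size guidance of 0 asks the accumulator to keep every scalar event
    # instead of a downsampled subset.
    acc = EventAccumulator(logdir, size_guidance={"scalars": 0})
    acc.Reload()
    return {
        tag: sorted(event.step for event in acc.Scalars(tag))
        for tag in acc.Tags()["scalars"]
    }


def steps_missing(steps_on_disk, steps_shown):
    """Return steps present in the event files but absent from what the UI shows."""
    shown = set(steps_shown)
    return [step for step in steps_on_disk if step not in shown]


# Usage sketch (requires TensorBoard; the path is the eval directory shown above):
#   for tag, steps in scalar_steps("./CN_flow1_95/eval").items():
#       print(tag, steps)
```

If the steps listed on disk go up to 4112 while the UI stops earlier, the data was written correctly and the gap is on the loading or display side.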
tensorflow tensorboard object-detection-api
1 Answer

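I also asked this question on the TensorBoard repo. They said there is no reason why the event files would not be loaded completely, and told me to come here...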

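Sometimes the results are displayed correctly (when there are 10-15 event files because of exhaustive testing), but most of the time they are not. I changed how often checkpoints are stored so that no checkpoint would be missed during evaluation (it should not matter, but I tried it anyway).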
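
Since the behaviour seems to vary with the number of event files, it can help to check which steps each file contains. The sketch below reads every event file in a directory separately with TensorBoard's `EventFileLoader` and groups the steps by file. The helper names are hypothetical, and the `events.out.tfevents.*` pattern is the usual event file naming, assumed here rather than taken from the question.

```python
import glob
import os
from collections import defaultdict


def steps_by_file(records):
    """Group (file_name, step) pairs into {file_name: sorted unique steps}."""
    grouped = defaultdict(set)
    for file_name, step in records:
        grouped[file_name].add(step)
    return {name: sorted(steps) for name, steps in grouped.items()}


def read_records(eval_dir):
    """Yield (file_name, step) for every summary event in each event file."""
    # Imported lazily: only this function needs TensorBoard.
    from tensorboard.backend.event_processing.event_file_loader import EventFileLoader

    pattern = os.path.join(eval_dir, "events.out.tfevents.*")
    for path in sorted(glob.glob(pattern)):
        for event in EventFileLoader(path).Load():
            # Skip graph and session-log events; keep only summaries.
            if event.HasField("summary"):
                yield os.path.basename(path), event.step


# Usage sketch (requires TensorBoard):
#   for name, steps in steps_by_file(read_records("./CN_flow1_95/eval")).items():
#       print(name, steps)
```

This shows whether the last evaluations ended up in a different event file than the earlier ones, which is one thing worth ruling out before concluding that the data is not loaded.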

I stored a checkpoint every 12 minutes, since an evaluation also takes 12 minutes. That did not help either.

All `tensorboard --inspect` results look fine.

I tried different models on different computers, and I also cleared the browser cache. Nothing really helped.

I believe there is a bug in TensorBoard.