Convert pipeline_pb2.TrainEvalPipelineConfig to a JSON or YAML file for the TensorFlow Object Detection API

Problem description

I want to convert a pipeline_pb2.TrainEvalPipelineConfig into JSON or YAML file format for the TensorFlow Object Detection API. I tried to convert the protobuf file as follows:

import json

import tensorflow as tf
from google.protobuf import text_format
import yaml

from object_detection.protos import pipeline_pb2

def get_configs_from_pipeline_file(pipeline_config_path, config_override=None):
  '''
  Read a pipeline .config file and parse it into a TrainEvalPipelineConfig proto.
  '''

  pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
  with tf.gfile.GFile(pipeline_config_path, "r") as f:
    proto_str = f.read()
    text_format.Merge(proto_str, pipeline_config)
  if config_override:
    text_format.Merge(config_override, pipeline_config)
  #print(pipeline_config)
  return pipeline_config


def create_configs_from_pipeline_proto(pipeline_config):
  '''
  Return the configurations as a dictionary.
  '''

  configs = {}
  configs["model"] = pipeline_config.model
  configs["train_config"] = pipeline_config.train_config
  configs["train_input_config"] = pipeline_config.train_input_reader
  configs["eval_config"] = pipeline_config.eval_config
  configs["eval_input_configs"] = pipeline_config.eval_input_reader
  # Keeps eval_input_config only for backwards compatibility. All clients should
  # read eval_input_configs instead.
  if configs["eval_input_configs"]:
    configs["eval_input_config"] = configs["eval_input_configs"][0]
  if pipeline_config.HasField("graph_rewriter"):
    configs["graph_rewriter_config"] = pipeline_config.graph_rewriter

  return configs


configs = get_configs_from_pipeline_file('pipeline.config')
config_as_dict = create_configs_from_pipeline_proto(configs)

But when I try to convert the returned dictionary to YAML with yaml.dump(config_as_dict), it says:

TypeError: can't pickle google.protobuf.pyext._message.RepeatedCompositeContainer objects

And for json.dumps(config_as_dict), it says:

Traceback (most recent call last):
  File "config_file_parsing.py", line 48, in <module>
    config_as_json = json.dumps(config_as_dict)
  File "/usr/lib/python3.5/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.5/json/encoder.py", line 198, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.5/json/encoder.py", line 256, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.5/json/encoder.py", line 179, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: label_map_path: "label_map.pbtxt"
shuffle: true
tf_record_input_reader {
  input_path: "dataset.record"
}
 is not JSON serializable

Any help here would be greatly appreciated.

json dictionary tensorflow yaml object-detection-api
1 Answer

JSON can only dump a subset of the Python primitives, plus the dict and list collections (with limitations on self-referencing).
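To make that limitation concrete, here is a minimal sketch using only the standard library. The InputReader class is a hypothetical stand-in for a non-primitive object such as a protobuf message; it is not part of the Object Detection API.

```python
import json


class InputReader:
    """Hypothetical stand-in for a non-primitive object (e.g. a protobuf message)."""

    def __init__(self, label_map_path):
        self.label_map_path = label_map_path


# Primitives nested in dicts and lists serialize without any help.
plain = {"shuffle": True, "paths": ["dataset.record"], "num_readers": 1}
plain_json = json.dumps(plain, sort_keys=True)

# An arbitrary object as a value makes the whole dump fail with TypeError.
try:
    json.dumps({"reader": InputReader("label_map.pbtxt")})
    custom_failed = False
except TypeError:
    custom_failed = True

# A self-referencing container is rejected as well (circular reference).
loop = []
loop.append(loop)
try:
    json.dumps(loop)
    loop_failed = False
except ValueError:
    loop_failed = True
```

The TypeError branch is the same failure mode as the traceback in the question: the encoder reaches a value it has no rule for.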

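YAML is more powerful and can be used to dump arbitrary Python objects, but only if those objects can be "investigated" during the representation phase of the dump, which essentially limits this to instances of pure Python classes. For objects created at the C level, one can write explicit dumpers; if none is available, the dump falls back to the pickle protocol to get the data into YAML, which is where the "can't pickle" error above comes from.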
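The pickle fallback can be illustrated without YAML at all. The sketch below assumes that the fallback amounts to calling the object's `__reduce_ex__(2)` method, which is my understanding of PyYAML's generic object representer; `Point` is a hypothetical pure Python class and `threading.Lock` stands in for any C-level object without pickle support.

```python
import threading


class Point:
    """Pure Python class: its state lives in an ordinary instance __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# For a pure Python instance, the reduce protocol exposes the state,
# so a dumper can inspect it and write it out.
reduced = Point(1, 2).__reduce_ex__(2)
state = reduced[2]

# A lock is implemented in C and refuses to be reduced, much like the
# RepeatedCompositeContainer in the question.
lock = threading.Lock()
try:
    lock.__reduce_ex__(2)
    lock_error = None
except TypeError as exc:
    lock_error = str(exc)
```

Here `state` holds the instance attributes of the `Point`, while `lock_error` carries a "pickle" complaint comparable to the one raised for the protobuf container.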

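Inspecting protobuf on PyPI tells me that non-generic wheels are available, which is always an indication of some C code optimization. Inspecting one of those files does indeed show a pre-compiled shared object.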

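Although you create a dict from your config, that dict can only be dumped when all of its keys and all of its values can be dumped. Since your keys are strings (as JSON requires), you need to look at each of the values, find the ones that do not dump, and convert them to a dumpable object structure (dict/list for JSON, pure Python class for YAML).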
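That value-by-value check can be sketched roughly as follows. The helper names `find_undumpable` and `make_dumpable`, and the `FakeMessage` class, are hypothetical and exist only for this illustration; with real protobuf values, the `convert` callable would be whatever turns a message into dicts and lists.

```python
import json


def find_undumpable(mapping):
    """Return the keys whose values json.dumps rejects."""
    bad = []
    for key, value in mapping.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            bad.append(key)
    return bad


def make_dumpable(mapping, convert):
    """Return a copy in which every rejected value is replaced by convert(value)."""
    bad = set(find_undumpable(mapping))
    return {k: (convert(v) if k in bad else v) for k, v in mapping.items()}


class FakeMessage:
    """Hypothetical stand-in for a protobuf message."""

    def __init__(self, **fields):
        self.fields = fields


configs = {"name": "demo", "train_config": FakeMessage(batch_size=24)}
bad_keys = find_undumpable(configs)
clean = make_dumpable(configs, lambda message: dict(message.fields))
as_json = json.dumps(clean, sort_keys=True)
```

Probing each value with `json.dumps` is a blunt but simple way to locate the offending entries before deciding how to convert them.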

You may want to take a look at the json_format module (google.protobuf.json_format).
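As a rough sketch of how that module could be applied here: `json_format.MessageToDict` turns a whole message, including its repeated fields, into plain dicts, lists and primitives. The example below uses `descriptor_pb2.FileDescriptorProto` purely as a stand-in message that ships with protobuf; in the question's code the argument would presumably be the `pipeline_config` object itself rather than the hand-built dictionary. The helper name `proto_to_dict` is hypothetical.

```python
import json

from google.protobuf import descriptor_pb2, json_format


def proto_to_dict(message):
    """Convert a protobuf message into plain dicts, lists and primitives."""
    # preserving_proto_field_name keeps snake_case names instead of lowerCamelCase.
    return json_format.MessageToDict(message, preserving_proto_field_name=True)


# Stand-in message with a repeated message field, similar in shape to
# eval_input_reader on TrainEvalPipelineConfig.
message = descriptor_pb2.FileDescriptorProto(name="demo.proto")
message.message_type.add(name="Foo")

as_dict = proto_to_dict(message)
as_json = json.dumps(as_dict, indent=2, sort_keys=True)
```

Since `as_dict` contains only built-in types, it should also be acceptable to `yaml.safe_dump` if PyYAML is installed. `json_format.MessageToJson(message)` produces a JSON string in one step when the intermediate dict is not needed.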
