Create a list of dot-separated strings from nested YAML

Question

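My YAML input contains objects nested at various levels. I need a Python function that walks the whole structure and produces a list of strings in which each nested field is joined to its parents with dots, e.g. Object1.Object2.Object3.Object4... An example is below.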

I tried to implement this with a recursive function. My code snippet:

tests = []
test2 = {}
def test(config, parent=None):
    previous_parent = None
    names = []
    for column in config:
    
        if column.get("dtype") in ["array", "struct"]:
            parent = column["name"]
            print(f"parent: {parent}")
            test(column["columns"], parent)
            
        else:
            value = column["name"]
            print(f"value: {value}")
            # names.append(value)

The output is:

value: PartitionDate
value: TransactionID
value: EventTimestamp
parent: ControlTransaction
value: StoreID
parent: RetailTransaction
value: StoreID
value: WorkstationID
...

Input:

columns:
- name: PartitionDate
- name: TransactionID
- name: EventTimestamp
- name: ControlTransaction
  dtype: struct
  columns:
    - name: StoreID
    - name: WorkstationID
    - name: Transaction
      dtype: struct
      columns:
        - name: TransactionID
    - name: TransactionNumber
- name: ControlType
- name: RetailTransaction
  dtype: struct
  columns:
    - name: StoreID
    - name: WorkstationID

Expected output:

[
PartitionDate,
TransactionID,
EventTimestamp,
ControlTransaction.StoreID,
ControlTransaction.WorkstationID,
ControlTransaction.Transaction.TransactionID,
ControlTransaction.TransactionNumber,
ControlType,
RetailTransaction.StoreID,
RetailTransaction.WorkstationID
]
python python-3.x nested yaml
1 Answer

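Only a few changes are needed:

  1. Replace the parent=None parameter with parents=[] so that the full list of parent names is available.
  2. If the column contains nested "columns":
    • Add its name to the parents list.
    • Use recursion to get the value paths of the nested items.
    • Add those values to the result, names.
  3. If the column does not contain nested "columns": combine its name with the parents list and join the resulting list with the . separator.
  4. Return names.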
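
import yaml

def test(config, parents=[]):
    # parents holds the names of all enclosing columns. It is never mutated
    # in place (a copy is made below), so the shared default list is safe here.
    names = []
    for column in config:
        if column.get("dtype") in ["array", "struct"] and "columns" in column:
            # Nested column: recurse with this column's name added to a copy.
            cur_parents = parents.copy()
            cur_parents.append(column["name"])
            children = test(column["columns"], cur_parents)
            names.extend(children)
        else:
            # Leaf column: join the parent names and its own name with dots.
            value = column["name"]
            value_path = parents + [value]
            names.append(".".join(value_path))

    return names

with open("input.yaml", "r") as inp:
    yaml_conf = yaml.safe_load(inp)

values = test(yaml_conf.get("columns"))

print("[\n{}\n]".format(",\n".join(values)))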
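
To check the approach without an input.yaml file, here is a minimal self-contained sketch of the same recursion. The function name flatten_columns and the tuple default for parents are my own choices, not part of the answer above. The config dict is written by hand to mirror what yaml.safe_load would be expected to return for the question's input.

```python
def flatten_columns(columns, parents=()):
    """Return a dot-separated path for every leaf column (hypothetical helper)."""
    names = []
    for column in columns:
        # Path of this column: all enclosing names plus its own name.
        path = (*parents, column["name"])
        if column.get("dtype") in ("array", "struct") and "columns" in column:
            # Nested column: collect the paths of its children recursively.
            names.extend(flatten_columns(column["columns"], path))
        else:
            # Leaf column: emit the joined path.
            names.append(".".join(path))
    return names


# Hand-written equivalent of the parsed YAML input from the question.
config = {
    "columns": [
        {"name": "PartitionDate"},
        {"name": "TransactionID"},
        {"name": "EventTimestamp"},
        {
            "name": "ControlTransaction",
            "dtype": "struct",
            "columns": [
                {"name": "StoreID"},
                {"name": "WorkstationID"},
                {
                    "name": "Transaction",
                    "dtype": "struct",
                    "columns": [{"name": "TransactionID"}],
                },
                {"name": "TransactionNumber"},
            ],
        },
        {"name": "ControlType"},
        {
            "name": "RetailTransaction",
            "dtype": "struct",
            "columns": [{"name": "StoreID"}, {"name": "WorkstationID"}],
        },
    ]
}

paths = flatten_columns(config["columns"])
print("[\n{}\n]".format(",\n".join(paths)))
```

Using an immutable tuple as the default sidesteps the usual concern about mutable default arguments, at the cost of building a new tuple per column; for configs of this size that overhead should be negligible.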