Generating hierarchical data from a pandas DataFrame converted to a list

Question (votes: 0, answers: 1)

I have data in this form:

data = [
    [2019, "July", 8, '1.2.0', 7.0, None, None],
    [2019, "July", 10, '1.2.0', 52.0, "Breaking", 6.0, 'Path Removed w/o Deprecation'],
    [2019, "July", 15, "0.1.0", 210.0, "Breaking", 57.0, 'Request Parameter Removed'],
    [2019, 'August', 20, '2.0.0', 100.0, "Breaking", None, None],
    [2019, 'August', 25, '2.0.0', 200.0, 'Non-breaking', None, None],
]

The values in each row follow this hierarchical order:

Year, Month, Day, info_version, API_changes, type1, count, content
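As a rough sketch, the column order can be written down once and used to check each row before any tree is built. The names `COLUMNS` and `row_to_record` are hypothetical helpers introduced here for illustration only. Note that a row with fewer than eight values (the first row above has seven) cannot be matched unambiguously to the eight columns, so this sketch rejects it rather than guessing.

```python
from typing import Any, Dict, List

# Column order as stated in the question.
COLUMNS = ("Year", "Month", "Day", "info_version",
           "API_changes", "type1", "count", "content")


def row_to_record(row: List[Any]) -> Dict[str, Any]:
    """Pair each value with its column name; reject rows of the wrong length."""
    if len(row) != len(COLUMNS):
        raise ValueError(
            f"expected {len(COLUMNS)} values, got {len(row)}: {row!r}"
        )
    return dict(zip(COLUMNS, row))


record = row_to_record(
    [2019, "July", 10, "1.2.0", 52.0, "Breaking", 6.0,
     "Path Removed w/o Deprecation"]
)
print(record["Month"], record["info_version"], record["count"])
```

A check like this makes a short row fail early with a clear message instead of failing later during tuple unpacking.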

I want to generate this kind of hierarchical tree structure from the data:

{
  "name": "2019", # this is the year
  "children": [
    {
      "name": "July", # this is month
      "children": [
        {
          "name": "10",   # this is the day
          "children": [
            {
              "name": "1.2.0",   # this is info_version
              "value": 52,       # this is the value of API_changes (always a number)
              "children": [
                {
                  "name": "Breaking",   # this is the type1 column (a string, either NaN or Breaking)
                  "value": 6,           # this is the value of count
                  "children": [
                    {
                      "name": "Path Removed w/o Deprecation",   # this is the content column
                      "value": 6                                # this is the value of count
                    }
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

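It continues in the same format for all the other months. I don't want to modify my data in any way; this is the structure my use case (plotting) requires. I'm not sure how to achieve this, and any suggestions would be greatly appreciated.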

For reference, this is the format used by the Sunburst chart in pyecharts.
python hierarchy hierarchical
1 answer
0 votes

First you need to build a nested dictionary from all the distinct keys you have, and then build your structure recursively from it.
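As a small illustration of the first step on its own, a self-referencing `defaultdict` creates each missing level on first access. This is only a sketch with made-up keys and placeholder leaves; `nested` and `to_plain` are hypothetical names used for the demonstration.

```python
from collections import defaultdict


def nested():
    """Return a dict whose missing keys are filled with further nested dicts."""
    return defaultdict(nested)


tree = nested()
# Intermediate levels are created on first access; only the last one is assigned.
tree[2019]["July"][10] = "leaf"
tree[2019]["August"][20] = "leaf"


def to_plain(node):
    """Recursively convert nested defaultdicts into ordinary dicts."""
    if isinstance(node, defaultdict):
        return {key: to_plain(value) for key, value in node.items()}
    return node


print(to_plain(tree))
# {2019: {'July': {10: 'leaf'}, 'August': {20: 'leaf'}}}
```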

from collections import defaultdict

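def to_keys(values):
    # A tuple key carries (name, value); any other key is just the name.
    if isinstance(values, tuple):
        return {"name": values[0], "value": values[1]}
    return {"name": values}

def to_children(values):
    # A list holds the leaf entries: convert each item in turn.
    if isinstance(values, list):
        return [to_children(item) for item in values]
    # A tuple is a single leaf: (content, count).
    if isinstance(values, tuple):
        return to_keys(values)
    # A dict is one tree level: each key becomes a node, its value the node's children.
    if isinstance(values, dict):
        return [{**to_keys(key), "children": to_children(value)}
                for key, value in values.items()]
    raise TypeError(f"invalid type: {type(values).__name__}")

# Self-referencing factory: every missing key yields another nested defaultdict.
gen = lambda: defaultdict(gen)
result = defaultdict(gen)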

data = [
    [2019, "July", 10, '1.2.0', 52.0, 'Breaking', 6, None],
    [2019, "July", 10, '1.2.0', 52.0, "Breaking", 6.0, 'Path Removed w/o Deprecation'],
    [2019, "July", 15, "0.1.0", 210.0, "Breaking", 57.0, 'Request Parameter Removed'],
    [2019, 'August', 20, '2.0.0', 100.0, "Breaking", None, None],
    [2019, 'August', 25, '2.0.0', 200.0, 'Non-breaking', None, None],
]

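# Group the rows level by level: year -> month -> day -> (info_version, api_changes)
# -> (type1, count) -> list of (content, count) leaves.
for year, month, day, info_version, api_changes, type1, count, content in data:
    result[year][month][day][(info_version, api_changes)].setdefault((type1, count), []).append((content, count))

# The result is a list with one node per year.
final_result = to_children(result)
print(final_result)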
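The list printed above keeps the year and day names as integers and still contains entries built from `None` values, whereas the structure requested in the question uses string names. One possible clean-up pass is sketched below. `clean_nodes` is a hypothetical helper, and the choice to drop `None` names, `None` values, and empty `children` lists is an assumption about what the chart should display, not something stated in the question.

```python
import json


def clean_nodes(nodes):
    """Return a copy of a list of tree nodes with placeholder entries removed."""
    cleaned = []
    for node in nodes:
        if node["name"] is None:
            continue  # skip nodes created from a missing column value
        new_node = {"name": str(node["name"])}  # the requested output uses string names
        if node.get("value") is not None:
            new_node["value"] = node["value"]
        children = clean_nodes(node.get("children", []))
        if children:
            new_node["children"] = children
        cleaned.append(new_node)
    return cleaned


# Small hand-written input in the same shape that to_children() returns.
sample = [
    {"name": 2019, "children": [
        {"name": "August", "children": [
            {"name": 20, "children": [
                {"name": "2.0.0", "value": 100.0, "children": [
                    {"name": "Breaking", "value": None, "children": [
                        {"name": None, "value": None},
                    ]},
                ]},
            ]},
        ]},
    ]},
]

print(json.dumps(clean_nodes(sample), indent=2))
```

Calling `clean_nodes(final_result)` would apply the same pass to the full tree. Whether empty branches should be removed or kept as zero-valued nodes depends on how the chart is meant to look.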
© www.soinside.com 2019 - 2024. All rights reserved.