How do I export a weighted edge list to a JSON tree?


Given the following pandas DataFrame (the original DataFrame has more than 200 rows):

import pandas as pd
df = pd.DataFrame({
    'child': ['Europe', 'France', 'Paris', 'North America', 'US', 'Canada'],
    'parent': ['', 'Europe', 'France', '', 'North America', 'North America'],
    'value': [746.4, 67.75, 2.16, 579, 331.9, 38.25]
})

df

|---+---------------+---------------+--------|
|   | child         | parent        |  value |
|---+---------------+---------------+--------|
| 0 | Europe        |               | 746.40 |
| 1 | France        | Europe        |  67.75 |
| 2 | Paris         | France        |   2.16 |
| 3 | North America |               | 579.00 |
| 4 | US            | North America | 331.90 |
| 5 | Canada        | North America |  38.25 |
|---+---------------+---------------+--------|

I want to generate the following JSON tree:

  [
      {
      name: 'Europe',
      value: 746.4,
      children: [
          {
          name: 'France',
          value: 67.75,
          children: [
              {
              name: 'Paris',
              value: 2.16
              }
          ]
          }
      ]
      },
      {
      name: 'North America',
      value: 579,
      children: [
          {
          name: 'US',
          value: 331.9,
          },
          {
          name: 'Canada',
          value: 38.25
          }
      ]
      }
  ];

The tree will be used as input for an ECharts visualization, for example this basic sunburst chart.

json pandas networkx echarts edge-list
2 Answers
1 vote

You can use the networkx package for this. First, convert the DataFrame to a graph:

import networkx as nx

G = nx.from_pandas_edgelist(df, source='parent', target='child', edge_attr='value', create_using=nx.DiGraph)
nx.draw(G, with_labels=True)

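This produces a weighted directed graph in which each edge from parent to child carries that row's value.

Next, export the graph as a JSON-style tree: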

from networkx.readwrite import json_graph

data = json_graph.tree_data(G, root='')
data = data['children']  # remove the root

The result looks like this:

[{'id': 'Europe',
  'children': [{'id': 'France', 'children': [{'id': 'Paris'}]}]},
 {'id': 'North America', 'children': [{'id': 'US'}, {'id': 'Canada'}]}]
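
tree_data expects a directed tree under the chosen root, so with 200+ rows it may be worth checking the edge list before building the graph. The following is a minimal sketch using only the standard library, not part of the original answer. It assumes the rows are available as (child, parent) pairs and that the empty string marks top-level nodes; the function name check_edge_list is hypothetical.

```python
def check_edge_list(pairs, root=""):
    """Return a list of problems; an empty list means the pairs form a tree under root."""
    problems = []
    parent_of = {}
    # Each child may appear only once, otherwise it would have two parents
    for child, parent in pairs:
        if child in parent_of:
            problems.append(f"{child!r} has more than one parent")
        parent_of[child] = parent
    # Every non-root parent must itself be listed as a child somewhere
    for child, parent in parent_of.items():
        if parent != root and parent not in parent_of:
            problems.append(f"parent {parent!r} of {child!r} never appears as a child")
    # Walk upward from each node; revisiting a node means there is a cycle
    for start in parent_of:
        seen = set()
        node = start
        while node != root and node in parent_of:
            if node in seen:
                problems.append(f"cycle involving {start!r}")
                break
            seen.add(node)
            node = parent_of[node]
    return problems


# The (child, parent) pairs from the example DataFrame
pairs = [
    ("Europe", ""), ("France", "Europe"), ("Paris", "France"),
    ("North America", ""), ("US", "North America"), ("Canada", "North America"),
]
problems = check_edge_list(pairs)
```

With the DataFrame from the question, the pairs could be produced with zip(df["child"], df["parent"]). An empty problems list suggests the data is safe to pass on to the tree export.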

Finally, post-process the JSON data by adding the values back and renaming 'id' to 'name'. There may be a better way to do this, but the following works.

edge_values = nx.get_edge_attributes(G,'value')

def post_process_json(data, parent=''):
    # Rename 'id' to 'name' and restore the value stored on the incoming edge
    data['name'] = data.pop('id')
    data['value'] = edge_values[(parent, data['name'])]
    # Recurse into the children, passing this node's name as their parent
    if 'children' in data:
        data['children'] = [post_process_json(child, parent=data['name']) for child in data['children']]
    return data

data = [post_process_json(d) for d in data]

Final result:

[{'children': [{'children': [{'name': 'Paris', 'value': 2.16}],
    'name': 'France',
    'value': 67.75}],
  'name': 'Europe',
  'value': 746.4},
 {'children': [{'name': 'US', 'value': 331.9},
   {'name': 'Canada', 'value': 38.25}],
  'name': 'North America',
  'value': 579.0}]
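
Since the answer leaves open whether there is a better way, here is one possible variation, offered as a sketch rather than a definitive improvement. In a tree every node has exactly one parent, so values can be looked up by node name instead of by (parent, child) edge, and a new tree can be built without modifying the tree_data output in place. The helper name with_values is hypothetical, and the inputs are written out literally from the example so the block runs on its own.

```python
# Shape of the tree_data output after dropping the root, copied from the example
tree = [
    {"id": "Europe", "children": [{"id": "France", "children": [{"id": "Paris"}]}]},
    {"id": "North America", "children": [{"id": "US"}, {"id": "Canada"}]},
]

# Values keyed by node name; with the DataFrame this could be
# dict(zip(df["child"], df["value"]))
values = {
    "Europe": 746.4, "France": 67.75, "Paris": 2.16,
    "North America": 579.0, "US": 331.9, "Canada": 38.25,
}


def with_values(node, values):
    """Return a new dict with 'name', 'value' and optional 'children'; node is left unchanged."""
    out = {"name": node["id"], "value": values[node["id"]]}
    if node.get("children"):
        out["children"] = [with_values(child, values) for child in node["children"]]
    return out


data = [with_values(node, values) for node in tree]
```

Because the input is not mutated, the function can be called again on the same tree, which the in-place version cannot do once 'id' has been popped.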

0 votes

You can first create the individual nodes as { name, value } dictionaries, keyed by name. Then wire them together:

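result = []
# The empty-string key is a virtual root whose children are the top-level nodes
d = { "": { "children": result } }
# First pass: create one node per row, keyed by name
for child, value in zip(df["child"], df["value"]):
    d[child] = { "name": child, "value": value }
# Second pass: attach each node to its parent's children list
for child, parent in zip(df["child"], df["parent"]):
    if "children" not in d[parent]:
        d[parent]["children"] = []
    d[parent]["children"].append(d[child])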

For our example, result will be:

[{
    'name': 'Europe', 
    'value': 746.4, 
    'children': [{
        'name': 'France', 
        'value': 67.75, 
        'children': [{
            'name': 'Paris', 
            'value': 2.16
        }]
    }]
}, {
    'name': 'North America', 
    'value': 579.0, 
    'children': [{
        'name': 'US', 
        'value': 331.9
    }, {
        'name': 'Canada', 
        'value': 38.25
    }]
}]
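
The result above is still a Python list of dictionaries, while the question asks for JSON. A minimal sketch of the last step, using the standard library json module, might look like the following. The result list is written out literally so the block runs on its own, and the choice of ensure_ascii=False and indent=2 is an assumption about what is convenient, not something the original answers specify.

```python
import json

# The result from the example, written out literally
result = [
    {"name": "Europe", "value": 746.4, "children": [
        {"name": "France", "value": 67.75, "children": [
            {"name": "Paris", "value": 2.16}]}]},
    {"name": "North America", "value": 579.0, "children": [
        {"name": "US", "value": 331.9},
        {"name": "Canada", "value": 38.25}]},
]

# ensure_ascii=False keeps any non-ASCII names readable in the output
text = json.dumps(result, ensure_ascii=False, indent=2)

# Round-trip to confirm nothing was lost in serialization
restored = json.loads(text)
```

The resulting string can be written to a file or embedded in the page, and should then be usable as the data of the sunburst series.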