Flattening JSON: merge similar columns instead of creating multiple columns with similar names


When I flatten the JSON, I get keys and values like these:

{'0_extension_0_url': 'http://hl7.org/fhir/StructureDefinition/geolocation', '0_extension_0_extension_0_url': 'latitude', '0_extension_0_extension_0_valueDecimal': 42.06768934464684, '0_extension_0_extension_1_url': 'longitude', '0_extension_0_extension_1_valueDecimal': -71.17560251863814, '0_line_0': 'ratna', '0_city': 'Sharon', '0_state': 'MA', '0_postalCode': '02067', '0_country': 'US'}

Desired output in CSV format:

extension_url, extension_extension_url, extension_extension_valueDecimal_latitude, extension_extension_valueDecimal_longitude, line, city, state, postalCode, country
http://hl7.org/fhir/StructureDefinition/geolocation,42.06768934464684, -71.17560251863814, ratna, Sharon, MA, 02067, US

The JSON below is an excerpt of the full JSON data:

      "address": [ {
        "extension": [ {
          "url": "http://hl7.org/fhir/StructureDefinition/geolocation",
          "extension": [ {
            "url": "latitude",
            "valueDecimal": 42.06768934464684
          }, {
            "url": "longitude",
            "valueDecimal": -71.17560251863814
          } ]
        } ],
        "line": [ "350 Frami Trafficway" ],
        "city": "Sharon",
        "state": "MA",
        "postalCode": "02067",
        "country": "US"
      } ],

Python code:

def flatten_json(y):
    out = {}
    
    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            
            out[name[:-1]] = x

    flatten(y)
    return out

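I want to create a single field name for each field, without the JSON position indexes in the key, and group the values (if there are several) under that field name so the data comes out in tabular form.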

python json json-flattener
1 Answer
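  • You can remove the index code and add logic that breaks up lists; the details depend on what you actually need.
  • Sample code based on the desired output: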
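def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + '_')
        elif isinstance(x, list):
            if len(x) > 1:
                # Assumes every item is a dict with at least two values, such as
                # {"url": "latitude", "valueDecimal": 42.06}: the first value
                # becomes the column suffix and the second becomes the cell value.
                for item in x:
                    values = list(item.values())
                    out[name + values[0]] = values[1]
            else:
                # Single-element list: recurse without adding a positional index.
                for item in x:
                    flatten(item, name)
        else:
            # Scalar: strip the trailing '_' from the accumulated key.
            out[name[:-1]] = x

    flatten(y)
    return out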

import json
import csv

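# Load the JSON document and flatten it into a single dict.
with open('test.json') as f:
    data = flatten_json(json.load(f))
print(data)

# newline='' keeps the csv module from writing blank rows on Windows.
with open('mycsvfile.csv', 'w', newline='') as f:
    w = csv.DictWriter(f, data.keys())
    w.writeheader()
    w.writerow(data)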
  • Result:
{'address_extension_url': 'http://hl7.org/fhir/StructureDefinition/geolocation', 'address_extension_extension_latitude': 42.06768934464684, 'address_extension_extension_longitude': -71.17560251863814, 'address_line': '350 Frami Trafficway', 'address_city': 'Sharon', 'address_state': 'MA', 'address_postalCode': '02067', 'address_country': 'US'}
  • CSV result:
address_extension_url,address_extension_extension_latitude,address_extension_extension_longitude,address_line,address_city,address_state,address_postalCode,address_country
http://hl7.org/fhir/StructureDefinition/geolocation,42.06768934464684,-71.17560251863814,350 Frami Trafficway,Sharon,MA,02067,US
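  • One caveat: the multi-item list branch above expects every list item to be a dict with at least two values, so a list such as "line": ["a", "b"] would raise an error. If you would rather group repeated values under one field name, as the question asks, something like the sketch below may work. It is a different technique from the code above: it does not pivot url/valueDecimal pairs into separate columns, it only drops the list indexes and joins the values that collide on the same key. The function name flatten_grouped and the "; " joiner are my own choices for illustration, not part of any library.

```python
def flatten_grouped(obj, sep="_", joiner="; "):
    """Flatten nested JSON without list indexes, grouping repeated values.

    Hypothetical helper: values that end up under the same key are joined
    into one string; a key with a single value keeps its original type.
    """
    collected = {}

    def walk(node, prefix=""):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{prefix}{sep}{key}" if prefix else key)
        elif isinstance(node, list):
            # Lists add nothing to the key, so sibling items share one name.
            for item in node:
                walk(item, prefix)
        else:
            collected.setdefault(prefix, []).append(node)

    walk(obj)
    return {
        key: values[0] if len(values) == 1 else joiner.join(map(str, values))
        for key, values in collected.items()
    }


if __name__ == "__main__":
    # Made-up sample shaped like the question's data.
    sample = {
        "address": [{
            "extension": [{
                "url": "http://hl7.org/fhir/StructureDefinition/geolocation",
                "extension": [
                    {"url": "latitude", "valueDecimal": 1.5},
                    {"url": "longitude", "valueDecimal": -2.5},
                ],
            }],
            "line": ["350 Frami Trafficway", "Apt 2"],
            "city": "Sharon",
        }]
    }
    for key, value in flatten_grouped(sample).items():
        print(key, "=", value)
```

The returned dict can be passed to csv.DictWriter exactly like the output of flatten_json. Whether joining values into one cell is acceptable depends on how the CSV will be consumed.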