Using .values() with a list of dictionaries?


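I am comparing JSON files from two different API endpoints to see which JSON records need to be updated, which need to be created, and which need to be deleted. So, by comparing the two JSON files, I want to end up with three JSON files, one for each operation.

The JSON from both endpoints is structured like this (though they use different keys for the same set of values; that is a separate problem):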

{
    "records": [{
        "id": "id-value-here",
        "c": {
            "d": "eee"
        },
        "f": {
            "l": "last",
            "f": "first"
        },
        "g": ["100", "89", "9831", "09112", "800"]
    }, {

        …


    }]
}

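So the JSON is represented as a list of dictionaries (with further nested lists and dictionaries).

If an id value ("id":) from one endpoint's JSON (j1) exists in the other endpoint's JSON (j2), then that record should be added to j_update.

So far I have something like the following, but I can see that .values() doesn't work, because it operates on the list rather than on each of the dictionaries in the list (?):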

j_update = {r for r in j1['records'] if r['id'] in 
j2.values()}

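This does not raise an error, but with the test JSON files it produces an empty set.

It seems like this should be simple, but I think I am getting tripped up by the nesting: the dictionaries sit inside the list that represents the JSON. Do I need to flatten j2, or does Python have a simpler dictionary method to accomplish this?

==== Edit ==== j1 and j2 have the same structure but use different keys; toy data: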

J1

{
    "records": [{
        "field_5": 2329309841,
        "field_12": {
            "email": "[email protected]"
        },
        "field_20": {
            "last": "Mixalona",
            "first": "Clara"
        },
        "field_28": ["9002329309999", "9002329309112"],
        "field_44": ["1002329309832"]
    }, {
        "field_5": 2329309831,
        "field_12": {
            "email": "[email protected]"
        },
        "field_20": {
            "last": "Herbitz",
            "first": "Michael"
        },
        "field_28": ["9002329309831", "9002329309112", "8002329309999"],
        "field_44": ["1002329309832"]
    }, {
        "field_5": 2329309855,
        "field_12": {
            "email": "[email protected]"
        },
        "field_20": {
            "first": "Noriss",
            "last": "Katamaran"
        },
        "field_28": ["9002329309111", "8002329309112"],
        "field_44": ["1002329309877"]
    }]
}

J2

{
    "records": [{
        "id": 2329309831,
        "email": {
            "email": "[email protected]"
        },
        "name_primary": {
            "last": "Herbitz",
            "first": "Michael"
        },
        "assign": ["8003329309831", "8007329309789"],
        "hr_id": ["1002329309877"]
    }, {
        "id": 2329309884,
        "email": {
            "email": "[email protected]"
        },
        "name_primary": {
            "last": "Lee Shu",
            "first": "Yin"
        },
        "assign": ["8002329309111", "9003329309831", "9002329309111", "8002329309999", "8002329309112"],
        "hr_id": ["1002329309832"]
    }, {
        "id": 23293098338,
        "email": {
            "email": "[email protected]"
        },
        "name_primary": {
            "last": "Maxwell Louis",
            "first": "Albert"
        },
        "assign": ["8002329309111", "8007329309789", "9003329309831", "8002329309999", "8002329309112"],
        "hr_id": ["1002329309877"]
    }]
}
python json list-comprehension dictionary-comprehension
2 Answers
0
votes

When you read the JSON, you get a dict. You are looking for a particular key inside a list of values.

if 'records' in j2:
  r = j2['records'][0].get('id', []) # defaults if id does not exist

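A recursive search would be more elegant, but I don't know enough about how your data is organized to offer a quick solution.

To give an idea of a recursive search, consider this example: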

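def recursiveSearch(dictionary, target):
    """Return the value stored under `target` at any nesting depth, or None."""
    if target in dictionary:
        return dictionary[target]
    for key, value in dictionary.items():
        if isinstance(value, dict):
            # Use a separate name so `target` is not overwritten by the result.
            found = recursiveSearch(value, target)
            if found is not None:
                return found
    return None


a = {'test' : 'b', 'test1' : dict(x = dict(z = 3), y = 2)}

print(recursiveSearch(a, 'z'))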
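
The example above only descends into nested dicts, whereas the records in the question sit inside a list. A minimal sketch of a variant that also walks lists and collects every match might look like the following; the name find_all and the sample data are hypothetical, chosen only for illustration.

```python
def find_all(node, target):
    """Yield every value stored under the key `target`, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == target:
                yield value
            # Keep descending: the value itself may contain more matches.
            yield from find_all(value, target)
    elif isinstance(node, list):
        for item in node:
            yield from find_all(item, target)


# Hypothetical sample shaped like the question's data.
sample = {"records": [{"id": 1, "f": {"l": "last"}}, {"id": 2, "g": ["100"]}]}
ids = list(find_all(sample, "id"))
print(ids)  # [1, 2]
```

Collecting all matches with a generator, rather than returning the first one, is what makes this usable for gathering every record id in one pass.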

0
votes

You tried:

j_update = {r for r in j1['records'] if r['id'] in j2.values()}

Apart from the r['id'] versus r['field_5'] issue, you also have:

>>> list(j2.values())
[[{'id': 2329309831, ...}, ...]]

The id is buried inside a list and then a dictionary, so the test r['id'] in j2.values() always returns False.
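
A small sketch of why the membership test fails, using a shortened, hypothetical stand-in for j2 that keeps only the id fields:

```python
# Toy data shaped like j2 in the question (records shortened to their ids).
j2 = {"records": [{"id": 2329309831}, {"id": 2329309884}]}

values = list(j2.values())
# j2 has a single key, so `values` holds exactly one element: the records list.
print(len(values))  # 1

# An int is compared against that one list, and an int never equals a list.
direct_hit = 2329309831 in values
print(direct_hit)  # False

# Reaching the ids means going through the list and into each record dict.
nested_hit = any(record["id"] == 2329309831 for record in j2["records"])
print(nested_hit)  # True
```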

The basic solution is quite simple. First, build a set of the j2 ids:

>>> present_in_j2 = set(record["id"] for record in j2["records"])

Then rebuild the JSON structure of j1, keeping only the records whose field_5 is present in j2:

>>> {"records":[record for record in j1["records"] if record["field_5"] in present_in_j2]}
{'records': [{'field_5': 2329309831, 'field_12': {'email': '[email protected]'}, 'field_20': {'last': 'Herbitz', 'first': 'Michael'}, 'field_28': ['9002329309831', '9002329309112', '8002329309999'], 'field_44': ['1002329309832']}]}

It works, but because of j1's odd keys it is not entirely satisfying. Let's try converting j1 to a friendlier format:

def map_keys(json_value, conversion_table):
    """Map the keys of a json value
    This is a recursive DFS"""

    def map_keys_aux(json_value):
        """Capture the conversion table"""
        if isinstance(json_value, list):
            return [map_keys_aux(v) for v in json_value]
        elif isinstance(json_value, dict):
            return {conversion_table.get(k, k):map_keys_aux(v) for k,v in json_value.items()}
        else:
            return json_value

    return map_keys_aux(json_value)

The function focuses on dictionary keys: conversion_table.get(k, k) is conversion_table[k] if the key is present in the conversion table, and the key itself otherwise.
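
As a quick illustration of that fallback, here is dict.get with the key passed as its own default; the one-entry table below is a hypothetical cut-down version of the real conversion table.

```python
conversion_table = {"field_5": "id"}  # hypothetical one-entry table

# Key is in the table: the mapped name comes back.
renamed = conversion_table.get("field_5", "field_5")
# Key is not in the table: the default (the key itself) comes back.
unchanged = conversion_table.get("last", "last")
print(renamed, unchanged)  # id last
```

This is why nested keys such as "last" and "first" pass through map_keys untouched.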

>>> j1toj2 = {"field_5":"id", "field_12":"email", "field_20":"name_primary", "field_28":"assign", "field_44":"hr_id"}
>>> mapped_j1 = map_keys(j1, j1toj2)

Now the code is clearer, and the output is probably more useful for a PUT:

>>> d1 = {record["id"]:record for record in mapped_j1["records"]}
>>> present_in_j2 = set(record["id"] for record in j2["records"])
>>> {"records":[record for record in mapped_j1["records"] if record["id"] in present_in_j2]}
{'records': [{'id': 2329309831, 'email': {'email': '[email protected]'}, 'name_primary': {'last': 'Herbitz', 'first': 'Michael'}, 'assign': ['9002329309831', '9002329309112', '8002329309999'], 'hr_id': ['1002329309832']}]}
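
The question also asks for the records to create and to delete. A sketch using set operations on the ids could cover all three cases at once. It assumes, which the question does not state, that j1 is the source and j2 the target, so ids found only in j1 are created and ids found only in j2 are deleted. The function name split_records is hypothetical, and the toy records are reduced to their ids.

```python
def split_records(source, target, key="id"):
    """Partition records into update/create/delete lists by comparing ids."""
    source_by_id = {record[key]: record for record in source["records"]}
    target_by_id = {record[key]: record for record in target["records"]}
    source_ids, target_ids = set(source_by_id), set(target_by_id)

    def pick(by_id, ids):
        # Sort the ids so the output order is deterministic.
        return {"records": [by_id[i] for i in sorted(ids)]}

    return {
        "update": pick(source_by_id, source_ids & target_ids),  # in both
        "create": pick(source_by_id, source_ids - target_ids),  # only in source
        "delete": pick(target_by_id, target_ids - source_ids),  # only in target
    }


# Toy data reduced to the ids from the question; j1 is assumed already mapped.
mapped_j1 = {"records": [{"id": 2329309841}, {"id": 2329309831}, {"id": 2329309855}]}
j2 = {"records": [{"id": 2329309831}, {"id": 2329309884}, {"id": 23293098338}]}

result = split_records(mapped_j1, j2)
print([r["id"] for r in result["update"]["records"]])  # [2329309831]
```

Each of the three values has the same {"records": [...]} shape as the inputs, so each could be written out with json.dump as its own file.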