I have two dictionaries that I want to merge:
a = {"name": "john",
     "phone": "123123123",
     "owns": {"cars": "Car 1", "motorbikes": "Motorbike 1"}}
b = {"name": "john",
     "phone": "123",
     "owns": {"cars": "Car 2"}}
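If a and b have a common key at the same nesting level, the result should be a list containing both values, assigned as the value of the shared key.
The result should look like this: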
{"name": "john",
 "phone": ["123123123", "123"],
 "owns": {"cars": ["Car 1", "Car 2"], "motorbikes": "Motorbike 1"}}
Using a.update(b) does not work, because it overwrites the shared values of a with the shared values of b, so the result looks like this:
{'name': 'john', 'phone': '123', 'owns': {'cars': 'Car 2'}}
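For illustration, here is a minimal sketch of that behaviour. It works on a deep copy so that a itself stays untouched; the name overwritten is only an assumption made for this example. Because update() is shallow, the whole nested "owns" dictionary is replaced, which is why "motorbikes" disappears.

```python
import copy

a = {"name": "john", "phone": "123123123",
     "owns": {"cars": "Car 1", "motorbikes": "Motorbike 1"}}
b = {"name": "john", "phone": "123", "owns": {"cars": "Car 2"}}

# Work on a copy so the original `a` is not modified
overwritten = copy.deepcopy(a)
overwritten.update(b)

print(overwritten)
# {'name': 'john', 'phone': '123', 'owns': {'cars': 'Car 2'}}
```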
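The goal is to merge the dictionaries without overwriting anything, keeping all the information associated with a given key (in either dictionary).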
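With recursion, you can build a dictionary comprehension that accomplishes this.
This solution also takes into account that you might later want to merge more than two dictionaries, and flattens the lists of values in that case.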
def update_merge(d1, d2):
    if isinstance(d1, dict) and isinstance(d2, dict):
        # Unwrap d1 and d2 in a new dictionary to keep non-shared keys with **d1, **d2
        # Next unwrap a dict that treats shared keys
        # If two keys have an equal value, we take that value as the new value
        # If the values are not equal, we recursively merge them
        return {
            **d1, **d2,
            **{k: d1[k] if d1[k] == d2[k] else update_merge(d1[k], d2[k])
               for k in {*d1} & {*d2}}
        }
    else:
        # This case happens when values are merged
        # It bundles values in a list, making sure
        # to flatten them if they are already lists
        return [
            *(d1 if isinstance(d1, list) else [d1]),
            *(d2 if isinstance(d2, list) else [d2])
        ]
Example:
a = {"name": "john", "phone":"123123123",
"owns": {"cars": "Car 1", "motorbikes": "Motorbike 1"}}
b = {"name": "john", "phone":"123", "owns": {"cars": "Car 2"}}
update_merge(a, b)
# {'name': 'john',
# 'phone': ['123123123', '123'],
# 'owns': {'cars': ['Car 1', 'Car 2'], 'motorbikes': 'Motorbike 1'}}
Example merging more than two objects:
a = {"name": "john"}
b = {"name": "jack"}
c = {"name": "joe"}
d = update_merge(a, b)
d = update_merge(d, c)
d # {'name': ['john', 'jack', 'joe']}
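The chained calls above can also be folded with functools.reduce. The sketch below repeats update_merge so that it runs on its own; merge_all is a hypothetical helper name, not part of the answer. Note that reduce raises a TypeError when called with no dictionaries at all.

```python
from functools import reduce

def update_merge(d1, d2):
    # Same function as above, repeated so this block is self-contained
    if isinstance(d1, dict) and isinstance(d2, dict):
        return {
            **d1, **d2,
            **{k: d1[k] if d1[k] == d2[k] else update_merge(d1[k], d2[k])
               for k in {*d1} & {*d2}}
        }
    return [
        *(d1 if isinstance(d1, list) else [d1]),
        *(d2 if isinstance(d2, list) else [d2])
    ]

def merge_all(*dicts):
    # Hypothetical helper: apply update_merge pairwise, left to right
    return reduce(update_merge, dicts)

merged = merge_all({"name": "john"}, {"name": "jack"}, {"name": "joe"})
print(merged)  # {'name': ['john', 'jack', 'joe']}
```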
You can use itertools.groupby with recursion:
import itertools
a = {"name": "john", "phone":"123123123", "owns": {"cars": "Car 1", "motorbikes": "Motorbike 1"}}
b = {"name": "john", "phone":"123", "owns": {"cars": "Car 2"}}
def condense(r):
    # Collapse a list of identical values into that single value
    return r[0] if len(set(r)) == 1 else r
def update_dict(c, d):
    # Group the items of both dicts by key; groupby only groups adjacent items, so sort first
    _v = {j: [v for _, v in h] for j, h in itertools.groupby(sorted(list(c.items()) + list(d.items()), key=lambda x: x[0]), key=lambda x: x[0])}
    # Recurse when every grouped value is a dict, otherwise condense the values
    return {j: update_dict(*e) if all(isinstance(i, dict) for i in e) else condense(e) for j, e in _v.items()}
print(update_dict(a, b))
Output:
{'name': 'john', 'owns': {'cars': ['Car 1', 'Car 2'], 'motorbikes': 'Motorbike 1'}, 'phone': ['123123123', '123']}
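The sort inside update_dict matters because itertools.groupby only groups adjacent items with equal keys. A small sketch of the difference, using made-up pairs (the variable names are assumptions for illustration only):

```python
import itertools

pairs = [("name", "john"), ("phone", "123123123"),
         ("name", "john"), ("phone", "123")]

def first(pair):
    # Key function: group by the dictionary key
    return pair[0]

# Without sorting, non-adjacent equal keys land in separate groups
unsorted_groups = [(k, [v for _, v in g])
                   for k, g in itertools.groupby(pairs, key=first)]

# sorted() is stable, so values keep their original relative order per key
sorted_groups = [(k, [v for _, v in g])
                 for k, g in itertools.groupby(sorted(pairs, key=first), key=first)]

print(unsorted_groups)
# [('name', ['john']), ('phone', ['123123123']), ('name', ['john']), ('phone', ['123'])]
print(sorted_groups)
# [('name', ['john', 'john']), ('phone', ['123123123', '123'])]
```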
Using sets, you can also merge any number of dictionaries:
from functools import reduce
import operator
# Usage: merge(a, b, ...)
def merge(*args):
    # Make a copy of the input dicts, can be removed if you don't care about modifying
    # the original dicts.
    args = list(map(dict.copy, args))
    # Dict to store the result.
    out = {}
    for k in reduce(operator.and_, map(dict.keys, args)):  # Python 3 only, see footnotes.
        # Use `.pop()` so that after all elements of shared keys have been combined,
        # `args` becomes a list of disjoint dicts that we can merge easily.
        vs = [d.pop(k) for d in args]
        if isinstance(vs[0], dict):
            # Recursively merge nested dicts
            common = merge(*vs)
        else:
            # Use a set to collect unique values
            common = set(vs)
            # If only one unique value, store that as is, otherwise use a list
            common = next(iter(common)) if len(common) == 1 else list(common)
        out[k] = common
    # Merge into `out` the rest of the now disjoint dicts
    for arg in args:
        out.update(arg)
    return out
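This assumes that every dictionary to be merged has the same "structure", so "owns" cannot be a list in a and a dict in b. Every element of the dictionaries also needs to be hashable, because this method uses sets to aggregate unique values.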
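To make the hashability requirement concrete: building a set from list values raises a TypeError. If unhashable values have to be supported, one possible workaround (an assumption, not part of this answer) is to deduplicate with == comparisons instead, at quadratic cost. The helper name unique_in_order is hypothetical.

```python
values = [["Car 1"], ["Car 2"], ["Car 1"]]

# Lists are unhashable, so they cannot be collected in a set
try:
    set(values)
    raised = False
except TypeError:
    raised = True

def unique_in_order(items):
    # Hypothetical helper: deduplicate using == only, keeping first-seen order.
    # O(n^2), but it works for unhashable values such as lists.
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen

print(raised)                   # True
print(unique_in_order(values))  # [['Car 1'], ['Car 2']]
```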
The following only works in Python 3, because in Python 2 dict.keys() returns a plain old list:
reduce(operator.and_, map(dict.keys, args))
An alternative is to add an extra map() that converts the lists to sets:
reduce(operator.and_, map(set, map(dict.keys, args)))
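As a quick check of the two expressions above in Python 3, where dict.keys() returns a set-like view that supports &, here is a small sketch with made-up dictionaries (dicts, shared and shared_via_sets are illustrative names):

```python
from functools import reduce
import operator

dicts = [{"name": 1, "phone": 2}, {"name": 3, "owns": 4}, {"name": 5, "phone": 6}]

# Python 3: key views support `&`, and the result is a plain set
shared = reduce(operator.and_, map(dict.keys, dicts))

# Portable variant: convert each key collection to a set first
shared_via_sets = reduce(operator.and_, map(set, map(dict.keys, dicts)))

print(shared)           # {'name'}
print(shared_via_sets)  # {'name'}
```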
Here is a generic solution that supports any number of arguments:
def _merge_dicts(dict_args):
    if not isinstance(dict_args[0], dict):
        return list(set(dict_args)) if len(set(dict_args)) > 1 else dict_args[0]
    keys = set().union(*dict_args)
    result = {key:
              _merge_dicts(([d.get(key, None) for d in dict_args if d.get(key, None) is not None]))
              for key in keys}
    return result
def merge_dicts(*dict_args):
    return _merge_dicts(dict_args)
a = {"name": "john",
"phone":"123123123",
"owns": {"cars": "Car 1", "motorbikes": "Motorbike 1"}}
b = {"name": "john",
"phone":"123",
"owns": {"cars": "Car 2"}}
merge_dicts(a, b)
which yields:
{'name': 'john',
'owns': {'motorbikes': 'Motorbike 1', 'cars': ['Car 2', 'Car 1']},
'phone': ['123123123', '123']}
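Note that "cars" comes out as ['Car 2', 'Car 1'] here: sets do not keep insertion order, so the order of merged values can vary between runs. If order matters, one possible variation (an assumption, not part of the answer above) is to deduplicate with dict.fromkeys, which keeps the first-seen order. The names _merge_dicts_ordered and merge_dicts_ordered are hypothetical, and values still need to be hashable.

```python
def _merge_dicts_ordered(dict_args):
    if not isinstance(dict_args[0], dict):
        # dict.fromkeys removes duplicates while keeping first-seen order
        unique = list(dict.fromkeys(dict_args))
        return unique if len(unique) > 1 else unique[0]
    # Collect keys in first-seen order across all dicts
    keys = dict.fromkeys(k for d in dict_args for k in d)
    return {key: _merge_dicts_ordered([d[key] for d in dict_args
                                       if d.get(key) is not None])
            for key in keys}

def merge_dicts_ordered(*dict_args):
    return _merge_dicts_ordered(dict_args)

a = {"name": "john", "phone": "123123123",
     "owns": {"cars": "Car 1", "motorbikes": "Motorbike 1"}}
b = {"name": "john", "phone": "123", "owns": {"cars": "Car 2"}}

result = merge_dicts_ordered(a, b)
print(result)
# {'name': 'john', 'phone': ['123123123', '123'],
#  'owns': {'cars': ['Car 1', 'Car 2'], 'motorbikes': 'Motorbike 1'}}
```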