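I'm working with eBay's API, and its JSON response wraps many values in unnecessary arrays. I'm trying to strip those arrays with a regular expression, but I can't get exactly the output I need.
So far I've come up with \[[^\{\}]*\], which matches a pair of square brackets whose contents include no curly braces.
Actual: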
"childCategoryHistogram": [
    {
        "categoryId": [
            "175673"
        ],
        "categoryName": [
            "Computer Components & Parts"
        ],
        "count": [
            "21"
        ]
    },
    {
        "categoryId": [
            "175672"
        ],
        "categoryName": [
            "Laptops & Netbooks"
        ],
        "count": [
            "9"
        ]
    }
]
Expected:
"childCategoryHistogram": [
    {
        "categoryId": "175673",
        "categoryName": "Computer Components & Parts",
        "count": "21"
    },
    {
        "categoryId": "175672",
        "categoryName": "Laptops & Netbooks",
        "count": "9"
    }
]
Regular expressions are the wrong tool for this job. Don't try to change the JSON text; change the data structure it parses into.
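def remove_empty_lists(item):
    # Despite the name, this unwraps every single-element list, recursively.
    if isinstance(item, list):
        if len(item) == 1:
            # Replace a one-element list with its (cleaned) only element.
            return remove_empty_lists(item[0])
        else:
            return [remove_empty_lists(n) for n in item]
    elif isinstance(item, dict):
        # items() works on both Python 2 and 3; iteritems() is Python 2 only.
        return {k: remove_empty_lists(v) for k, v in item.items()}
    else:
        # Scalars (strings, numbers, None) pass through unchanged.
        return item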
...given a Python data structure built from the input you posted, does the right thing:
>>> from pprint import pprint
>>> pprint(content)
{'childCategoryHistogram': [{'categoryId': ['175673'],
                             'categoryName': ['Computer Components & Parts'],
                             'count': ['21']},
                            {'categoryId': ['175672'],
                             'categoryName': ['Laptops & Netbooks'],
                             'count': ['9']}]}
>>> pprint(remove_empty_lists(content))
{'childCategoryHistogram': [{'categoryId': '175673',
                             'categoryName': 'Computer Components & Parts',
                             'count': '21'},
                            {'categoryId': '175672',
                             'categoryName': 'Laptops & Netbooks',
                             'count': '9'}]}
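If you start from JSON text and need JSON text back, the whole round trip might look like the sketch below. It assumes the response is available as a string; the variable names raw, cleaned and cleaned_json and the sample payload are made up for illustration, and the function is repeated only so that the snippet runs on its own.

```python
import json

# Hypothetical sample mirroring the fragment posted in the question.
raw = '''
{"childCategoryHistogram": [
  {"categoryId": ["175673"],
   "categoryName": ["Computer Components & Parts"],
   "count": ["21"]},
  {"categoryId": ["175672"],
   "categoryName": ["Laptops & Netbooks"],
   "count": ["9"]}
]}
'''


def remove_empty_lists(item):
    """Recursively replace every one-element list with its only element."""
    if isinstance(item, list):
        if len(item) == 1:
            return remove_empty_lists(item[0])
        return [remove_empty_lists(n) for n in item]
    if isinstance(item, dict):
        return {k: remove_empty_lists(v) for k, v in item.items()}
    return item


content = json.loads(raw)                     # JSON text -> Python objects
cleaned = remove_empty_lists(content)         # unwrap the one-element lists
cleaned_json = json.dumps(cleaned, indent=2)  # Python objects -> JSON text
print(cleaned_json)
```

One thing to watch for: the function unwraps any list with exactly one element, so a childCategoryHistogram that happens to contain a single category would come back as a plain object instead of a one-item list. If that matters to you, restrict the unwrapping to the keys you know are always wrapped.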