如何使用json_normalize将json转换为数据帧?

问题描述 投票:0回答:1

我正在尝试将json的响应从json转换为pandas中的数据帧。我遇到的问题是de数据嵌套在json格式中,我没有在我的数据帧中获得正确的列。

数据从api收集,格式如下:

{
    "data": [
        {
            "timestamp": "2019-04-10T11:40:13.437Z",
            "score": 87,
            "sensors": [
                {
                    "comp": "temp",
                    "value": 20.010000228881836
                },
                {
                    "comp": "humid",
                    "value": 34.4900016784668
                },
                {
                    "comp": "co2",
                    "value": 418
                },
                {
                    "comp": "voc",
                    "value": 166
                },
                {
                    "comp": "pm25",
                    "value": 4
                },
                {
                    "comp": "lux",
                    "value": 961.4000244140625
                },
                {
                    "comp": "spl_a",
                    "value": 45.70000076293945
                }
            ],
            "indices": [
                {
                    "comp": "temp",
                    "value": -1
                },
                {
                    "comp": "humid",
                    "value": -2
                },
                {
                    "comp": "co2",
                    "value": 0
                },
                {
                    "comp": "voc",
                    "value": 0
                },
                {
                    "comp": "pm25",
                    "value": 0
                }
            ]
        },
        {
            "timestamp": "2019-04-10T11:40:03.413Z",
            "score": 87,
            "sensors": [
                {
                    "comp": "temp",
                    "value": 20.040000915527344
                },
                {
                    "comp": "humid",
                    "value": 34.630001068115234
                },
                {
                    "comp": "co2",
                    "value": 418
                },
                {
                    "comp": "voc",
                    "value": 169
                },
                {
                    "comp": "pm25",
                    "value": 5
                },
                {
                    "comp": "lux",
                    "value": 960.2000122070312
                },
                {
                    "comp": "spl_a",
                    "value": 46
                }
            ],
            "indices": [
                {
                    "comp": "temp",
                    "value": -1
                },
                {
                    "comp": "humid",
                    "value": -1
                },
                {
                    "comp": "co2",
                    "value": 0
                },
                {
                    "comp": "voc",
                    "value": 0
                },
                {
                    "comp": "pm25",
                    "value": 0
                }
            ]
        },

编辑,因为你可以看到它是更多的数据集

我已经尝试过以下内容:我已将JSON格式转换为dict,然后使用normalize函数对其进行规范化。请参阅以下代码:

data = r.json() 

works_data = json_normalize(data=data['data'], record_path=['sensors'],meta=['timestamp'])

df = pd.DataFrame.from_dict(works_data)
print(df)

我得到的结果是:

 comp        value                 timestamp
0      temp    21.059999  2019-04-10T12:39:05.062Z
1     humid    31.250000  2019-04-10T12:39:05.062Z
2       co2   407.000000  2019-04-10T12:39:05.062Z
3       voc   136.000000  2019-04-10T12:39:05.062Z
4      pm25     3.000000  2019-04-10T12:39:05.062Z
5       lux  1302.099976  2019-04-10T12:39:05.062Z
6     spl_a    46.299999  2019-04-10T12:39:05.062Z

我需要的结果如下:the result

有人可以帮助我吗?

python json pandas api normalize
1个回答
0
投票

您可以重塑您的works_data:

data = {
    "data": [
        {
            "timestamp": "2019-04-10T11:40:13.437Z",
            "score": 87,
            "sensors": [
                {
                    "comp": "temp",
                    "value": 20.010000228881836
                },
                {
                    "comp": "humid",
                    "value": 34.4900016784668
                },
                {
                    "comp": "co2",
                    "value": 418
                },
                {
                    "comp": "voc",
                    "value": 166
                },
                {
                    "comp": "pm25",
                    "value": 4
                },
                {
                    "comp": "lux",
                    "value": 961.4000244140625
                },
                {
                    "comp": "spl_a",
                    "value": 45.70000076293945
                }
            ],
            "indices": [
                {
                    "comp": "temp",
                    "value": -1
                },
                {
                    "comp": "humid",
                    "value": -2
                },
                {
                    "comp": "co2",
                    "value": 0
                },
                {
                    "comp": "voc",
                    "value": 0
                },
                {
                    "comp": "pm25",
                    "value": 0
                }
            ]
        },
        {
            "timestamp": "2019-04-10T11:40:03.413Z",
            "score": 87,
            "sensors": [
                {
                    "comp": "temp",
                    "value": 20.040000915527344
                },
                {
                    "comp": "humid",
                    "value": 34.630001068115234
                },
                {
                    "comp": "co2",
                    "value": 418
                },
                {
                    "comp": "voc",
                    "value": 169
                },
                {
                    "comp": "pm25",
                    "value": 5
                },
                {
                    "comp": "lux",
                    "value": 960.2000122070312
                },
                {
                    "comp": "spl_a",
                    "value": 46
                }
            ],
            "indices": [
                {
                    "comp": "temp",
                    "value": -1
                },
                {
                    "comp": "humid",
                    "value": -1
                },
                {
                    "comp": "co2",
                    "value": 0
                },
                {
                    "comp": "voc",
                    "value": 0
                },
                {
                    "comp": "pm25",
                    "value": 0
                }
            ]
        }]}


from pandas.io.json import json_normalize
import pandas as pd        

df = pd.DataFrame()
for each in data['data']:
    timestamp = each['timestamp']
    temp_df = json_normalize(data=each, record_path=['sensors']).T

    columns = list(temp_df.iloc[0])
    data_values = list(temp_df.iloc[1,:])

    temp_df = pd.DataFrame([data_values + [timestamp]], columns=columns + ['timestamp'])

    df = df.append(temp_df).reset_index(drop=True)



print(df)

输出:

print(df)
        temp      humid    co2  ...         lux      spl_a                 timestamp
0  20.010000  34.490002  418.0  ...  961.400024  45.700001  2019-04-10T11:40:13.437Z
1  20.040001  34.630001  418.0  ...  960.200012  46.000000  2019-04-10T11:40:03.413Z

[2 rows x 8 columns]
© www.soinside.com 2019 - 2024. All rights reserved.