Using a pivot table with differing numbers of rows and columns


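Hi, I have a JSON file that contains several question columns. Some of the questions are the same across records and others differ, as in the following JSON: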

apend = [
    {
        "first_name": "Raúl Pedro",
        "last_name": "Moreno Zavaleta",
        "email": "[email protected]",
        "custom_questions": [
            {
                "title": "a",
                "value": "si"
            },
            {
                "title": "b",
                "value": "no"
            },
            {
                "title": "c",
                "value": "001"
            } 
        ],
        "status": "approved",
        "create_time": "2023-02-18T17:25:30Z"
    },
    {
        
        "first_name": "Milagritos",
        "last_name": "Canales Lora",
        "email": "[email protected]",
        "custom_questions": [
            {
                "title": "a",
                "value": "no"
            },
            {
                "title": "b",
                "value": "si"
            }
                        
        ],
        "status": "approved",
        "create_time": "2023-02-21T23:07:24Z",

    },
    {
            
        "first_name": "Eliza",
        "last_name": "Carbajal Leon",
        "email": "[email protected]",
        "custom_questions": [
            {
                "title": "a",
                "value": "no"
            },
            {
                "title": "e",
                "value": "identiti"
            }
                        
        ],
        "status": "approved",
        "create_time": "2023-02-21T23:07:24Z",

    }
]

I applied the following code to normalize the data:

import numpy as np
import pandas as pd

# One row per user; custom_questions remains a list of dicts
pp1 = pd.json_normalize(apend)
# Flatten every question into one table (the owning user is not recorded)
pp = pd.DataFrame.from_dict(np.concatenate(pp1['custom_questions']).tolist())
crear = pd.pivot_table(pp, values='value', columns='title', aggfunc=list).reset_index()
crear = crear.apply(lambda x: x.apply(pd.Series).stack()).reset_index().drop('index', axis=1)
ee = crear.drop(["level_0", "level_1"], axis=1).reset_index(drop=True)
unir = pd.merge(pp1, ee, how="outer", left_index=True, right_index=True)
unir = unir.drop(['custom_questions'], axis=1)

I get output like this:

[image: the output I currently get]

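But I lose the third user's information: the value in column e ends up on the first user's row. I need the values to stay aligned with the correct user when I pivot.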
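A possible explanation, sketched on a reduced copy of the data (the names `records`, `flat` and `packed` are illustrative and not from the original code): once all the `custom_questions` lists are flattened into one table, nothing records which user each answer came from. Grouping by `title` alone then packs each column's values starting at position 0, so the single `e` answer lands at position 0, which lines up with the first user.

```python
import pandas as pd

# Reduced stand-in for `apend`: only the fields needed to show the effect.
records = [
    {"first_name": "Raúl Pedro", "custom_questions": [
        {"title": "a", "value": "si"},
        {"title": "b", "value": "no"},
        {"title": "c", "value": "001"}]},
    {"first_name": "Milagritos", "custom_questions": [
        {"title": "a", "value": "no"},
        {"title": "b", "value": "si"}]},
    {"first_name": "Eliza", "custom_questions": [
        {"title": "a", "value": "no"},
        {"title": "e", "value": "identiti"}]},
]

# Flatten all answers into one table; the owning user is lost at this point.
flat = pd.DataFrame([q for r in records for q in r["custom_questions"]])

# Collect the values per title: each list starts at position 0,
# regardless of which user gave the answer.
packed = flat.groupby("title")["value"].agg(list)
print(packed)
```

Here `packed["e"]` is a one-element list, so after expanding the lists back into rows its only value sits in row 0 rather than in the third user's row.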

The output should look like this:

[image: the expected output]
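One way to keep each answer on its own user's row might be to carry the row index through the flattening step and pivot on it. This is only a sketch on a reduced copy of the data, and it assumes each user answers a given `title` at most once (otherwise `pivot` raises a `ValueError`). The names `records`, `row`, `wide` and `result` are illustrative.

```python
import pandas as pd

# Reduced stand-in for `apend`: only the fields needed for the example.
records = [
    {"first_name": "Raúl Pedro", "custom_questions": [
        {"title": "a", "value": "si"},
        {"title": "b", "value": "no"},
        {"title": "c", "value": "001"}]},
    {"first_name": "Milagritos", "custom_questions": [
        {"title": "a", "value": "no"},
        {"title": "b", "value": "si"}]},
    {"first_name": "Eliza", "custom_questions": [
        {"title": "a", "value": "no"},
        {"title": "e", "value": "identiti"}]},
]

# One row per user; custom_questions stays a list column.
base = pd.json_normalize(records)

# One row per answer; the index still points at the user's row in `base`.
exploded = base["custom_questions"].explode().dropna()
answers = (
    pd.DataFrame(exploded.tolist(), index=exploded.index)
    .rename_axis("row")
    .reset_index()
)

# Pivot on the user row, so unanswered questions become NaN for that user.
wide = answers.pivot(index="row", columns="title", values="value")

# Attach the question columns back to the per-user fields by row index.
result = base.drop(columns="custom_questions").join(wide)
print(result)
```

With this layout the `e` value stays on the third row, and the questions a user did not answer show up as `NaN`. If the same title can appear twice for one user, `pivot_table` with an explicit `aggfunc` would be needed instead of `pivot`.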

python json pandas nested nested-lists