DataFrame to JSON: column with multiple values is read as a single string


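I am trying to read a CSV file into a DataFrame in order to convert it to nested JSON. The delimiter in the CSV file is ";". After loading the data into the DataFrame, I get the following: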

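name       frequency  frequency_start_date  frequency_count  schedule_type      day_of_week       comments
Pattern 1  daily      2022-02-01            1                weekdays           Monday
Pattern 2  daily      2024-01-01            1, 5             weekdays,weekdays  Monday, Tuesday   null, null
Pattern 3  daily      2021-03-21            1, 2             weekdays,weekdays  Thursday, Friday  null, null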

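Here is the code I wrote to load the CSV into a DataFrame:

import pandas as pd

# csv1 is the path (or file object) of the CSV file, defined elsewhere
df1_data = pd.read_csv(csv1, delimiter=';', keep_default_na=False, dtype=object)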

Here is the code I wrote to convert the DataFrame to JSON:

nested_cols = ['frequency_count', 'schedule_type', 'day_of_week']

df1_data['patterns'] = df1_data[nested_cols].to_dict('records')

df1_nested2 = df1_data[['name', 'frequency', 'frequency_start_date', 'patterns', 'comments']].to_json(orient='records', indent=4)

When I run this, I get the following:

[
    {
        "name": "Pattern 1",
        "frequency": "daily",
        "frequency_start_date": "2022-02-01",
        "patterns": {
            "frequency_count": "1",
            "schedule_type": "weekdays",
            "day_of_week": null
        },
        "comments": null
    },
    {
        "name": "Pattern 2",
        "frequency": "daily",
        "frequency_start_date": "2024-01-01",
        "patterns": {
            "frequency_count": "1, 5",
            "schedule_type": "weekdays,weekdays",
            "day_of_week": "null, null"
        },
        "comments": null
    },
    {
        "name": "Pattern 3",
        "frequency": "daily",
        "frequency_start_date": "2021-03-21",
        "patterns": {
            "frequency_count": "1, 2",
            "schedule_type": "weekdays,weekdays",
            "day_of_week": "null, null"
        },
        "comments": null
    }
]

But this is what I want:

[
    {
        "name": "Pattern 1",
        "frequency": "daily",
        "frequency_start_date": "2022-02-01",
        "patterns": {
            "frequency_count": "1",
            "schedule_type": "weekdays",
            "day_of_week": null
        },
        "comments": null
    },
    {
        "name": "Pattern 2",
        "frequency": "daily",
        "frequency_start_date": "2024-01-01",
        "patterns": {
            "frequency_count": 1,
            "schedule_type": "weekdays",
            "day_of_week": null
        },
        {
            "frequency_count": 5,
            "schedule_type": "weekdays",
            "day_of_week": null
        },
        "comments": null
    },
    {
        "name": "Pattern 3",
        "frequency": "daily",
        "frequency_start_date": "2021-03-21",
        "patterns": {
            "frequency_count": 1,
            "schedule_type": "weekdays",
            "day_of_week": null
        },
        {
            "frequency_count": 2,
            "schedule_type": "weekdays",
            "day_of_week": null
        },
        "comments": null
    }
]

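This is where I'm stuck. Is the problem in reading the CSV into the DataFrame, or in converting the DataFrame to JSON? It looks like the columns with multiple values are being read as a single string. I read through the questions posted on StackOverflow, but none of them really matched this problem, or the answers didn't help me solve it.

Any help would be greatly appreciated.

Thanks,

Beta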

python json pandas csv
1 Answer

Given:

        name frequency frequency_start_date frequency_count       schedule_type      day_of_week    Comments
0  Pattern 1     daily           2022-02-01               1            weekdays           Monday         NaN
1  Pattern 2     daily           2024-01-01            1, 5  weekdays, weekdays  Monday, Tuesday  null, null
2  Pattern 3     daily           2021-03-21            1, 2  weekdays, weekdays  Thursday,Friday  null, null
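
To reproduce a frame like this without the original file, here is a minimal sketch that reads a semicolon-delimited CSV from an in-memory string. The CSV text is hypothetical sample data modelled on the table above, not the asker's actual file. It also shows that read_csv is not the culprit as such: a cell such as "1, 5" is one field between two ";" delimiters, so it arrives as one string and has to be split afterwards. Note that this sketch uses the default NA handling, so the empty Comments cell becomes NaN; with keep_default_na=False (as in the question) it would be an empty string instead.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the asker's CSV file.
csv_text = """name;frequency;frequency_start_date;frequency_count;schedule_type;day_of_week;Comments
Pattern 1;daily;2022-02-01;1;weekdays;Monday;
Pattern 2;daily;2024-01-01;1, 5;weekdays, weekdays;Monday, Tuesday;null, null
Pattern 3;daily;2021-03-21;1, 2;weekdays, weekdays;Thursday,Friday;null, null
"""

# dtype=object keeps every field as text instead of letting pandas guess types.
df = pd.read_csv(io.StringIO(csv_text), delimiter=";", dtype=object)

# Each multi-valued cell is a single string; read_csv only splits on ";".
print(type(df.loc[1, "frequency_count"]))
print(df)
```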

Do the following:

  1. Convert the strings to lists:
multi_cols = ["frequency_count", "schedule_type", "day_of_week", "Comments"]

for col in multi_cols:
    df[col] = df[col].str.split(", ?")  # the pattern is treated as a regex
  2. Explode these columns:
df = df.explode(multi_cols)
  3. Build the patterns column and fix the Comments column:
df["patterns"] = df[["frequency_count", "schedule_type", "day_of_week"]].to_dict("records")
df["Comments"] = df["Comments"].fillna("null")
  4. Pivot and export to JSON:
output = df.pivot_table(
    index=["name", "frequency", "frequency_start_date", "Comments"], 
    values="patterns", 
    aggfunc=list, 
).reset_index().to_json(orient="records", indent=4)
print(output)
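
For comparison, the same split / explode / collect idea can be sketched end to end with groupby instead of pivot_table. This is an alternative under stated assumptions, not the method used above: the sample frame is rebuilt by hand from the table in this answer, the Comments column is left out to keep the sketch short, and the names nested_cols, key_cols, exploded and result are illustrative. It assumes a pandas version that supports exploding several columns at once and the regex argument of str.split.

```python
import pandas as pd

# Sample frame rebuilt by hand from the table in this answer (Comments omitted).
df = pd.DataFrame({
    "name": ["Pattern 1", "Pattern 2", "Pattern 3"],
    "frequency": ["daily"] * 3,
    "frequency_start_date": ["2022-02-01", "2024-01-01", "2021-03-21"],
    "frequency_count": ["1", "1, 5", "1, 2"],
    "schedule_type": ["weekdays", "weekdays, weekdays", "weekdays, weekdays"],
    "day_of_week": ["Monday", "Monday, Tuesday", "Thursday,Friday"],
})

nested_cols = ["frequency_count", "schedule_type", "day_of_week"]
key_cols = ["name", "frequency", "frequency_start_date"]

# Split on a comma followed by optional whitespace, then explode in lockstep
# so the i-th value of each column ends up on the same row.
for col in nested_cols:
    df[col] = df[col].str.split(r",\s*", regex=True)
exploded = df.explode(nested_cols)

# One dict per exploded row, then one list of dicts per original record.
exploded["patterns"] = exploded[nested_cols].to_dict("records")
result = (
    exploded.groupby(key_cols, sort=False)["patterns"]
    .agg(list)
    .reset_index()
)

output = result.to_json(orient="records", indent=4)
print(output)
```

Grouping on the key columns makes it explicit which fields identify a record. The values inside patterns are still strings at this point.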

Output:

[
    {
        "name":"Pattern 1",
        "frequency":"daily",
        "frequency_start_date":"2022-02-01",
        "Comments":"null",
        "patterns":[
            {
                "frequency_count":"1",
                "schedule_type":"weekdays",
                "day_of_week":"Monday"
            }
        ]
    },
    {
        "name":"Pattern 2",
        "frequency":"daily",
        "frequency_start_date":"2024-01-01",
        "Comments":"null",
        "patterns":[
            {
                "frequency_count":"1",
                "schedule_type":"weekdays",
                "day_of_week":"Monday"
            },
            {
                "frequency_count":"5",
                "schedule_type":"weekdays",
                "day_of_week":"Tuesday"
            }
        ]
    },
    {
        "name":"Pattern 3",
        "frequency":"daily",
        "frequency_start_date":"2021-03-21",
        "Comments":"null",
        "patterns":[
            {
                "frequency_count":"1",
                "schedule_type":"weekdays",
                "day_of_week":"Thursday"
            },
            {
                "frequency_count":"2",
                "schedule_type":"weekdays",
                "day_of_week":"Friday"
            }
        ]
    }
]
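
The output above still carries frequency_count as a string and Comments as the literal string "null", whereas the question asks for numbers and a real JSON null. One possible way to close that gap is to post-process the JSON text with the standard json module. This is only a sketch: the input is the Pattern 2 record from the output above, the helper name clean is made up for illustration, and it assumes every frequency_count value is an integer written as text.

```python
import json

# The Pattern 2 record from the output above, used as a small sample input.
output = """
[
    {
        "name": "Pattern 2",
        "frequency": "daily",
        "frequency_start_date": "2024-01-01",
        "Comments": "null",
        "patterns": [
            {"frequency_count": "1", "schedule_type": "weekdays", "day_of_week": "Monday"},
            {"frequency_count": "5", "schedule_type": "weekdays", "day_of_week": "Tuesday"}
        ]
    }
]
"""


def clean(value):
    """Map the literal string "null" (or an empty string) to None."""
    return None if value in ("null", "") else value


records = json.loads(output)
for record in records:
    record["Comments"] = clean(record["Comments"])
    for pattern in record["patterns"]:
        # Assumes the text is a valid integer; int() raises ValueError otherwise.
        pattern["frequency_count"] = int(pattern["frequency_count"])
        pattern["day_of_week"] = clean(pattern["day_of_week"])

# None is serialized as JSON null by json.dumps.
print(json.dumps(records, indent=4))
```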