Extracting certain data from a JSON file to CSV in Python


I have a JSON file containing employee data; see the snippet below:

{
"confirmMessage": null,
"meta": null,
"workers": [
    {
        "person": {
            "legalName": {
                "formattedName": "LastName, FirstName"
            },
            "militaryClassificationCodes": []
        },
        "workAssignments": [
            {
                "actualStartDate": "2020-01-01",
                "customCountryInputs": []
            }
        ],
        "workerID": {
            "idValue": "XXXXXXXXX"
        }
    },
    {
        "person": {
            "legalName": {
                "formattedName": "LastName, FirstName"
            },
            "militaryClassificationCodes": []
        },
        "workAssignments": [
            {
                "actualStartDate": "2020-01-01",
                "customCountryInputs": []
            }
        ],
        "workerID": {
            "idValue": "XXXXXXXXX"
        }
    },

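What I want to do is export formattedName and actualStartDate to a CSV file. I have tried several ways of iterating over the data, but I can't get it to come out in a clean format. See the example below:

[Image: Output]

What I want is a two-column CSV file with the headers formattedName and actualStartDate.

Here is the code I use to open the files and write the CSV: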

def csvDump():
    count = 0
    for filename in os.listdir('./Anniversaries/'):
        print(filename)
        with open('AnniversaryOutput.csv', 'a') as f:
            csv_writer = csv.writer(f)
            with open('./Anniversaries/' + filename) as json_file:
                data = json.load(json_file)
                employeeData = data['workers']
                for emp in employeeData:    
                    if count == 0:
                        header = emp.keys()
                        csv_writer.writerow(header)
                        count+=1
                csv_writer.writerow(emp.values())
            f.close()
    return()

This produces the output shown in the image above. Any suggestions would be greatly appreciated.

Tags: python json csv data-conversion

1 Answer

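You are writing emp.keys() as the header, which means the CSV header will contain "person", "workAssignments", and "workerID".

Next, you write emp.values() as the row. This effectively dumps the nested JSON objects into the CSV cells.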
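
To see the effect in isolation, here is a minimal sketch that writes one worker record the same way the question's code does. The `emp` dictionary is a hand-made sample shaped like the entries in the question's "workers" list, and an in-memory buffer stands in for the output file; both are assumptions for illustration only.

```python
import csv
import io

# Sample worker record shaped like the entries in the question's "workers" list.
emp = {
    "person": {"legalName": {"formattedName": "LastName, FirstName"}},
    "workAssignments": [{"actualStartDate": "2020-01-01"}],
    "workerID": {"idValue": "XXXXXXXXX"},
}

buffer = io.StringIO()  # in-memory stand-in for the CSV file
writer = csv.writer(buffer)
writer.writerow(emp.keys())    # header: the three top-level keys
writer.writerow(emp.values())  # row: each nested object converted with str()

header_line, data_line = buffer.getvalue().splitlines()
print(header_line)
print(data_line)
```

The header line holds only the top-level keys, and each cell of the data line holds the string form of a whole nested dict or list rather than a single value.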

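You have to pick out the exact values you need from the JSON instead of writing all of them. Here, that means reading formattedName from person["legalName"] and actualStartDate from the first entry in workAssignments.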


import csv
import json
import os


def csvDump():
    # Open the output file once and reuse the same writer for every input file
    with open('AnniversaryOutput.csv', 'w', newline='') as out_buffer:
        # Specify the field names
        writer = csv.DictWriter(out_buffer, fieldnames=('formattedName', 'actualStartDate'))
        # Write the header
        writer.writeheader()

        for filename in os.listdir('./Anniversaries/'):
            print(filename)
            with open(os.path.join('./Anniversaries/', filename)) as json_file:
                data = json.load(json_file)
            # Loop over employees
            for employee in data['workers']:
                # formattedName is nested under "person" -> "legalName" in the sample JSON
                formatted_name = employee["person"]["legalName"]["formattedName"]
                # I assumed that there's only one item in workAssignments
                actual_start_date = employee["workAssignments"][0]["actualStartDate"]
                writer.writerow(
                    {
                        "actualStartDate": actual_start_date,
                        "formattedName": formatted_name
                    }
                )
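
If the single-assignment assumption does not hold for your data, one possible variation is sketched below. It emits one row per work assignment and falls back to empty strings when a key is missing. The helper name `extract_rows` and the sample data are hypothetical and not part of the original code. Whether a worker with several assignments should really produce several rows depends on what you need.

```python
import csv
import io


def extract_rows(data):
    """Yield one row dict per (worker, work assignment) pair."""
    for worker in data.get("workers") or []:
        name = worker.get("person", {}).get("legalName", {}).get("formattedName", "")
        # A worker without assignments still yields one row with an empty date.
        for assignment in worker.get("workAssignments") or [{}]:
            yield {
                "formattedName": name,
                "actualStartDate": assignment.get("actualStartDate", ""),
            }


# Hypothetical sample: one worker with two assignments, one with none.
sample = {
    "workers": [
        {
            "person": {"legalName": {"formattedName": "Doe, Jane"}},
            "workAssignments": [
                {"actualStartDate": "2020-01-01"},
                {"actualStartDate": "2021-06-15"},
            ],
        },
        {"person": {"legalName": {"formattedName": "Roe, Richard"}}},
    ]
}

rows = list(extract_rows(sample))

out = io.StringIO()  # in-memory stand-in for AnniversaryOutput.csv
writer = csv.DictWriter(out, fieldnames=("formattedName", "actualStartDate"))
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Keeping the extraction in a generator separates reading the JSON structure from writing the CSV, so the same function could be called once per file inside the os.listdir loop.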
