I have a JSON file containing employee data; see the snippet below:
{
    "confirmMessage": null,
    "meta": null,
    "workers": [
        {
            "person": {
                "legalName": {
                    "formattedName": "LastName, FirstName"
                },
                "militaryClassificationCodes": []
            },
            "workAssignments": [
                {
                    "actualStartDate": "2020-01-01",
                    "customCountryInputs": []
                }
            ],
            "workerID": {
                "idValue": "XXXXXXXXX"
            }
        },
        {
            "person": {
                "legalName": {
                    "formattedName": "LastName, FirstName"
                },
                "militaryClassificationCodes": []
            },
            "workAssignments": [
                {
                    "actualStartDate": "2020-01-01",
                    "customCountryInputs": []
                }
            ],
            "workerID": {
                "idValue": "XXXXXXXXX"
            }
        },
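What I want to do is export formattedName and actualStartDate to a CSV file. I have tried several ways of iterating over the data, but I can't get it to come out in a clean format. See the example below:
What I want is a two-column CSV file with the headers formattedName and actualStartDate.
Here is the code I'm using to open the files and write the CSV: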
def csvDump():
    count = 0
    for filename in os.listdir('./Anniversaries/'):
        print(filename)
        with open('AnniversaryOutput.csv', 'a') as f:
            csv_writer = csv.writer(f)
            with open('./Anniversaries/' + filename) as json_file:
                data = json.load(json_file)
                employeeData = data['workers']
                for emp in employeeData:
                    if count == 0:
                        header = emp.keys()
                        csv_writer.writerow(header)
                        count+=1
                    csv_writer.writerow(emp.values())
        f.close()
    return()
This produces the output shown in the image above. Any suggestions would be greatly appreciated.
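You are writing emp.keys() as the header, which means the CSV header will contain "person", "workAssignments" and "workerID".
Next, you write emp.values() as the row. Each value is a nested dict or list, so this effectively dumps the raw JSON structure into the CSV cells.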
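To see the effect in isolation, here is a small sketch that reproduces it in memory. The sample record is a hypothetical one shaped like the data in the question, and io.StringIO stands in for the output file. csv.writer converts non-string values with str(), so each nested dict or list ends up in a single cell as its Python repr.

```python
import csv
import io

# Hypothetical record shaped like one entry of "workers" in the question
emp = {
    "person": {"legalName": {"formattedName": "LastName, FirstName"}},
    "workAssignments": [{"actualStartDate": "2020-01-01"}],
    "workerID": {"idValue": "XXXXXXXXX"},
}

# In-memory stand-in for the output CSV file
buffer = io.StringIO()
writer = csv.writer(buffer)

# Same two calls as in the question
writer.writerow(emp.keys())    # header row: the top-level keys only
writer.writerow(emp.values())  # data row: one stringified dict/list per cell

print(buffer.getvalue())
```

The first line printed is the three top-level keys, and the second line holds three quoted cells, each containing a whole nested structure rather than a single value.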
Instead of writing every value, you have to pick the exact values you need out of the JSON.
import csv
import json
import os

def csvDump():
    # Initialize the writer only once to avoid many OS calls
    with open('AnniversaryOutput.csv', 'w', newline='') as out_buffer:
        # Specify the field names
        writer = csv.DictWriter(out_buffer, fieldnames=('formattedName', 'actualStartDate'))
        # Write the header
        writer.writeheader()
        for filename in os.listdir('./Anniversaries/'):
            print(filename)
            with open('./Anniversaries/' + filename) as json_file:
                data = json.load(json_file)
            employeeData = data['workers']
            # Loop over employees
            for employee in employeeData:
                # Extract name and start date according to the expected schema
                # (formattedName is nested under person -> legalName)
                formatted_name = employee["person"]["legalName"]["formattedName"]
                actual_start_date = employee["workAssignments"][0]["actualStartDate"]  # I assumed that there's only one item in workAssignments
                writer.writerow(
                    {
                        "actualStartDate": actual_start_date,
                        "formattedName": formatted_name
                    }
                )
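
The code above assumes every worker has exactly one item in workAssignments. If that does not hold for your data, a variation along these lines might help. It is only a sketch: extract_rows is a hypothetical helper, the sample data is made up, and the choice to emit one row per work assignment (with an empty date when there is none) is an assumption about what you want.

```python
import csv
import io

def extract_rows(data):
    """Yield one (formattedName, actualStartDate) pair per work assignment."""
    for worker in data.get("workers") or []:
        # Missing keys fall back to an empty string instead of raising KeyError
        name = worker.get("person", {}).get("legalName", {}).get("formattedName", "")
        # A worker without assignments still produces one row with an empty date
        for assignment in worker.get("workAssignments") or [{}]:
            yield name, assignment.get("actualStartDate", "")

# Made-up sample: one worker with two assignments, one with none
sample = {
    "workers": [
        {
            "person": {"legalName": {"formattedName": "Doe, Jane"}},
            "workAssignments": [
                {"actualStartDate": "2020-01-01"},
                {"actualStartDate": "2021-06-15"},
            ],
        },
        {
            "person": {"legalName": {"formattedName": "Roe, Richard"}},
            "workAssignments": [],
        },
    ]
}

# In-memory stand-in for the output CSV file
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(("formattedName", "actualStartDate"))
writer.writerows(extract_rows(sample))
print(buffer.getvalue())
```

To use it with real files, you would call extract_rows(data) on each loaded JSON document and pass the result to writer.writerows, keeping the single open output file from the function above.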