This is really frustrating, and I feel like I've tried everything. I have a basic Pandas DataFrame that looks like this:
order name lat long open close
123 Walgreens 37.5 50.4 08:00:00 17:00:00
456 CVS 16.7 52.4 09:00:00 12:00:00
789 McDonald's 90.7 59.1 12:00:00 14:00:00
I need to convert that DataFrame into a JSON object that looks like this:
{
    "123": {
        "Location": {
            "Name": "Walgreens",
            "Lat": 37.5,
            "Long": 50.4
        },
        "Open": "08:00:00",
        "Close": "17:00:00"
    },
    "456": {
        "Location": {
            "Name": "CVS",
            "Lat": 16.7,
            "Long": 52.4
        },
        "Open": "09:00:00",
        "Close": "12:00:00"
    },
    "789": {
        "Location": {
            "Name": "McDonald's",
            "Lat": 90.7,
            "Long": 59.1
        },
        "Open": "12:00:00",
        "Close": "14:00:00"
    }
}
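I've tried many ways to get it to look like that, but either I end up with extra backslashes or I can't get the quotes right no matter what I do. I've used the Pandas to_json method, put the result into a dictionary, and then called json.loads or json.dumps, but it doesn't work properly.
One approach I tried was this: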
import json

json_dict = {}
for i in df.index:
    order_no = df.loc[i, 'order']
    stop_name = df.loc[i, 'name']
    lat = df.loc[i, 'lat']
    lng = df.loc[i, 'long']
    start = df.loc[i, 'open']
    end = df.loc[i, 'close']
    # each value is a pre-formatted string, not a nested dict
    json_dict[str(order_no)] = (
        '{{"location" : {{ "name": "{0}", "lat" : "{1}", "long" : "{2}" }}, '
        '"open" : "{3}", "close" : "{4}" }}'
    ).format(stop_name, lat, lng, start, end)
json.dumps(json_dict)
This ends up adding a whole bunch of backslashes. How can I get the correct format? Thanks for your help!
With the source DataFrame df looking like this:
order name lat long open close
123 Walgreens 37.5 50.4 08:00:00 17:00:00
456 CVS 16.7 52.4 09:00:00 12:00:00
789 McDonald's 90.7 59.1 12:00:00 14:00:00
To get the desired JSON output, we need to do the following:
- create a Location column that aggregates name, lat, and long
- make order the top-level key
Code:
# import json & pprint to pretty print the output
import json
import pprint
import pandas as pd
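# capitalize the column names to match the desired keys
df.columns = [x.capitalize() for x in df.columns]
location_keys = ['Name', 'Lat', 'Long']
# build one dict per row from the location columns
df['Location'] = df[location_keys].to_dict(orient='records')
# use Order as the top-level key and drop the now-nested columns
json_str = df.set_index('Order').drop(location_keys, axis=1).to_json(orient='index')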
# print output with nice json formatting
pprint.pprint(json.loads(json_str))
# outputs:
{'123': {'Close': '17:00:00',
'Location': {'Lat': '37.5', 'Long': '50.4', 'Name': 'Walgreens'},
'Open': '08:00:00'},
'456': {'Close': '12:00:00',
'Location': {'Lat': '16.7', 'Long': '52.4', 'Name': 'CVS'},
'Open': '09:00:00'},
'789': {'Close': '14:00:00',
'Location': {'Lat': '90.7', 'Long': '59.1', 'Name': "McDonald's"},
'Open': '12:00:00'}}
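As an aside on the backslashes mentioned in the question: they most likely appear because each value placed in json_dict is already a JSON-looking string, so json.dumps escapes its inner quotes. The sketch below assumes the column names shown for df above; the helper name build_nested is hypothetical and only there for illustration. It builds nested Python dicts and serializes them once.

```python
import json
import pandas as pd

# Sample data mirroring the DataFrame shown in the question (assumed dtypes).
df = pd.DataFrame({
    "order": [123, 456, 789],
    "name": ["Walgreens", "CVS", "McDonald's"],
    "lat": [37.5, 16.7, 90.7],
    "long": [50.4, 52.4, 59.1],
    "open": ["08:00:00", "09:00:00", "12:00:00"],
    "close": ["17:00:00", "12:00:00", "14:00:00"],
})

# Pitfall: a value that is already a string is serialized as a JSON string,
# so the quotes inside it are escaped with backslashes.
escaped = json.dumps({"123": '{"name": "Walgreens"}'})


def build_nested(frame):
    """Hypothetical helper: build nested dicts keyed by order number."""
    result = {}
    for row in frame.to_dict(orient="records"):
        result[str(row["order"])] = {
            "Location": {
                "Name": row["name"],
                "Lat": row["lat"],
                "Long": row["long"],
            },
            "Open": row["open"],
            "Close": row["close"],
        }
    return result


nested = build_nested(df)
# Serialize exactly once, at the very end.
json_str = json.dumps(nested, indent=2)
print(json_str)
```

The only design point is that serialization happens in a single place; everything before it stays as ordinary dicts, floats, and strings.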
If you set the index to order, you can orient the output on the index:
import pandas as pd

# sample rows; every value is a string here
records = [
    {'order': '123', 'name': 'Walgreens', 'lat': '37.5', 'long': '50.4', 'open': '08:00:00', 'close': '17:00:00'},
    {'order': '456', 'name': 'CVS', 'lat': '16.7', 'long': '52.4', 'open': '09:00:00', 'close': '12:00:00'},
    {'order': '789', 'name': "McDonald's", 'lat': '90.7', 'long': '59.1', 'open': '12:00:00', 'close': '14:00:00'},
]
df = pd.DataFrame(records)
df = df.set_index('order')
Now df looks like this:
close lat long name open
order
123 17:00:00 37.5 50.4 Walgreens 08:00:00
456 12:00:00 16.7 52.4 CVS 09:00:00
789 14:00:00 90.7 59.1 McDonald's 12:00:00
Convert it to a Python dict:
df.to_dict(orient='index')
{
    "123": {
        "close": "17:00:00",
        "lat": "37.5",
        "long": "50.4",
        "name": "Walgreens",
        "open": "08:00:00"
    },
    "456": {
        "close": "12:00:00",
        "lat": "16.7",
        "long": "52.4",
        "name": "CVS",
        "open": "09:00:00"
    },
    "789": {
        "close": "14:00:00",
        "lat": "90.7",
        "long": "59.1",
        "name": "McDonald's",
        "open": "12:00:00"
    }
}
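The dictionary above is flat, whereas the question asks for name, lat, and long to be nested under a Location key. One possible way to get there is a dict comprehension over the result of to_dict. This is only a sketch: it assumes records shaped like the ones above, and the variable names flat and nested are illustrative.

```python
import json
import pandas as pd

# Same sample rows as above; every value is a string.
records = [
    {"order": "123", "name": "Walgreens", "lat": "37.5", "long": "50.4",
     "open": "08:00:00", "close": "17:00:00"},
    {"order": "456", "name": "CVS", "lat": "16.7", "long": "52.4",
     "open": "09:00:00", "close": "12:00:00"},
    {"order": "789", "name": "McDonald's", "lat": "90.7", "long": "59.1",
     "open": "12:00:00", "close": "14:00:00"},
]

df = pd.DataFrame(records).set_index("order")
flat = df.to_dict(orient="index")

# Columns that should move under the "Location" key.
location_keys = ("name", "lat", "long")

nested = {
    order: {
        "Location": {key: row[key] for key in location_keys},
        "open": row["open"],
        "close": row["close"],
    }
    for order, row in flat.items()
}

print(json.dumps(nested, indent=2))
```

Because the sample values are strings, lat and long stay quoted in the output; converting those columns to floats first would give unquoted numbers.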
As one complete statement:
# if you prefer a one-liner (starting from the DataFrame before set_index)
# as python dict
json_dict = df.set_index('order').to_dict(orient='index')
# or as json string
json_string = df.set_index('order').to_json(orient='index')
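# keyword arguments are used because newer pandas versions no longer accept axis positionally in drop
(
    df1.drop(columns=["name", "lat", "long"])
    .assign(Location=df1.loc[:, ["name", "lat", "long"]].apply(dict, axis=1))
    .set_index("order")
    .to_dict(orient='index')
)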
{123: {'open': '08:00:00',
'close': '17:00:00',
'Location': {'name': 'Walgreens', 'lat': 37.5, 'long': 50.4}},
456: {'open': '09:00:00',
'close': '12:00:00',
'Location': {'name': 'CVS', 'lat': 16.7, 'long': 52.4}},
789: {'open': '12:00:00',
'close': '14:00:00',
'Location': {'name': "McDonald's", 'lat': 90.7, 'long': 59.1}}}
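TL;DR
I ran into similar difficulties trying to get correctly formatted JSON out of a Pandas DataFrame that I wanted to use to drive an API. I took inspiration from how we usually handle this in SQL, where we parse problematic date values into strings before converting them to dates with built-in functions. ... If it helps, you could consider doing:
json.dumps(json.loads(data_frame.to_json(orient="records")))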
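Here is a small sketch of what that round trip does, assuming a DataFrame with the same columns as in the question. to_json lets pandas handle the serialization, json.loads turns the result into plain Python lists and dicts, and json.dumps then serializes them exactly once. The orient="index" variant at the end is an addition made here, not part of the suggestion above, to get the order numbers as top-level keys.

```python
import json
import pandas as pd

# Sample data mirroring the question (assumed dtypes).
data_frame = pd.DataFrame({
    "order": [123, 456, 789],
    "name": ["Walgreens", "CVS", "McDonald's"],
    "lat": [37.5, 16.7, 90.7],
    "long": [50.4, 52.4, 59.1],
    "open": ["08:00:00", "09:00:00", "12:00:00"],
    "close": ["17:00:00", "12:00:00", "14:00:00"],
})

# pandas serializes to a JSON string; json.loads gives plain Python objects.
as_records = json.loads(data_frame.to_json(orient="records"))

# Plain Python objects can be modified freely and then dumped once.
payload = json.dumps(as_records)

# Variant (an assumption added here): key the result by order number.
by_order = json.loads(data_frame.set_index("order").to_json(orient="index"))

print(payload)
print(sorted(by_order))
```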