我刚刚从spotify下载了一些json,并查看了pd.normalize_json()。
但是如果我规范化数据,我的数据框内仍然会有字典。同样设置级别也无济于事。
我想在数据框中包含的数据:
{
"collaborative": false,
"description": "",
"external_urls": {
"spotify": "https://open.spotify.com/playlist/5"
},
"followers": {
"href": null,
"total": 0
},
"href": "https://api.spotify.com/v1/playlists/5?additional_types=track",
"id": "5",
"images": [
{
"height": 640,
"url": "https://i.scdn.co/image/a",
"width": 640
}
],
"name": "Another",
"owner": {
"display_name": "user",
"external_urls": {
"spotify": "https://open.spotify.com/user/user"
},
"href": "https://api.spotify.com/v1/users/user",
"id": "user",
"type": "user",
"uri": "spotify:user:user"
},
"primary_color": null,
"public": true,
"snapshot_id": "M2QxNTcyYTkMDc2",
"tracks": {
"href": "https://api.spotify.com/v1/playlists/100&additional_types=track",
"items": [
{
"added_at": "2020-12-13T18:34:09Z",
"added_by": {
"external_urls": {
"spotify": "https://open.spotify.com/user/user"
},
"href": "https://api.spotify.com/v1/users/user",
"id": "user",
"type": "user",
"uri": "spotify:user:user"
},
"is_local": false,
"primary_color": null,
"track": {
"album": {
"album_type": "album",
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/1dfeR4Had"
},
"href": "https://api.spotify.com/v1/artists/1dfDbWqFHLkxsg1d",
"id": "1dfeR4HaWDbWqFHLkxsg1d",
"name": "Q",
"type": "artist",
"uri": "spotify:artist:1dfeRqFHLkxsg1d"
}
],
"available_markets": [
"CA",
"US"
],
"external_urls": {
"spotify": "https://open.spotify.com/album/6wPXmlLzZ5cCa"
},
"href": "https://api.spotify.com/v1/albums/6wPXUJ9LzZ5cCa",
"id": "6wPXUmYJ9zZ5cCa",
"images": [
{
"height": 640,
"url": "https://i.scdn.co/image/ab676620a47",
"width": 640
},
{
"height": 300,
"url": "https://i.scdn.co/image/ab67616d0620a47",
"width": 300
},
{
"height": 64,
"url": "https://i.scdn.co/image/ab603e6620a47",
"width": 64
}
],
"name": "The (Deluxe ",
"release_date": "1920-07-17",
"release_date_precision": "day",
"total_tracks": 15,
"type": "album",
"uri": "spotify:album:6m5cCa"
},
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/1dg1d"
},
"href": "https://api.spotify.com/v1/artists/1dsg1d",
"id": "1dfeR4HaWDbWqFHLkxsg1d",
"name": "Q",
"type": "artist",
"uri": "spotify:artist:1dxsg1d"
}
],
"available_markets": [
"CA",
"US"
],
"disc_number": 1,
"duration_ms": 21453,
"episode": false,
"explicit": false,
"external_ids": {
"isrc": "GBU6015"
},
"external_urls": {
"spotify": "https://open.spotify.com/track/5716J"
},
"href": "https://api.spotify.com/v1/tracks/5716J",
"id": "5716J",
"is_local": false,
"name": "Another",
"popularity": 73,
"preview_url": null,
"track": true,
"track_number": 3,
"type": "track",
"uri": "spotify:track:516J"
},
"video_thumbnail": {
"url": null
}
}
],
"limit": 100,
"next": null,
"offset": 0,
"previous": null,
"total": 1
},
"type": "playlist",
"uri": "spotify:playlist:fek"
}
那么,将这样的嵌套数据读入熊猫的一个数据框中的最佳实践是什么?
我很高兴提供任何建议。
编辑:
所以基本上,我希望所有键都作为数据帧中的列。但是通过归一化,它会停在“ tracks.items”,如果我再次对其进行归一化,我将再次遇到递归问题。
取决于您要查找的信息。看一下pandas.read_json()看看是否可以工作。您也可以这样选择数据
json_output = {"collaborative": 'false',"description": "", "external_urls": {"spotify": "https://open.spotify.com/playlist/5"}}
df['collaborative'] = json_output['collaborative'] #set value of your df to value of returned json values