如何将这个JSON时间序列数据转换为pandas数据框?

问题描述 投票:0回答:1

我有一个来自API的时间序列数据,我想进入python pandas数据框。您将如何做?

数据看起来像这样:

[{'id': 38421212541,
  'sensor_id': 12944473,
  'value': '6852.426',
  'date': '2015-02-05',
  'min': '0.0',
  'max': '833.789',
  'avg': '285.5177',
  'values': '[344.336,306.05449999999996,269.922,233.9845,198.63299999999998,164.0625,128.9065,96.2895,66.797,50.391,27.344,10.938,8.203,6.641,4.297,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10.352,23.047,41.797,59.961,93.75,135.938,171.875,209.375,247.266,304.1015,345.703,387.5,438.4765,489.6485,524.414,579.688,627.9300000000001,680.859,722.656,776.563,802.344,811.5235,816.016,822.2655,826.3675000000001,830.2735,830.078,831.836,831.836,833.789,831.25,828.906,828.125,828.3205,827.3435,825,821.875,820.508,817.188,811.7185,810.547,807.422,803.516,801.172,796.094,792.7735,788.672,786.328,783.5935,778.125,775,772.266,768.359,765.625,763.672,759.766,758.203,757.422,753.516,752.734,748.633,751.172,748.047,747.461,747.266,749.6095,747.266,748.828,747.266,748.438,752.9295,750.1949999999999,755.078,756.641,759.961,760.1565,764.4535000000001,767.383,767.188,770.703,774.219,774.6095,780.469,782.8125,785.547,785.156,793.359,794.531,795.8985,798.047,800,805.859,805.469,810.1565,815.4295,811.719,814.453,817.578,814.063,817.188,817.969,813.281,812.5,810.938,786.133,755.859,719.922,676.3675000000001,642.383,606.0550000000001,565.039,534.1795,494.922,448.24199999999996,412.305,379.8825]',
  'start_time': None,
  'start_value': None},
 {'id': 38421212540,
  'sensor_id': 12944473,
  'value': '6771.2902',
  'date': '2015-02-04',
  'min': '0.0',
  'max': '828.9065',
  'avg': '282.1371',
  'values': '[333.0075,297.0705,253.32049999999998,211.7185,173.828,157.422,100.391,100,69.5315,36.3285,24.219,13.281,12.891,10.547,6.836,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1.172,17.188,35.547,55.078,78.516,118.75,160.938,192.969,210.156,262.8905,309.766,376.953,416.0155,460.156,511.719,567.578,609.1800000000001,662.5,703.516,747.8515,777.344,792.578,801.3675000000001,812.6955,819.727,825,828.906,822.266,825,825.781,827.344,827.9295,828.9065,825.781,817.578,810.9375,806.25,806.25,803.516,799.0235,796.484,794.336,792.7735,787.6955,784.766,782.422,777.734,775.781,773.438,768.75,764.8435,763.086,763.672,754.883,755.8595,759.766,754.1015,749.4145,746.875,748.2425000000001,744.141,745.313,747.6565,742.969,739.8435,742.7735,742.7735,744.141,748.438,749.0235,740.039,750.781,748.242,751.172,755.664,757.813,764.258,765.625,763.672,766.992,770.8985,772.266,773.2425000000001,781.641,787.5,785.938,789.453,792.188,796.6800000000001,800.3905,798.8285000000001,801.563,802.344,805.859,805.2735,813.2815,808.984,810.156,813.672,812.109,812.6955,814.0625,808.5935,808.008,777.9300000000001,747.0705,707.8125,677.7345,636.7185,602.3435,566.992,529.4925000000001,486.3285,451.172,417.7735,382.8125]',
  'start_time': None,
  'start_value': None},
 {'id': 38421163212,
  'sensor_id': 12944473,
  'value': '6652.3768',
  'date': '2015-02-03',
  'min': '0.0',
  'max': '807.2265',
  'avg': '277.1824',
  'values': '[331.44550000000004,297.65599999999995,256.25,220.8985,184.57,149.6095,100.9765,66.9925,31.6405,42.382999999999996,17.969,12.5,8.594,6.641,3.906,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.391,12.6955,20.8985,48.047,78.516,107.6175,150.586,185.9375,222.2655,260.15650000000005,312.8905,350,402.344,436.133,486.719,536.328,584.375,637.1095,675.781,717.969,762.8905,771.4845,778.5155,784.961,790.4295,791.406,797.461,800.7815,802.9295,803.5155,804.8824999999999,805.469,804.6875,803.3205,800,800.1955,798.047,797.8515,795.508,792.578,789.453,787.6955,784.766,779.297,777.9295,775.586,771.094,768.5545,766.016,762.5,758.203,756.25,753.906,750.391,747.656,746.6800000000001,741.797,738.672,736.5235,734.5705,738.086,732.422,732.422,733.594,732.813,732.0315,733.984,734.766,735.3515,736.719,737.6955,738.672,740.234,746.094,744.531,752.9295,751.172,755.859,757.031,761.7185,764.258,766.406,771.094,773.047,776.953,778.125,783.008,783.984,787.891,793.75,795.703,798.047,800.586,799.6095,798.8285000000001,800.9765,806.25,805.664,805.469,805.859,806.25,807.2265,804.883,803.125,772.8515,739.258,698.4375,665.625,624.0235,594.336,551.5625,514.6485,471.875,437.30449999999996,402.539,369.336]',
  'start_time': None,
  'start_value': None}]

我通过以下方式获得了该数据:

import requests
import json

# query for data from sensor with id number "12944473"
# populate response 'r' and JSON 'j' with the first page of the response
r = requests.get('https://foo-bar-company.com/sensors/12944473/daily_data', auth=(user, pwd))
j = r.json()

# see how many pages there are.
# loop through all pages, appending to 'j'
last_uri = r.links['last']['url']

while r.url != last_uri:
    r = requests.get(r.links['next']['url'], auth=(user, pwd))
    j.append(r.json())

# print a few days to share with SO
print(j[0:3:1])

目标是这样的:

                12944473
10/2/2015 0:00  344.336
10/2/2015 0:05  306.0545
10/2/2015 0:10  269.922
10/2/2015 0:15  233.9845
10/2/2015 0:20  198.633
10/2/2015 0:25  164.0625
10/2/2015 0:30  128.9065
10/2/2015 0:35  96.2895
10/2/2015 0:40  66.797
10/2/2015 0:45  50.391
10/2/2015 0:50  27.344
10/2/2015 0:55  10.938
...

我尽力做到这一点,但很高兴添加更多细节。数据是日光测量值。这个问题与我发现的其他帖子有所不同,因为它是时间序列数据。

构建日期时间索引时,如何保护DST和leap年问题?

我是Python的新手,我希望有一个高雅的解决方案,我只是从未见过。预先感谢!


Python 3.7foo bar公司的RESTful API

python json pandas time-series
1个回答
0
投票

假设您拥有最新版本的熊猫。

data = [{'date': x['date'], 'values': eval(x['values'])} for x in your_json_dict]
pd.DataFrame(data).explode('values')

结果

          date   values
0   2015-02-05  344.336
0   2015-02-05  306.054
0   2015-02-05  269.922
0   2015-02-05  233.984
0   2015-02-05  198.633
..         ...      ...
2   2015-02-03  514.649
2   2015-02-03  471.875
2   2015-02-03  437.304
2   2015-02-03  402.539
2   2015-02-03  369.336
© www.soinside.com 2019 - 2024. All rights reserved.