Plotting JSON time-series data grouped by category using Pandas and Bokeh

Question · Votes: 0 · Answers: 2

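I have some daily time-series JSON data that covers multiple sites in a single file (a sample JSON entry is at the bottom). I want to plot it with Bokeh, with each site's time series (categorized/grouped by "system_name") drawn as a differently colored line on the same chart. How do I get one line per site? My current approach tries to use multi_line. Should I just call p.line inside the for loop instead?

Any guidance or pointers would be greatly appreciated.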

import json
from datetime import datetime
from pandas.io.json import json_normalize
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource


output_file('wyndham.html')


with open('wyndham_data.txt', 'r') as f:
    a = json.load(f)

res = json_normalize(a['features'])
gby = res.groupby('properties.system_name')


for key, item in gby:
    g = item.sort_values(by='properties.date_stamp')       # <<< works up to here
    source = ColumnDataSource(dict(x = g[['properties.date_stamp']], 
            y = g[['properties.energy_prod(KWh)']]))

p = figure()
p.multi_line(x, y, source=source)
show(p)

Sample JSON:


    {
    "type" : "FeatureCollection",
    "name" : "wyndham-solar-energy-production.json",
    "features" : [
        {
            "type" : "Feature",
            "geometry" : null,
            "properties" : {
                "system_id" : "9386741",
                "system_name" : "Yerambooee Community Centre  ",
                "date_stamp" : "2018-08-01",
                "energy_prod(KWh)" : 51.5,
                "energy_life(MWh)" : null,
                "C02 (Kg)" : 47.41,
                "KWp" : 18.2,
                "performance" : 2.8,
                "lat" : -37.8587717,
                "lon" : 144.7100923,
                "date_installed" : "2017-07-27"
            }

        }, ...
pandas time-series categories bokeh
2 Answers

0 votes

You can do something like this to draw the lines:

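import json

import matplotlib.pyplot as plt
import pandas as pd

# Load the JSON file and flatten the nested "properties" fields into columns.
with open('1.json', 'r') as f:
    data = json.load(f)

df = pd.json_normalize(data['features'])

# Use the parsed date stamp as the index so it becomes the x-axis.
df.index = pd.to_datetime(df['properties.date_stamp'])
print(df)

# DataFrame.plot() draws every numeric column against the index.
df.plot()
plt.show()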

Plot image
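
The snippet above plots every numeric column rather than one line per site. As a rough sketch of how the same pandas approach could be adapted, the flattened frame can be pivoted so that each site becomes its own column. The records below are made-up stand-ins for the output of pd.json_normalize(data['features']), and the site names are hypothetical.

```python
import pandas as pd

# Hypothetical records shaped like the flattened json_normalize output.
records = [
    {"properties.system_name": "Site A  ", "properties.date_stamp": "2018-08-02",
     "properties.energy_prod(KWh)": 40.0},
    {"properties.system_name": "Site A  ", "properties.date_stamp": "2018-08-01",
     "properties.energy_prod(KWh)": 51.5},
    {"properties.system_name": "Site B", "properties.date_stamp": "2018-08-01",
     "properties.energy_prod(KWh)": 12.0},
    {"properties.system_name": "Site B", "properties.date_stamp": "2018-08-02",
     "properties.energy_prod(KWh)": 15.5},
]
df = pd.DataFrame(records)

# The sample JSON has trailing spaces in system_name, so strip them first.
df["properties.system_name"] = df["properties.system_name"].str.strip()
df["properties.date_stamp"] = pd.to_datetime(df["properties.date_stamp"])

# One row per date, one column per site, values = daily energy production.
wide = df.pivot(
    index="properties.date_stamp",
    columns="properties.system_name",
    values="properties.energy_prod(KWh)",
)
print(wide)
```

Calling wide.plot() on the pivoted frame would then draw one line per column, that is, one per site.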


0 votes

After a lot of trial and error, I found a way to make this work. (The code and the output are not pretty, but they served the purpose of the exercise.)

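The number of lines exceeded the number of colors in Bokeh's default palettes. The linear_palette function in bokeh.palettes let me assign a unique color shade to each of the 30 lines.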
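
A small sketch of that palette step on its own, assuming 30 sites as mentioned above: linear_palette picks evenly spaced entries from a larger palette such as Viridis256.

```python
from bokeh.palettes import Viridis256, linear_palette

num_sites = 30  # number of lines to color, as stated in the answer

# Sample num_sites evenly spaced colors from the 256-color Viridis palette.
plot_colors = linear_palette(Viridis256, num_sites)

print(len(plot_colors))
print(plot_colors[0], plot_colors[-1])
```

The requested count must not exceed the length of the source palette, which is why a 256-color palette is a convenient starting point.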

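In the question I posted, I used a local copy of the downloaded JSON saved to a text file. I have added import requests and the target URL in case you want to run it yourself. Note that it takes about 15 seconds to run on my machine. Link to a screenshot: Wyndham Wind Farm Scheme Daily Output Plot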

I also get a SettingWithCopyWarning:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
site_data['properties.date_stamp'] = pd.to_datetime(site_data['properties.date_stamp'])
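
One common way to avoid this warning is to take an explicit copy of the column subset before assigning into it. The following is a minimal sketch with made-up data, not the exact code from this answer; the third column is only there to make the subset a real selection.

```python
import pandas as pd

# Hypothetical stand-in for one site's rows from the flattened JSON.
df = pd.DataFrame({
    "properties.date_stamp": ["2018-08-02", "2018-08-01"],
    "properties.energy_prod(KWh)": [40.0, 51.5],
    "properties.KWp": [18.2, 18.2],
})

cols = ["properties.date_stamp", "properties.energy_prod(KWh)"]

# .copy() makes site_data an independent DataFrame, so the assignment
# below unambiguously targets site_data and not a slice of df.
site_data = df[cols].copy()
site_data["properties.date_stamp"] = pd.to_datetime(site_data["properties.date_stamp"])

print(site_data.dtypes)
```

The original df keeps its string dates. Only the copy is converted.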

import requests
import pandas as pd
from pandas import json_normalize
from bokeh.plotting import figure, output_file, show
from bokeh.palettes import linear_palette, Viridis256

URL = "https://data.gov.au/data/dataset/aa75879c-1d3e-4ad2-b331-826032c6b84b/resource/6e309687-023b-436b-9079-582b7e2fb074/download/wyndham-solar-energy-production.json"

# Download the JSON and flatten each feature's nested "properties" into columns.
r = requests.get(URL)
a = r.json()

res = json_normalize(a['features'])
gby = res.groupby('properties.system_name')
num_sites = gby.ngroups

output_file('wyndham.html')
# Sample one distinct Viridis shade per site.
plot_colors = linear_palette(Viridis256, num_sites)
p = figure(width=1800, height=900, x_axis_type="datetime",
           title="Wyndham Wind Farm Scheme Daily Power Output")
p.yaxis.axis_label = "Daily Power Output (kW.h)"

# Draw one line per site, each in its own color.
for (key, grp), line_col in zip(gby, plot_colors):
    g = grp.sort_values(by='properties.date_stamp')
    site_data = g[['properties.date_stamp', 'properties.energy_prod(KWh)']]
    # This assignment is the one pandas flags with the SettingWithCopyWarning.
    site_data['properties.date_stamp'] = pd.to_datetime(site_data['properties.date_stamp'])
    p.line(x=site_data['properties.date_stamp'],
           y=site_data['properties.energy_prod(KWh)'],
           legend_label=key, line_width=2, line_color=line_col)

show(p)

Wyndham Wind Farm Scheme Daily Output Plot
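
For comparison, the multi_line approach from the question can also be made to work if each group contributes one list of x values and one list of y values. This is only a sketch with made-up data and shortened, hypothetical column names ("site", "date", "kwh"). It builds the figure but does not call show().

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.palettes import Viridis256, linear_palette
from bokeh.plotting import figure

# Hypothetical data standing in for the flattened JSON.
df = pd.DataFrame({
    "site": ["A", "A", "B", "B"],
    "date": pd.to_datetime(["2018-08-02", "2018-08-01", "2018-08-01", "2018-08-02"]),
    "kwh": [40.0, 51.5, 12.0, 15.5],
})

# multi_line expects a list of x-sequences and a list of y-sequences,
# one entry per line, so collect one pair per site.
sites, xs, ys = [], [], []
for name, grp in df.groupby("site"):
    grp = grp.sort_values("date")
    sites.append(name)
    xs.append(grp["date"].to_numpy())
    ys.append(grp["kwh"].to_numpy())

source = ColumnDataSource(data=dict(
    xs=xs,
    ys=ys,
    site=sites,
    color=list(linear_palette(Viridis256, len(sites))),
))

p = figure(x_axis_type="datetime", title="Daily output per site (sketch)")
# Column names are passed as strings and resolved against the source.
p.multi_line(xs="xs", ys="ys", line_color="color", line_width=2, source=source)
```

Here all sites are drawn by a single glyph, whereas the p.line loop in the answer creates one renderer per site, each with its own legend_label.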
