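I have some daily time-series JSON data that covers multiple sites in a single file (an example of one JSON entry is at the bottom). I want to plot it with Bokeh, with each site's time series (grouped by "system_name") drawn as a differently colored line on the same chart. How do I get one line per site? My current attempt uses multi_line; should it instead just be a for loop that calls p.line?
Any guidance or pointers would be much appreciated.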
import json
from datetime import datetime
from pandas.io.json import json_normalize
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource

output_file('wyndham.html')

with open('wyndham_data.txt', 'r') as f:
    a = json.load(f)

res = json_normalize(a['features'])
gby = res.groupby('properties.system_name')

for key, item in gby:
    g = item.sort_values(by='properties.date_stamp')  # <<< works to here
    source = ColumnDataSource(dict(x = g[['properties.date_stamp']],
                                   y = g[['properties.energy_prod(KWh)']]))

p = figure()
p.multi_line(x, y, source=source)
show(p)
Example JSON:
{
  "type" : "FeatureCollection",
  "name" : "wyndham-solar-energy-production.json",
  "features" : [
    {
      "type" : "Feature",
      "geometry" : null,
      "properties" : {
        "system_id" : "9386741",
        "system_name" : "Yerambooee Community Centre ",
        "date_stamp" : "2018-08-01",
        "energy_prod(KWh)" : 51.5,
        "energy_life(MWh)" : null,
        "C02 (Kg)" : 47.41,
        "KWp" : 18.2,
        "performance" : 2.8,
        "lat" : -37.8587717,
        "lon" : 144.7100923,
        "date_installed" : "2017-07-27"
      }
    }, ...
You can do something like this to draw a line:
import json
import pandas as pd
import matplotlib.pyplot as plt

with open('1.json', 'r') as f:
    data = json.load(f)

df = pd.json_normalize(data['features'])
# Use the parsed date stamp as the index so it becomes the x-axis
df.index = pd.to_datetime(df['properties.date_stamp'])
print(df)

# DataFrame.plot() creates its own figure and draws every numeric column
df.plot()
plt.show()
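To get one line per site out of the same pandas/matplotlib approach, one option is to pivot the frame so that each site becomes its own column. This is only a sketch: the small DataFrame below is made-up data shaped like the normalized JSON, and `pivot` will raise if a site has two rows for the same date.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical rows shaped like the output of pd.json_normalize(data['features'])
df = pd.DataFrame({
    "properties.system_name": ["Site A", "Site B", "Site A", "Site B"],
    "properties.date_stamp": ["2018-08-01", "2018-08-01", "2018-08-02", "2018-08-02"],
    "properties.energy_prod(KWh)": [51.5, 30.2, 48.0, 28.9],
})
df["properties.date_stamp"] = pd.to_datetime(df["properties.date_stamp"])

# Reshape to one row per date and one column per site
wide = df.pivot(
    index="properties.date_stamp",
    columns="properties.system_name",
    values="properties.energy_prod(KWh)",
)

# DataFrame.plot() draws one line per column, so one line per site
ax = wide.plot()
ax.set_ylabel("energy_prod (KWh)")
```

Call `plt.show()` afterwards to display the chart when running this as a script.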
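After a lot of trial and error I found a way to make this work (the code and the output are not pretty, but they served the purpose of the exercise).
The number of lines exceeded the number of colors in Bokeh's default palettes. The linear_palette function from bokeh.palettes let me give each of the 30 lines its own color shade.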
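As a small illustration of what linear_palette does: it picks n evenly spaced colors out of a larger palette. The value 30 below is just the number of lines mentioned above; any n up to the palette length should work.

```python
from bokeh.palettes import linear_palette, Viridis256

# Sample 30 evenly spaced colors from the 256-color Viridis palette
colors = linear_palette(Viridis256, 30)

# The sample spans the full palette, from its first color to its last
print(len(colors), colors[0], colors[-1])
```

Asking for more colors than the source palette contains raises a ValueError, so a 256-color palette leaves plenty of headroom for 30 sites.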
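In the question I posted, I was using a local copy of the downloaded JSON saved to a text file. I have added import requests and the target URL in case you want to run it yourself. Note that it takes about 15 seconds to run on my machine. Image link to the screenshot: Wyndham Wind Farm Scheme Daily Ouput Plot.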
I also get a SettingWithCopyWarning:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  site_data['properties.date_stamp'] = pd.to_datetime(site_data['properties.date_stamp'])
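A common way to avoid this warning is to take an explicit copy of the column subset before assigning into it. The sketch below uses a tiny made-up frame (`grp` stands in for one site's group) rather than the real data.

```python
import pandas as pd

# Hypothetical stand-in for one site's rows from the groupby
grp = pd.DataFrame({
    "properties.date_stamp": ["2018-08-02", "2018-08-01"],
    "properties.energy_prod(KWh)": [48.0, 51.5],
    "properties.KWp": [18.2, 18.2],
})

# .copy() makes site_data an independent frame, so the assignment below
# is unambiguous and does not touch grp
site_data = grp[["properties.date_stamp", "properties.energy_prod(KWh)"]].copy()
site_data["properties.date_stamp"] = pd.to_datetime(site_data["properties.date_stamp"])
site_data = site_data.sort_values("properties.date_stamp")
```

Converting the date column once on the full frame, before the groupby, would be another option and avoids repeating the conversion per site.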
import requests
import pandas as pd
import json
from datetime import datetime
from pandas import json_normalize
from bokeh.plotting import figure, output_file, show
from bokeh.models import ColumnDataSource
from bokeh.palettes import linear_palette, Viridis256

URL = "https://data.gov.au/data/dataset/aa75879c-1d3e-4ad2-b331-826032c6b84b/resource/6e309687-023b-436b-9079-582b7e2fb074/download/wyndham-solar-energy-production.json"
r = requests.get(URL)
a = json.loads(r.text)
res = json_normalize(a['features'])
gby = res.groupby('properties.system_name')
sites = res['properties.system_name'].unique()
num_sites = len(sites)

output_file('wyndham.html')
# One shade per site, sampled evenly from Viridis256
plot_colors = linear_palette(Viridis256, num_sites)
p = figure(width=1800, height=900, x_axis_type="datetime",
           title = "Wyndham Wind Farm Scheme Daily Power Output")
p.yaxis.axis_label = "Daily Power Output (kW.h)"

count = 0
for key, grp in gby:
    line_col = plot_colors[count]
    g = grp.sort_values(by='properties.date_stamp')
    site_data = g[['properties.date_stamp','properties.energy_prod(KWh)']]
    # This assignment into a column subset is what raises the SettingWithCopyWarning
    site_data['properties.date_stamp'] = pd.to_datetime(site_data['properties.date_stamp'])
    site_cds = ColumnDataSource(site_data)
    # One p.line call per site, each with its own color and legend entry
    p.line(x=site_data['properties.date_stamp'], y=site_data['properties.energy_prod(KWh)'],
           legend_label=key, line_width = 2, line_color = line_col)
    count += 1

show(p)
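For completeness, the multi_line approach from the question can also be made to work. multi_line expects a list of x-sequences and a list of y-sequences, one entry per line, so the groups have to be collected into lists first. This is a sketch on made-up data (the site names and values are placeholders), not a drop-in replacement for the code above.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.palettes import linear_palette, Viridis256
from bokeh.plotting import figure

# Hypothetical sample shaped like the normalized JSON
res = pd.DataFrame({
    "properties.system_name": ["Site A", "Site B", "Site A", "Site B"],
    "properties.date_stamp": ["2018-08-02", "2018-08-01", "2018-08-01", "2018-08-02"],
    "properties.energy_prod(KWh)": [48.0, 30.2, 51.5, 28.9],
})
res["properties.date_stamp"] = pd.to_datetime(res["properties.date_stamp"])
res = res.sort_values("properties.date_stamp")

# Collect one list of dates and one list of values per site
names, xs, ys = [], [], []
for name, grp in res.groupby("properties.system_name"):
    names.append(name)
    xs.append(grp["properties.date_stamp"].tolist())
    ys.append(grp["properties.energy_prod(KWh)"].tolist())

# Each row of the data source describes one whole line
source = ColumnDataSource(dict(
    xs=xs,
    ys=ys,
    name=names,
    color=list(linear_palette(Viridis256, len(names))),
))

p = figure(x_axis_type="datetime", title="Daily energy production by site")
p.multi_line(xs="xs", ys="ys", line_color="color", line_width=2, source=source)
```

Pass `p` to `show()` to render it. This draws every site with a single glyph; the p.line loop above creates one glyph per site instead, which is what lets each call carry its own legend_label.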