How to create dynamic stacked bar charts in Bokeh with variable Excel inputs and outputs

Question

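I am currently working on a data visualization project and need to create stacked bar charts with Bokeh. The data comes from an Excel file that is updated regularly and contains multiple inputs and outputs. The structure of the data can vary, meaning the number of inputs and outputs may change. My goal is for the Bokeh plots to adapt to these changes automatically, without manual code adjustments. The stacked bar charts should adjust dynamically to the number of inputs and outputs in the Excel file. They should be able to visualize the various combinations of fixed inputs (e.g. Input 1 fixed, Input 2 fixed, Input 3 variable) while showing all corresponding outputs. Ideally, the solution would read the Excel file automatically, detect the structure of the inputs and outputs, and update the plots accordingly.

I tried to visualize stacked bar charts for one scenario (2 inputs and 3 outputs). In the example Excel file, the data is stored in the following scheme:

Example scheme: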

Input_1	Input_2	Output_1	Output_2	Output_3
1	1	100	200	200
2	1	150	150	200
3	1	200	100	200
1	2	200	200	100
2	2	150	200	150
3	2	100	200	200
1	3	200	100	200
2	3	200	150	150
3	3	200	200	100

It works for this static scenario:

import pandas as pd
from bokeh.plotting import figure, output_file, show
from bokeh.layouts import gridplot, row
from bokeh.models import ColumnDataSource


data_frame = pd.read_excel("example.xlsx")

data_frame['Input_1'] = data_frame['Input_1'].astype(str)
data_frame['Input_2'] = data_frame['Input_2'].astype(str)

output_file("stacked_bar_charts.html")

unique_input_1 = data_frame['Input_1'].unique()
unique_input_2 = data_frame['Input_2'].unique()

plots_for_input_1 = []
plots_for_input_2 = []


for value in unique_input_1:

    filtered_data = data_frame[data_frame["Input_1"] == value]
    filtered_data = filtered_data.sort_values(by='Input_2')

    source = ColumnDataSource(filtered_data)

    plot = figure(title=f"Input_1 = {value} fixed",
                  x_range=filtered_data["Input_2"].unique(),
                  height=300,
                  width=500
                  )

    plot.xaxis.axis_label = "Input_2"

    plot.vbar_stack(stackers=["Output_1", "Output_2", "Output_3"],
                    x="Input_2",
                    width=0.9,
                    color=["orange", "gray", "brown"],
                    source=source,
                    legend_label=["Output 1", "Output 2", "Output 3"]
                    )

    plots_for_input_1.append(plot)


for value in unique_input_2:

    filtered_data = data_frame[data_frame['Input_2'] == value]
    filtered_data = filtered_data.sort_values(by='Input_1')

    source = ColumnDataSource(filtered_data)

    plot = figure(title=f"Input_2 = {value} fixed", 
                  x_range=filtered_data["Input_1"].unique(), 
                  height=300,
                  width=500
                  )

    plot.xaxis.axis_label = "Input_1"

    plot.vbar_stack(stackers=["Output_1", "Output_2", "Output_3"],
                    x="Input_1",
                    width=0.9,
                    color=["orange", "gray", "brown"],
                    source=source,
                    legend_label=["Output 1", "Output 2", "Output 3"]
                    )

    plots_for_input_2.append(plot)


grid_for_input_1 = gridplot(plots_for_input_1, ncols=1)  
grid_for_input_2 = gridplot(plots_for_input_2, ncols=1)

final_layout = row(grid_for_input_1, grid_for_input_2)


show(final_layout)

Here are example output images to show how I visualize the data:

Bokeh example plot 1: [image]

Bokeh example plot 2: [image]

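I could not find a dynamic approach that works for varying inputs and outputs, e.g. varying two inputs while keeping the other inputs fixed, and visualizing the effect on all outputs with stacked bar charts. The approach should also adapt to data updates efficiently, without manual code adjustments for each new scenario.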

Answer 1

You can use groupby in a nested loop to plot all possible I/O combinations:

output_file("stacked_bar_charts.html")

grids = []
for inp_col in inp_cols:
    inplots = []
    for name, sub_df in data_frame.groupby(inp_col):
        for inp_diff in inp_cols.difference([inp_col], sort=False):
            plot = figure(
                title=TITLE(inp_col, name),
                x_range=data_frame[inp_diff].unique(),
                height=H, width=W,
            )

            plot.xaxis.axis_label = inp_diff

            _ = plot.vbar_stack(
                stackers=out_cols, x=inp_diff,
                width=BAR_WIDTH, color=COLORS,
                source=ColumnDataSource(sub_df.sort_values(inp_diff)),
                legend_label=out_cols.tolist(),
            )

            inplots.append(plot)
    grids.append(gridplot(inplots, ncols=NCOLS))

final_layout = row(*grids)

show(final_layout)
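
One caveat, as far as I can tell: with more than two input columns, each sub_df in the loop above still contains several rows per x category (one per value of the remaining inputs), so bars would be drawn on top of each other. A minimal sketch of one way around this is to enumerate every pair of "one varying input" and "one fixed value for each other input". The helper name fixed_combinations and the three-input levels dict are hypothetical and only illustrate the enumeration; they are not part of the original data.

```python
from itertools import product


def fixed_combinations(unique_values):
    """Yield (varying_column, fixed_assignment) pairs.

    unique_values maps each input column name to its list of unique values,
    e.g. {col: sorted(data_frame[col].unique()) for col in inp_cols}.
    """
    columns = list(unique_values)
    for varying in columns:
        # Every other input column is held at one fixed value per plot.
        fixed_cols = [c for c in columns if c != varying]
        for combo in product(*(unique_values[c] for c in fixed_cols)):
            yield varying, dict(zip(fixed_cols, combo))


# Hypothetical three-input example (values as strings, as in the config).
levels = {
    "Input_1": ["1", "2"],
    "Input_2": ["1", "2"],
    "Input_3": ["1", "2", "3"],
}

for varying, fixed in fixed_combinations(levels):
    title = ", ".join(f"{col} = {val}" for col, val in fixed.items()) + " fixed"
    # With pandas, the rows for this plot could be selected like this:
    #   sub_df = data_frame
    #   for col, val in fixed.items():
    #       sub_df = sub_df[sub_df[col] == val]
    #   sub_df = sub_df.sort_values(varying)
    print(f"x = {varying}: {title}")
```

With only two inputs this reduces to the same set of plots as the nested groupby loop, since exactly one input is fixed per plot.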

Output (stacked_bar_charts.html):

Configuration used (it has to run before the plotting loop):

import pandas as pd
from bokeh.layouts import gridplot, row
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, output_file, show

# PD-PREPROCESS
data_frame = pd.read_excel("example.xlsx")

inp_cols = data_frame.filter(like="Input").columns
out_cols = data_frame.filter(like="Output").columns

data_frame = data_frame.astype(dict.fromkeys(inp_cols, str))

# BOKEH-CONFIG
TITLE = "{} = {} fixed".format
COLORS = ["orange", "gray", "brown"] # depends on outputs
BAR_WIDTH = 0.9
H, W = 300, 500
NCOLS = 1
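
The hard-coded COLORS list is the one part of this configuration that does not follow the data: vbar_stack expects the color list to have the same length as the list of stackers, so a fourth output column would break it. A minimal sketch of deriving the list from the number of outputs is shown below. The names BASE_COLORS, pick_colors and out_names are hypothetical; out_names stands in for out_cols.tolist().

```python
from itertools import cycle, islice

# Hypothetical base palette: the three colors used above plus a few extras.
# Any list of CSS color names or hex strings should work here.
BASE_COLORS = ["orange", "gray", "brown", "steelblue", "seagreen", "purple"]


def pick_colors(n_outputs, base=BASE_COLORS):
    """Return exactly n_outputs colors, cycling through base if it is too short."""
    return list(islice(cycle(base), n_outputs))


# Stand-in for out_cols.tolist() from the configuration.
out_names = ["Output_1", "Output_2", "Output_3", "Output_4"]
COLORS = pick_colors(len(out_names))
print(COLORS)
```

In the configuration this would become COLORS = pick_colors(len(out_cols)). Colors repeat once there are more outputs than entries in the base palette, so a larger palette may be preferable for wide files.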
