Horizontal bar subplots for each index group of a DataFrame in one chart

Question

I have a DataFrame with a MultiIndex (sorted by the “5min_intervals” and “price” index levels).

                           quantity
5min_intervals       price 
2023-07-27 17:40:00  172.20     330
                     172.19       1
2023-07-27 17:45:00  172.25       4
                     172.24      59
                     172.23     101
                     172.22     224
                     172.21      64
                     172.20     303
                     172.19     740
                     172.18      26
2023-07-27 17:50:00  172.17      30
                     172.16       2
                     172.15    1014
                     172.14     781
                     172.13    1285

I know how to draw a simple horizontal bar chart with df.plot.barh(). I can also iterate over the DataFrame by the “5min_intervals” index level

for date in df.index.levels[0]:
    print(df.loc[date])

and get a DataFrame for each “5min_intervals” index value, like this:

        quantity
price           
172.20       330
172.19         1

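Is there a way to create a single matplotlib chart in which each “5min_intervals” index value gets its own horizontal bar chart? Roughly like the picture below.

[Image: sketch of the desired chart]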

python pandas matplotlib multi-index
1 Answer

Let's prepare some dummy data to work with:

from pandas import DataFrame, Timestamp

data = {
    'quantity': {
        (Timestamp('2023-07-27 17:40:00'), 172.2): 330,
        (Timestamp('2023-07-27 17:40:00'), 172.19): 1,
        (Timestamp('2023-07-27 17:45:00'), 172.25): 4,
        (Timestamp('2023-07-27 17:45:00'), 172.24): 59,
        (Timestamp('2023-07-27 17:45:00'), 172.23): 101,
        (Timestamp('2023-07-27 17:45:00'), 172.22): 224,
        (Timestamp('2023-07-27 17:45:00'), 172.21): 64,
        (Timestamp('2023-07-27 17:45:00'), 172.2): 303,
        (Timestamp('2023-07-27 17:45:00'), 172.19): 740,
        (Timestamp('2023-07-27 17:45:00'), 172.18): 26,
        (Timestamp('2023-07-27 17:50:00'), 172.17): 30,
        (Timestamp('2023-07-27 17:50:00'), 172.16): 2,
        (Timestamp('2023-07-27 17:50:00'), 172.15): 1014,
        (Timestamp('2023-07-27 17:50:00'), 172.14): 781,
        (Timestamp('2023-07-27 17:50:00'), 172.13): 1285
    }
}

df = DataFrame(data)
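Note that a frame built this way has unnamed index levels. If you want it to match the layout in the question, the levels can be named and sorted explicitly. The following is a minimal sketch on a reduced two-row sample; df_small is a hypothetical name, and the descending price order is an assumption read off the question's printout:

```python
from pandas import DataFrame, Timestamp

# Reduced, hypothetical sample with the same shape as the dummy data
t = Timestamp('2023-07-27 17:40:00')
df_small = DataFrame({'quantity': {(t, 172.20): 330, (t, 172.19): 1}})

# Name the two index levels as in the question
df_small.index.names = ['5min_intervals', 'price']

# Time ascending, price descending (assumed from the question's printout)
df_small = df_small.sort_index(
    level=['5min_intervals', 'price'], ascending=[True, False]
)
print(df_small)
```

The rest of the answer only relies on level positions (0 and 1), so this step is optional.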

As for the figure, there is hardly anything like it available out of the box, so we need to build it manually. First, some basic preparation:

unique_time = df.index.get_level_values(0).unique().sort_values()
unique_price = df.index.get_level_values(1).unique().sort_values()

spacing = 0.2   # a minimum distance between two consecutive horizontal lines
values = (1-spacing) * df/df.max()   # relative lengths of horizontal lines
base = DataFrame(index=unique_price)   # the widest blank frame with prices
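To see what this preparation produces, here is a small sketch on a reduced three-row sample (df_small, t1 and t2 are hypothetical names used only for illustration). It shows that the largest quantity is scaled to 1 - spacing, and that joining one interval onto the full price list leaves NaN for prices that do not occur in that interval:

```python
from pandas import DataFrame, Timestamp

# Reduced, hypothetical sample: two intervals, three rows
t1 = Timestamp('2023-07-27 17:40:00')
t2 = Timestamp('2023-07-27 17:45:00')
df_small = DataFrame({
    'quantity': {(t1, 172.20): 330, (t1, 172.19): 1, (t2, 172.20): 303}
})

spacing = 0.2
# Scale so that the largest quantity maps to 1 - spacing
values = (1 - spacing) * df_small / df_small.max()

unique_price = df_small.index.get_level_values(1).unique().sort_values()
base = DataFrame(index=unique_price)  # blank frame listing every price

# Left join: prices absent from interval t2 become NaN
pr = base.join(values.loc[t2]).squeeze()
print(values)
print(pr)
```

Because every scaled width is below 1, bars that start at integer positions 0, 1, 2, ... cannot run into the next group.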

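To build the figure we can use barh(y, width, height, left) (horizontal bars), passing the prices as the first argument, the position of the time point as the left offset, the scaled values as the width, and a small fixed height. To keep very small values visible, we can additionally mark the beginning (left end) of each line with a small tick, again drawn with barh.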
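The role of the left offset can be sketched in isolation. In this minimal example the price labels and widths are made up for illustration; the point is only that two groups of bars share the same y positions while starting at different x positions:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

prices = ['172.19', '172.20']  # illustrative labels, not real data

# Group 0 starts at x=0, group 1 at x=1; widths stay below 1
bars0 = ax.barh(prices, [0.8, 0.1], height=0.1, left=0)
bars1 = ax.barh(prices, [0.4, 0.6], height=0.1, left=1)

# Each bar begins at its group's offset and extends by its width
print(bars1[0].get_x(), bars1[0].get_width())
```

Call plt.show() or fig.savefig(...) afterwards to view the result.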

import matplotlib.pyplot as plt
from matplotlib.colors import TABLEAU_COLORS
from itertools import cycle

fig, ax = plt.subplots(figsize=(7,7))

xlabels = unique_time.astype(str)
ylabels = unique_price.astype(str)

ax.set_xticks(range(len(xlabels)), xlabels, rotation=45)
ax.set_yticks(range(len(ylabels)), ylabels)
ax.set_xlim([-0.5,len(xlabels)])
ax.set_ylim([-1, len(ylabels)])

for i, (t, c) in enumerate(zip(unique_time, cycle(TABLEAU_COLORS))):
    pr = base.join(values.loc[t]).squeeze()
    # draw horizontal lines, shifted left by i-th timepoint
    ax.barh(ylabels, pr, 0.1, i, color=c)
    # put ticks at the left end of lines, i.e. their beginning
    ax.barh(ylabels, 0.01*pr.notna(), 0.3, i, color=c)

ax.grid(axis='y', linestyle='--', linewidth=0.5)
fig.tight_layout()
plt.show()

Here is the output:

[Image: resulting chart]
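If separate panels are acceptable, a different layout (not the single-axes approach above) is one subplot per interval with a shared price axis. This is only a sketch on a reduced, hypothetical sample; the names times, prices and q are illustrative, and it assumes a matplotlib version in which set_yticks accepts labels, as the code above already does:

```python
import matplotlib.pyplot as plt
from pandas import DataFrame, Timestamp

# Reduced, hypothetical sample in the same shape as df
t1 = Timestamp('2023-07-27 17:40:00')
t2 = Timestamp('2023-07-27 17:45:00')
df = DataFrame({
    'quantity': {(t1, 172.20): 330, (t1, 172.19): 1,
                 (t2, 172.20): 303, (t2, 172.18): 26}
})

times = df.index.get_level_values(0).unique().sort_values()
prices = df.index.get_level_values(1).unique().sort_values()

# One panel per interval, all sharing the same y (price) axis
fig, axes = plt.subplots(1, len(times), sharey=True, squeeze=False,
                         figsize=(7, 4))
for ax, t in zip(axes[0], times):
    # Align this interval to the full price list; absent prices become 0
    q = df.loc[t]['quantity'].reindex(prices).fillna(0)
    ax.barh(range(len(prices)), q, height=0.6)
    ax.set_title(t.strftime('%H:%M'))

# Shared axis: setting the ticks once applies to every panel
axes[0, 0].set_yticks(range(len(prices)), prices.astype(str))
fig.tight_layout()
```

Here each panel keeps its own x scale, so quantities are readable in absolute terms, at the cost of losing the single continuous time axis.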
