How to show individual and combined categories in the same boxplot

Question   Votes: 0   Answers: 3

I want a single plot that shows one additional boxplot below the boxplots of 9 categories at 2 time points.

What I have done so far:

Create a pandas DataFrame.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = {'Category': ['waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren',
                     'waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren'],
        'T1': ['1', '6', '5', '8', '4', '7', '5', '7', '1', '7', '3', '2', '1', '4', '7', '5', '7', '1'],
        'T2': ['3', '7', '7', '9', '8', '10', '8', '9', '3', '10', '9', '5', '3', '8', '9', '6', '7', '5']}

df = pd.DataFrame(data)

Create the boxplot and order the categories the way I want.

sns.boxplot(y='Category', x='value', hue='variable', 
            data=df.melt(id_vars='Category', var_name='variable', value_name='value'),
           palette='Blues',
           order=['waschen', 'anziehen', 'zahnpflege', 'putzen', 'schlafen', 'essen', 'lesen', 'hobby', 'spazieren'])
plt.show()

Plot a boxplot of the 2 time points across all categories.

sns.boxplot(x= 'value', y='variable',
            data=df.melt(var_name='variable', value_name='value'),
            palette='Reds')

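Now I want to put these two plots into a single plot, because I want to show the overall values in relation to the individual category values (same x-axis). Is this possible with seaborn, or should I use matplotlib subplots?

My idea was to create a figure with 2 axes, and I tried this: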

fig, axes = plt.subplots(2, 1)
sns.boxplot((data=df_gesamt, orient='h'), ax=axes[0,0])

But it raises an error:

Input In [13]
    sns.boxplot((data=df_gesamt, orient='h'), ax=axes[0,0])
                     ^
SyntaxError: invalid syntax
python pandas matplotlib seaborn boxplot
3 Answers
1
vote

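After converting the columns to numeric, you can combine the two plots with the following trick:

  • add an extra label to order as the slot for the overall boxplot
  • use a dummy y that repeats that label as many times as there are rows in the long dataframe

Optionally, you can combine the legends.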

import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerTuple
import seaborn as sns
import pandas as pd

data = {'Category': ['waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren',
                     'waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren'],
        'T1': ['1', '6', '5', '8', '4', '7', '5', '7', '1', '7', '3', '2', '1', '4', '7', '5', '7', '1'],
        'T2': ['3', '7', '7', '9', '8', '10', '8', '9', '3', '10', '9', '5', '3', '8', '9', '6', '7', '5']}
df = pd.DataFrame(data)
df['T1'] = df['T1'].astype(float)
df['T2'] = df['T2'].astype(float)

together_val = '--zusammen--'
order = ['waschen', 'anziehen', 'zahnpflege', 'putzen', 'schlafen', 'essen', 'lesen', 'hobby',
         'spazieren'] + [together_val]

df_long = df.melt(id_vars='Category', var_name='variable', value_name='value')
ax = sns.boxplot(y='Category', x='value', hue='variable',
                 data=df_long,
                 palette='Blues',
                 order=order)
sns.boxplot(y=[together_val] * len(df_long), x='value', hue='variable',
            data=df_long,
            palette='Reds',
            order=order,
            ax=ax)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=[tuple(handles[i::2]) for i in range(len(handles) // 2)], labels=labels[:2],
          loc='upper left', bbox_to_anchor=(1.01, 1.01), handler_map={tuple: HandlerTuple(ndivide=2)})
ax.set_xlabel('')  # optionally remove the x label
ax.set_ylabel('')
sns.despine()
plt.tight_layout()
plt.show()
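
For comparison, the two-subplot idea from the question can also be made to work. The following is only a sketch and not part of the answer above. The SyntaxError in the question comes from the extra parentheses wrapped around the keyword arguments; in addition, plt.subplots(2, 1) returns a one-dimensional array of axes, so it is indexed as axes[0] rather than axes[0, 0]. The names ax_cat and ax_all, the 9:1 height ratio, and the single 'salmon' color are illustrative choices, not taken from the original post.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

data = {'Category': ['waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren',
                     'waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren'],
        'T1': ['1', '6', '5', '8', '4', '7', '5', '7', '1', '7', '3', '2', '1', '4', '7', '5', '7', '1'],
        'T2': ['3', '7', '7', '9', '8', '10', '8', '9', '3', '10', '9', '5', '3', '8', '9', '6', '7', '5']}
# Convert the string values to numbers before plotting
df = pd.DataFrame(data).astype({'T1': float, 'T2': float})
df_long = df.melt(id_vars='Category', var_name='variable', value_name='value')

order = ['waschen', 'anziehen', 'zahnpflege', 'putzen', 'schlafen', 'essen', 'lesen', 'hobby', 'spazieren']

# Two stacked axes that share the x-axis; the top one gets most of the height
fig, (ax_cat, ax_all) = plt.subplots(2, 1, sharex=True, figsize=(8, 8),
                                     gridspec_kw={'height_ratios': [9, 1]})

# Top: one pair of boxes (T1, T2) per category
sns.boxplot(data=df_long, y='Category', x='value', hue='variable',
            palette='Blues', order=order, ax=ax_cat)
ax_cat.set_xlabel('')

# Bottom: T1 and T2 pooled over all categories
sns.boxplot(data=df_long, y='variable', x='value', color='salmon', ax=ax_all)
ax_all.set_ylabel('')

fig.tight_layout()
```

Calling plt.show() afterwards displays the figure. Compared with the single-axes trick, this keeps the overall boxes visually separate from the categories, at the cost of managing two axes.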


0
votes

Changing the column types worked for me:

df.T1 = df.T1.astype(int)
df.T2 = df.T2.astype(int)
sns.boxplot(y='Category', x='value', hue='variable', 
            data=df.melt(id_vars='Category', var_name='variable', value_name='value'),
           palette='Blues',
           order=['waschen', 'anziehen', 'zahnpflege', 'putzen', 'schlafen', 'essen', 'lesen', 'hobby', 'spazieren'])
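
A small check may help show why the type change matters. The sketch below uses the T2 values from the question, which are stored as strings, and uses pd.to_numeric as one possible alternative to astype(int). The variable names are illustrative only.

```python
import pandas as pd

# T2 values exactly as given in the question: numbers stored as strings
t2 = pd.Series(['3', '7', '7', '9', '8', '10', '8', '9', '3',
                '10', '9', '5', '3', '8', '9', '6', '7', '5'], name='T2')

# Strings compare character by character, so '9' sorts above '10'
string_max = max(t2.tolist())

# pd.to_numeric parses the strings into integers
t2_numeric = pd.to_numeric(t2)
numeric_max = t2_numeric.max()

print(string_max, numeric_max)
print(pd.api.types.is_numeric_dtype(t2), pd.api.types.is_numeric_dtype(t2_numeric))
```

Until the column is numeric, quantities such as the median or the whisker positions of a boxplot are not meaningful for these values.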

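0
votes
  • As shown in another answer, use pandas.DataFrame.assign to create a combined DataFrame in which the 'Category' and 'Observation' column values are uniquely named.
  • Join combined to the long form of the original data with pandas.concat, and plot with seaborn.
  • Tested in python 3.11, pandas 1.5.3, matplotlib 3.7.0, seaborn 0.12.2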

Manage the data

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# beginning with the sample data in the OP
data = {'Category': ['waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren', 'waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren'],
        'T1': ['1', '6', '5', '8', '4', '7', '5', '7', '1', '7', '3', '2', '1', '4', '7', '5', '7', '1'],
        'T2': ['3', '7', '7', '9', '8', '10', '8', '9', '3', '10', '9', '5', '3', '8', '9', '6', '7', '5']}
df = pd.DataFrame(data)

# convert the dataframe to a long form
df = df.melt(id_vars='Category', var_name='Observation', value_name='Value')

# convert the Value column to int or float as needed
df.Value = df.Value.astype(int)

# create the Category plot order
new_cat = 'Combined'
order = ['waschen', 'anziehen', 'zahnpflege', 'putzen', 'schlafen', 'essen', 'lesen', 'hobby', 'spazieren'] + [new_cat]

# create a new dataframe with Category as Combined, and the Observation value is updated "if desired"
combined = df.assign(Category=new_cat, Observation=lambda x: x['Observation'] + ' - ' + x['Category'])
# combined = df.assign(Category=new_cat)  # without updating Observation

# concat the two dataframes
combined = pd.concat([df, combined], ignore_index=True)
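
One detail worth checking is the order in which assign evaluates its keyword arguments: Category is overwritten first, so the lambda for Observation already sees 'Combined' and produces labels such as 'T1 - Combined'. The sketch below rebuilds the same frame and counts the rows per group; the names extra and counts are illustrative and not from the answer.

```python
import pandas as pd

data = {'Category': ['waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren',
                     'waschen', 'anziehen', 'lesen', 'putzen', 'zahnpflege', 'essen', 'hobby', 'schlafen', 'spazieren'],
        'T1': ['1', '6', '5', '8', '4', '7', '5', '7', '1', '7', '3', '2', '1', '4', '7', '5', '7', '1'],
        'T2': ['3', '7', '7', '9', '8', '10', '8', '9', '3', '10', '9', '5', '3', '8', '9', '6', '7', '5']}
df = pd.DataFrame(data).melt(id_vars='Category', var_name='Observation', value_name='Value')
df['Value'] = df['Value'].astype(int)

# Category is assigned first, so the lambda reads the new 'Combined' value
extra = df.assign(Category='Combined',
                  Observation=lambda x: x['Observation'] + ' - ' + x['Category'])
combined = pd.concat([df, extra], ignore_index=True)

# Number of rows behind each box in the plot
counts = combined.groupby(['Category', 'Observation']).size()
print(counts.loc['Combined'])
print(counts.loc['waschen'])
```

Each original category contributes 2 values per time point, while each 'Combined' box pools all 18 values of its time point.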

Plot with a seaborn axes-level function

fig, ax = plt.subplots(figsize=(12, 8))
sns.boxplot(data=combined, y='Category', x='Value', hue='Observation', order=order, ax=ax)
sns.move_legend(ax, bbox_to_anchor=(1, 1), loc='upper left', frameon=False)

fig, ax = plt.subplots(figsize=(12, 8))
sns.boxplot(data=combined, y='Category', x='Value', hue='Observation', order=order, ax=ax)
# set the 'Combined' ytick label to red
ax.get_ymajorticklabels()[-1].set_color('red')
sns.move_legend(ax, bbox_to_anchor=(1, 1), loc='upper left', frameon=False)

Plot with a seaborn figure-level function

g = sns.catplot(data=combined, kind='box', y='Category', x='Value', hue='Observation', order=order, height=7, aspect=1.5)

g = sns.catplot(data=combined, kind='box', y='Category', x='Value', hue='Observation', order=order, height=7, aspect=1.5)
# get the axes from the FacetGrid
ax1 = g.axes.flat[0]
# set the 'Combined' ytick label to red
ax1.get_ymajorticklabels()[-1].set_color('red')
