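I created a Pandas DataFrame and plotted it using .hist(). I would like to be able to draw lines or curves on the same figure. How can I do this?
I was able to plot the data as a histogram using df.hist(column='Example', bins=15). This returns an axis object.
I thought I could pass ax=axis as an argument to draw a line, but that does not work.
It seems that plt.plot takes different kwargs from DataFrame.hist. Using .hist() together with the argument ax=axis does make it possible to draw multiple sets of data from a DataFrame on the same figure (as histograms).
Here is some example code, taken from a Jupyter Notebook, along with some data to work with.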
data = [211995, 139950, 202995, 223000, 184995, 82000, 127000, 240000, 116000, 74500, 151000, 149000, 290000, 146000, 174500, 418000, 150000, 150000, 260000, 100000, 282500, 510000, 142000, 382000, 220000, 259000, 330000, 177500, 290000, 280000, 118000, 97000, 124000, 385000, 199950, 90000, 135000, 395000, 182000, 105000, 80000, 230000, 227950, 176995, 110000, 142000, 132500, 100000, 95000, 257500, 186000, 230000, 169995, 167995, 119950, 119950, 361000, 125000, 242000, 240000, 205000, 187500, 180000, 146000, 257995, 380000, 144995, 139995, 159995, 265000, 288000, 288000, 162500, 290000, 182737, 235000, 250000, 175000, 153000, 125000, 170000, 165000, 187995, 250000, 220000, 108750, 125000, 245000, 100000, 130000, 115000, 218000, 190000, 435000, 300000, 465000, 179950, 259500, 187000, 200000]
import pandas as pd
import matplotlib.pyplot as plt
import numpy
df = pd.DataFrame(data)  # `data` is the list defined above
df.columns = ['Example']
axis = df.hist(column='Example', bins=15)
x = numpy.linspace(1e5, 5e5, 20)
def f(x):
    return x * numpy.exp(-x)
y = f(x)
plt.plot(x, y, axis)  # the attempt that does not work
That is all it takes. Evidently the axis kwarg is not needed:
plt.plot(x, y)
plt.show()
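Complete example:
df = pd.DataFrame(data)  # `data` is the list from the question
df.columns = ['Example']
axis = df.hist(column='Example', bins=15)
x = numpy.linspace(1e5, 5e5, 20)
def f(x):
    # rescale x so that the exponential does not underflow to zero
    return 1.0e-5 * x * numpy.exp(-1.0e-5 * x)
y = f(x)
plt.plot(x, y)  # draws on the current (most recently created) axes
plt.show()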
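The same plot can also be drawn without relying on pyplot's notion of a current axes. The following is a minimal sketch, assuming that df.hist returns a 2-D array of Axes even for a single column (as the output further down shows); it uses only the first 20 prices from the question and selects the Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd

# First 20 prices from the question's data, to keep the sketch short
prices = [211995, 139950, 202995, 223000, 184995, 82000, 127000, 240000,
          116000, 74500, 151000, 149000, 290000, 146000, 174500, 418000,
          150000, 150000, 260000, 100000]
df = pd.DataFrame({"Example": prices})

# df.hist returns an array of Axes; index it to get the one that was drawn
axes = df.hist(column="Example", bins=15)
ax = axes[0, 0]

x = np.linspace(1e5, 5e5, 20)
y = 1.0e-5 * x * np.exp(-1.0e-5 * x)

# Draw the curve on that specific Axes instead of calling plt.plot
(line,) = ax.plot(x, y, color="r")
```

Note that this curve never exceeds about 0.37, so next to bin counts it sits very close to the x-axis; multiplying y by a scale factor of your choosing makes it easier to see.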
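There are three ways to draw a histogram from a DataFrame, and they differ in what they return:
pandas.DataFrame.hist returns a matplotlib.AxesSubplot or a numpy.ndarray of them.
pandas.DataFrame.plot.hist returns a matplotlib.AxesSubplot (or an array of them when by= is used).
pandas.DataFrame.plot with kind='hist' returns a matplotlib.axes.Axes or a numpy.ndarray of them.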
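plt.plot(x, y) only draws on the last axes. It is better to stay with the explicit axes interface than to switch to the implicit pyplot interface, as explained in "Why be explicit?".
import pandas as pd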
import seaborn as sns # for data
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')
  species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
0  Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
1  Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
2  Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
3  Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
4  Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
df.hist
axes = df.hist(['bill_length_mm', 'bill_depth_mm'])
# returns an array of axes
# axes → array([[<Axes: title={'center': 'bill_length_mm'}>, <Axes: title={'center': 'bill_depth_mm'}>]], dtype=object)
# draws on the first axes
axes[0][0].axhline(y=30, c='r')
# draws on the second axes
axes[0][1].axhline(y=30, c='purple')
# only drawn on the last plot
plt.axhline(y=20, c='k')
# with only a single column
axes = df.hist('bill_length_mm')
# still an array, now holding a single axes
# axes → array([[<Axes: title={'center': 'bill_length_mm'}>]], dtype=object)
# index into the array to get the axes
axes[0][0].axhline(y=30, c='r')
# switching to the implicit interface works, but is not the recommended way
plt.axhline(y=20, c='k')
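Another way to avoid indexing the returned array is to create the Axes yourself and hand it to df.hist through its ax parameter. This is a minimal sketch under that assumption; the random numbers are a hypothetical stand-in for the penguins column so that the block runs without downloading anything.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['bill_length_mm']
rng = np.random.default_rng(1)
df = pd.DataFrame({"bill_length_mm": rng.normal(44, 5, size=200)})

# Create the figure and Axes explicitly, then let pandas draw into it
fig, ax = plt.subplots()
df.hist(column="bill_length_mm", ax=ax)

# Every further artist goes on the Axes we already hold a reference to
ax.axhline(y=30, c="r")
ax.axvline(x=df["bill_length_mm"].mean(), c="k", ls="--")
```

Because ax is created before the histogram, there is no need to inspect what df.hist returns, and the same ax can be reused for any number of lines or curves.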
df.plot.hist
ax = df['bill_length_mm'].plot.hist()
# or
ax = df.plot.hist()
# or
ax = df.plot.hist(y='bill_length_mm')
# returns a single axes
# ax → <Axes: ylabel='Frequency'>
# plot on the axes
ax.axhline(y=30, c='k')
axes = df.plot.hist(by='sex')
# returns an array of axes
# axes → array([<Axes: title={'center': 'Female'}, ylabel='Frequency'>, <Axes: title={'center': 'Male'}, ylabel='Frequency'>], dtype=object)
# draws on the first axes
axes[0].axhline(y=30, c='r')
# draws on the second axes
axes[1].axhline(y=30, c='purple')
# only drawn on the last plot
plt.axhline(y=20, c='k')
df.plot(kind='hist')
ax = df['bill_length_mm'].plot(kind='hist')
# or
ax = df.plot(kind='hist')
# or
ax = df.plot(kind='hist', y='bill_length_mm')
# returns a single axes
# ax → <Axes: ylabel='Frequency'>
# plot on the axes
ax.axhline(y=30, c='k')
axes = df.plot(kind='hist', by='sex')
# returns an array of axes
# axes → array([<Axes: title={'center': 'Female'}, ylabel='Frequency'>, <Axes: title={'center': 'Male'}, ylabel='Frequency'>], dtype=object)
# draws on the first axes
axes[0].axhline(y=30, c='r')
# draws on the second axes
axes[1].axhline(y=30, c='purple')
# only drawn on the last plot
plt.axhline(y=20, c='k')
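Since the three methods hand back a single Axes, a 1-D array, or a 2-D array depending on the call, it can help to flatten the result before drawing. The sketch below is one way to do that with numpy.ravel; mark_all is a hypothetical helper name, and the random columns a and b are placeholder data rather than the penguins dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def mark_all(axes, y, **kwargs):
    """Draw a horizontal line on every Axes in `axes` (hypothetical helper).

    `axes` may be a single Axes or an array of Axes of any shape.
    Returns the flattened 1-D array of Axes that were drawn on.
    """
    flat = np.ravel(axes)  # a single Axes becomes a 1-element array
    for ax in flat:
        ax.axhline(y=y, **kwargs)
    return flat

# Placeholder data with two numeric columns
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(size=100), "b": rng.normal(size=100)})

# 2-D array of Axes from DataFrame.hist
grid = mark_all(df.hist(["a", "b"]), y=30, c="r")

# Single Axes from Series.plot.hist, drawn into an Axes we created
fig, ax = plt.subplots()
single = mark_all(df["a"].plot.hist(ax=ax), y=30, c="k")
```

With this, the calling code does not have to remember whether to write axes[0][0], axes[0], or just ax.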