This is a small excerpt of my original large (7000+ rows) DataFrame, indexed by date, with columns for flood size (Size) and precipitation (ppt):
Size ppt
date
2017-09-11 0.0 0.000000
2017-09-12 0.0 0.000000
2017-09-13 0.0 0.000000
2017-09-14 1.0 34.709998
2017-09-15 0.0 0.000000
2017-09-16 0.0 0.000000
2017-09-17 0.0 0.000000
2017-09-18 0.0 0.600000
2017-09-19 3.0 157.439998
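I used the code below to split it into the groups I want to compare: rainfall on days with floods (Size = 1, 2 or 3, ppt >= 0) and rainfall on days without floods (Size = 0, ppt > 0). I then dropped the days on which neither rain nor a flood occurred (Size = 0, ppt = 0).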
#initial separation of data
mask = df1['Size'].eq(0)
dfFl = df1[~mask] #Days with floods
dfnFl = df1[mask] #Days without floods i.e Size=0
# remove days with no rain or flood.
mask = dfnFl['ppt'].eq(0)
dfnil = dfnFl[mask] #Days with no flood or rain
dfnFl = dfnFl[~mask] #Days with rain but no flood
Using this excerpt of my DataFrame, the process returns:
#dfFl (days with flood):
Size ppt
date
2017-09-14 1.0 34.709998
2017-09-19 3.0 157.439998
#dfnFl (days with rainfall but no flood):
Size ppt
date
2017-09-18 0.0 0.600000
#dfnil (days with no rain nor flood):
Size ppt
date
2017-09-11 0.0 0.000000
2017-09-12 0.0 0.000000
2017-09-13 0.0 0.000000
2017-09-15 0.0 0.000000
2017-09-16 0.0 0.000000
2017-09-17 0.0 0.000000
I want to compare these groups (dfFl and dfnFl) by viewing them in a simple box plot:
fig, axs = plt.subplots(2, 3)
axs[0, 0].boxplot(dfFl['ppt'], dfnFl['ppt'])
plt.show()
However, when I try this, I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-39-2b9c443a4940> in <module>()
2
3 fig, axs = plt.subplots(2, 3)
----> 4 axs[0, 0].boxplot(dfFl['ppt'], dfnFl['ppt'])
5 plt.show()
~/anaconda3/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax,
*args, **kwargs)
1708 warnings.warn(msg % (label_namer, func.__name__),
1709 RuntimeWarning, stacklevel=2)
-> 1710 return func(ax, *args, **kwargs)
1711 pre_doc = inner.__doc__
1712 if pre_doc is None:
~/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py in
boxplot(self, x, notch, sym, vert, whis, positions, widths, patch_artist,
bootstrap, usermedians, conf_intervals, meanline, showmeans, showcaps, showbox,
showfliers, boxprops, labels, flierprops, medianprops, meanprops, capprops,
whiskerprops, manage_xticks, autorange, zorder)
3443 meanline=meanline, showfliers=showfliers,
3444 capprops=capprops, whiskerprops=whiskerprops,
-> 3445 manage_xticks=manage_xticks, zorder=zorder)
3446 return artists
3447
~/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py in bxp(self,
bxpstats, positions, widths, vert, patch_artist, shownotches, showmeans,
showcaps, showbox, showfliers, boxprops, whiskerprops, flierprops, medianprops,
capprops, meanprops, meanline, manage_xticks, zorder)
3773
3774 # notched boxes
-> 3775 if shownotches:
3776 box_x = [box_left, box_right, box_right, cap_right,
box_right,
3777 box_right, box_left, box_left, cap_left,
box_left,
~/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py in
__nonzero__(self)
953 raise ValueError("The truth value of a {0} is ambiguous. "
954 "Use a.empty, a.bool(), a.item(), a.any() or
a.all()."
--> 955 .format(self.__class__.__name__))
956
957 __bool__ = __nonzero__
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(),
a.item(), a.any() or a.all().
I really don't understand what is going wrong here, because the filtered DataFrames look normal when I inspect them (as shown above). Any ideas?
Thanks
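boxplot expects a single data argument x (axs[0, 0].boxplot(x)). However, you supplied two arguments. This fails because the second positional argument is interpreted as notch, which controls whether notches are drawn and should therefore be True or False. Passing a pandas object there forces matplotlib to evaluate it as a boolean, which raises the "truth value is ambiguous" error.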
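The failure can be reproduced without matplotlib. The following is a minimal sketch, assuming a small Series built from the ppt values in the question; the helper name truth_value_is_ambiguous is hypothetical and only serves the illustration. It shows that using a multi-element pandas object where a plain True/False is expected raises ValueError, which is what happens when the second Series lands in the notch slot.

```python
import pandas as pd

# Sample precipitation values taken from the question's excerpt.
ppt = pd.Series([0.6, 34.709998, 157.439998], name="ppt")


def truth_value_is_ambiguous(obj):
    """Return True if evaluating `obj` in a boolean context raises ValueError.

    Hypothetical helper for illustration only.
    """
    try:
        bool(obj)  # this is what `if shownotches:` effectively does
    except ValueError:
        return True
    return False


ambiguous = truth_value_is_ambiguous(ppt)
print(ambiguous)
```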
It looks like you want to draw two box plots, either
axs[0, 0].boxplot(dfFl['ppt'])
axs[0, 0].boxplot(dfnFl['ppt'])
or
axs[0, 0].boxplot([dfFl['ppt'],dfnFl['ppt']])
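Putting this together, here is a self-contained sketch of the list form, assuming the nine sample rows from the question stand in for the full DataFrame. The Agg backend, the single-axes figure, and the tick label text are assumptions made so the example runs anywhere; they are not part of the original code. The list form is used because, as far as I can tell, two separate boxplot calls on the same axes would both default to position 1 and overlap.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the excerpt from the question.
df1 = pd.DataFrame(
    {
        "Size": [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 3.0],
        "ppt": [0.0, 0.0, 0.0, 34.709998, 0.0, 0.0, 0.0, 0.6, 157.439998],
    },
    index=pd.date_range("2017-09-11", periods=9, name="date"),
)

flood = df1["Size"].ne(0)   # days with a flood
rain = df1["ppt"].gt(0)     # days with rain

dfFl = df1[flood]           # days with floods
dfnFl = df1[~flood & rain]  # days with rain but no flood

fig, ax = plt.subplots()
# One call, one list: each element of the list becomes its own box.
result = ax.boxplot([dfFl["ppt"].to_numpy(), dfnFl["ppt"].to_numpy()])
ax.set_xticks([1, 2])
ax.set_xticklabels(["flood", "rain, no flood"])
ax.set_ylabel("ppt")
```

To view the figure, call plt.show() with an interactive backend or save it with fig.savefig(...).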