I have the following code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Sample data (replace this with your actual DataFrame)
data = {
'CU': [1.5, 2.3, 1.8, 3.2, 2.5, 2.0, 3.8, 3.0],
'ER': [0.2, 0.5, np.nan, 0.7, 0.8, 0.4, 0.9, 0.6],
}
df = pd.DataFrame(data)
print(data)
# Create a new column 'Valid_CU' where CU values are replaced with NaN if ER is NaN
df['Valid_CU'] = df['CU'].where(~df['ER'].isna())
hist_values, bin_edges,something= plt.hist([df['CU'], df['Valid_CU']], bins= 3,label=['Total Data', 'Valid Data'])
plt.xlabel('CU Values')
plt.ylabel('Frequency')
plt.title('Histogram of CU Values with Valid Data Counts')
plt.legend()
print("hist",hist_values,"bind edges",bin_edges,"some",something)
# Show the plot
plt.show()
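With this I get:
This works very well! I tried to do the same thing with Plotly, but without success. Plotting a histogram gave an ugly representation and, worst of all, every value was wrong.
How can I do the above in Plotly?
(Note that the bins are customizable there. I want the same in Plotly.)
I would rather the code below did not influence any answers; I include it only to report what I tried. The most I could manage is: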
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Bar(x=df['CU'], y=[1] * len(df),name='Total Data'))
fig.add_trace(go.Bar(x=df['Valid_CU'], y=[1] * len(df), name='Valid Data'))
fig.update_layout(
title_text='Count of CU Values with Valid Data Counts',
xaxis_title_text='CU Values',
yaxis_title_text='Count',
# barmode='overlay', # overlay bars
barmode='group'
)
# Show the plot
fig.show()
But the number of bins is not customizable here.
go.Histogram?
You can use nbinsx to set the (maximum) number of bins:
from plotly import graph_objects as go
fig = go.Figure()
fig.add_trace(go.Histogram(x=df["CU"], nbinsx=3, name='Total Data'))
fig.add_trace(go.Histogram(x=df["Valid_CU"], nbinsx=3, name='Valid Data'))
fig.update_layout(
title_text='Count of CU Values with Valid Data Counts',
xaxis_title_text='CU Values',
yaxis_title_text='Count',
# barmode='overlay', # overlay bars
barmode='group'
)
# Show the plot
fig.show()
Output:
As you can see, the bins differ from matplotlib's. However, you can define the bins yourself:
xmin = df['CU'].min() * 0.99
xmax = df['CU'].max() * 1.01
nbins = 3
xbins=go.histogram.XBins(size=(xmax-xmin)/nbins, start=xmin, end=xmax)
fig.add_trace(go.Histogram(x=df["CU"], xbins=xbins, name='Total Data'))
fig.add_trace(go.Histogram(x=df["Valid_CU"], xbins=xbins, name='Valid Data'))
Output: