I want to plot data from my CSV file, but the dataset contains Unix timestamps in milliseconds.
timeStamp,elapsed,label
1588241066948,438,HTTP Request
1588241066909,490,HTTP Request
1588241066911,470,HTTP Request
1588241066913,461,HTTP Request
1588241066913,461,HTTP Request
1588241066913,460,HTTP Request
1588241066913,460,HTTP Request
1588241066913,460,HTTP Request
1588241066914,476,HTTP Request
1588241066913,478,HTTP Request
1588241066913,461,HTTP Request
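Currently my chart is plotted at millisecond resolution. I can't use resample directly because it drops the label column. Ultimately, I want the 95th percentile of elapsed for each second or each minute, and to plot those points.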
import pandas as pd
import plotly.express as px

df = pd.read_csv('demo.csv', low_memory=False)
# Convert millisecond Unix timestamps to datetimes
df['timeStamp'] = pd.to_datetime(df['timeStamp'], unit='ms')
fig = px.line(df, x='timeStamp', y='elapsed', color='label', title='Line Graph')
fig.show()
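Set timeStamp as a DatetimeIndex. You can then chain groupby with resample and call quantile on the result, which gives the 95th percentile per label and per second while keeping the label: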
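df['timeStamp'] = pd.to_datetime(df['timeStamp'], unit='ms')
# Group by label first so it is kept, then resample each group per second.
# ('S' is deprecated as the seconds alias in recent pandas; use 's', or 'min' for minutes.)
df1 = (df.set_index('timeStamp')
         .groupby('label')['elapsed']
         .resample('s')
         .quantile(0.95)
         .reset_index())
print(df1)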
          label           timeStamp  elapsed
0  HTTP Request 2020-04-30 10:04:26    484.0
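Since the question also asks about per-minute values and plotting points, here is a minimal sketch of one way that could look. It is not part of the original answer: it swaps the groupby/resample chain for groupby with pd.Grouper, which bins on the timeStamp column without setting an index. The helper names p95_per_bin and plot_p95 are hypothetical, the sample rows are copied from the question, and px.scatter is assumed to be an acceptable way to draw the points.

```python
import io

import pandas as pd

# Sample rows copied from the question; in practice you would read demo.csv.
CSV = """timeStamp,elapsed,label
1588241066948,438,HTTP Request
1588241066909,490,HTTP Request
1588241066911,470,HTTP Request
1588241066913,461,HTTP Request
1588241066913,461,HTTP Request
1588241066913,460,HTTP Request
1588241066913,460,HTTP Request
1588241066913,460,HTTP Request
1588241066914,476,HTTP Request
1588241066913,478,HTTP Request
1588241066913,461,HTTP Request
"""


def p95_per_bin(df, freq="s"):
    """Hypothetical helper: 95th percentile of `elapsed` per label and time bin."""
    out = df.copy()
    # Convert millisecond Unix timestamps to datetimes.
    out["timeStamp"] = pd.to_datetime(out["timeStamp"], unit="ms")
    # Grouping by label and a time bin together keeps the label column.
    return (out.groupby(["label", pd.Grouper(key="timeStamp", freq=freq)])["elapsed"]
               .quantile(0.95)
               .reset_index())


def plot_p95(p95):
    """Hypothetical helper: scatter plot of the aggregated points, one color per label."""
    import plotly.express as px  # imported lazily so the aggregation works without plotly
    return px.scatter(p95, x="timeStamp", y="elapsed", color="label",
                      title="95th percentile of elapsed")


df = pd.read_csv(io.StringIO(CSV))
per_second = p95_per_bin(df, "s")    # one row per label and second
per_minute = p95_per_bin(df, "min")  # one row per label and minute
print(per_minute)
```

To display the chart, call plot_p95(per_minute).show(). Switching between seconds and minutes is then only a matter of the freq argument.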