List of timestamps: count how many fall in each hour

Question (votes: 0, answers: 2)

I have a JSON list of timestamps:

[
  "2024-03-27 00:30:30.321000",
  "2024-03-27 00:34:58.695000",
  "2024-03-27 00:37:38.352000",
  "2024-03-27 00:37:40.419000",
  "2024-03-27 00:43:54.536000",
  "2024-03-27 00:49:39.231000",
  "2024-03-27 01:03:39.637000",
  "2024-03-27 01:05:24.370000",
  "2024-03-27 01:17:43.586000",
  "2024-03-27 01:17:47.447000",
  "2024-03-27 01:17:59.913000",
  "2024-03-27 01:18:34.872000",
  "2024-03-27 01:18:36.922000",
  "2024-03-27 01:18:44.626000",
  "2024-03-27 01:19:11.057000",
  "2024-03-27 01:19:12.307000",
  "2024-03-27 01:21:11.322000",
  "2024-03-27 01:26:54.640000",
  "2024-03-27 01:26:55.055000",
  ...

I want to plot their frequency, for example per hour. I can get this working with pandas, but it requires me to add a dummy column:

[
  {
    "foo": 1,
    "ts": "2024-03-27 00:24:13.132000"
  },
  {
    "foo": 1,
    "ts": "2024-03-27 00:30:30.321000"
  },
  {
    "foo": 1,
    "ts": "2024-03-27 00:34:58.695000"
  },
  {
    "foo": 1,
    "ts": "2024-03-27 00:36:04.166000"
  },
  {
    "foo": 1,
    "ts": "2024-03-27 00:37:38.352000"
  },
  {
    "foo": 1,
    "ts": "2024-03-27 00:37:40.419000"
  },
  {
    "foo": 1,
    "ts": "2024-03-27 00:43:54.536000"
  },
 ....
]

so that I can use sum():

import sys
import pandas as pd

freq = '1d'
df = pd.read_json(sys.stdin)
df['ts'] = pd.to_datetime(df['ts'])
overview = df.resample(freq, on='ts').foo.sum()
print(overview)

This gives me what I am looking for:

2024-03-27      674
2024-03-28      405
2024-03-29      366
2024-03-30      352
2024-03-31      541
2024-04-01      657
2024-04-02      398
2024-04-03      523
2024-04-04      466
2024-04-05      498
2024-04-06      468
2024-04-07      312
2024-04-08      453
2024-04-09      625
2024-04-10      654
2024-04-11      696
2024-04-12      624
2024-04-13      377
2024-04-14      304
2024-04-15      493
2024-04-16      544
2024-04-17      526

Can I do this without the dummy column, using just the plain list of timestamps as input?

python pandas date datetime
2 Answers

0 votes

IIUC, you can use resample(...).size(), which counts the rows in each bin, so you don't need the dummy "foo" column:

df.resample(freq, on='ts').size()
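As a minimal sketch of how this could look with a plain list as input: the snippet below assumes the timestamps arrive as a bare JSON array of strings. The inline `raw` string stands in for stdin, and the names `raw` and `counts` are illustrative, not from the question.

```python
import json
import pandas as pd

# Assumption: the input is a bare JSON array of timestamp strings.
# In the question's script this text would be read from sys.stdin instead.
raw = """[
  "2024-03-27 00:30:30.321000",
  "2024-03-27 00:34:58.695000",
  "2024-03-27 01:03:39.637000"
]"""

# Build a one-column DataFrame; no dummy value column is needed.
df = pd.DataFrame({"ts": pd.to_datetime(json.loads(raw))})

# size() counts the rows that fall into each hourly bin.
counts = df.resample("1h", on="ts").size()
print(counts)
```

Changing "1h" to another frequency string, such as the question's '1d', changes the bin width without touching the rest of the code.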

0 votes

With your sample data you can do this:

import pandas as pd

data = [
  "2024-03-27 00:30:30.321000",
  "2024-03-27 00:34:58.695000",
  "2024-03-27 00:37:38.352000",
  "2024-03-27 00:37:40.419000",
  "2024-03-27 00:43:54.536000",
  "2024-03-27 00:49:39.231000",
  "2024-03-27 01:03:39.637000",
  "2024-03-27 01:05:24.370000",
  "2024-03-27 01:17:43.586000",
  "2024-03-27 01:17:47.447000",
  "2024-03-27 01:17:59.913000",
  "2024-03-27 01:18:34.872000",
  "2024-03-27 01:18:36.922000",
  "2024-03-27 01:18:44.626000",
  "2024-03-27 01:19:11.057000",
  "2024-03-27 01:19:12.307000",
  "2024-03-27 01:21:11.322000",
  "2024-03-27 01:26:54.640000",
  "2024-03-27 01:26:55.055000"]

# Floor each timestamp to the start of its hour, then count per hour.
ts = pd.Series(pd.to_datetime(data))
counts = ts.groupby(ts.dt.floor('1h')).count()
print(counts)

The code produces:

2024-03-27 00:00:00     6
2024-03-27 01:00:00    13
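One difference worth knowing when plotting: grouping on the floored hour only returns hours that actually contain timestamps, while resample builds a regular grid and reports empty hours as 0. The sketch below illustrates this with made-up timestamps (not taken from the question) that leave the 01:00 hour empty; the variable names are illustrative.

```python
import pandas as pd

# Made-up timestamps for illustration: nothing falls in the 01:00 hour.
ts = pd.Series(pd.to_datetime([
    "2024-03-27 00:30:30",
    "2024-03-27 00:34:58",
    "2024-03-27 02:03:39",
]))

# groupby on the floored hour yields only the hours that occur in the data.
grouped = ts.groupby(ts.dt.floor("1h")).count()

# resample spans every hour from first to last, so the empty hour appears as 0.
resampled = ts.to_frame("ts").resample("1h", on="ts").size()

print(grouped)
print(resampled)
```

For a frequency plot the resample variant is probably the safer choice, since gaps then show up as zeros rather than being skipped.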