Pandas: difference between consecutive row timestamps within each level of a column


I have the following DataFrame, df:

                  Date  Type
0  1990-01-01 02:00:00   1
1  1990-01-01 03:00:00   1
2  1990-01-01 04:00:00   1
3  1990-01-01 05:00:00   2
4  1990-01-01 06:00:00   2
5  1990-01-01 07:00:00   2

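How can I get the time difference between consecutive rows, in seconds, in a new column df['dt'], computed separately within each level of the column df['Type']? The following works, but it runs over the whole column rather than per level: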

df['dt'] = (df['Date'] - df['Date'].shift(1)).dt.total_seconds()

How can I apply this per level of the column? Ideally, the first row of each new Type should have a time difference of 0.
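To make the problem concrete, here is a minimal sketch. It rebuilds the frame from the table above (the construction itself is an assumption, only the values come from the question) and shows that a whole-column shift carries the difference across the Type boundary:

```python
import pandas as pd

# Frame rebuilt from the question's table (construction is illustrative).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "1990-01-01 02:00:00", "1990-01-01 03:00:00", "1990-01-01 04:00:00",
        "1990-01-01 05:00:00", "1990-01-01 06:00:00", "1990-01-01 07:00:00",
    ]),
    "Type": [1, 1, 1, 2, 2, 2],
})

# Whole-column shift: the Type column is ignored entirely.
df["dt"] = (df["Date"] - df["Date"].shift(1)).dt.total_seconds()
print(df)
```

Row 3 is the first row of Type 2, yet it gets 3600.0 (the gap to the last Type 1 row) instead of the desired 0, and row 0 is NaN.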

pandas datetime pandas-groupby timedelta
1 answer
1
vote

Use:

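import numpy as np
import pandas as pd

# Sample data: ten hourly timestamps split into two Type groups
df = pd.DataFrame()
df['HH'] = np.arange(0, 10)
start_date = '1990-01-01 00:00:00'
df['Date'] = pd.to_datetime(df['HH'], unit='h', origin=start_date)
df['Type'] = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]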

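# Previous timestamp within each Type group (NaT on each group's first row)
s = df.groupby("Type")['Date'].shift()
# Subtract, set each group's first row to 0, and convert to seconds
df['dt2'] = df['Date'].sub(s).fillna(pd.Timedelta(0)).dt.total_seconds()
print(df)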
   HH                Date  Type     dt2
0   0 1990-01-01 00:00:00     1     0.0
1   1 1990-01-01 01:00:00     1  3600.0
2   2 1990-01-01 02:00:00     1  3600.0
3   3 1990-01-01 03:00:00     1  3600.0
4   4 1990-01-01 04:00:00     1  3600.0
5   5 1990-01-01 05:00:00     2     0.0
6   6 1990-01-01 06:00:00     2  3600.0
7   7 1990-01-01 07:00:00     2  3600.0
8   8 1990-01-01 08:00:00     2  3600.0
9   9 1990-01-01 09:00:00     2  3600.0

Or:

df['dt2'] = df.groupby("Type")['Date'].diff().fillna(pd.Timedelta(0)).dt.total_seconds()
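
Applied to the frame from the question, the diff() variant might look like the sketch below. The frame construction and the sort step are assumptions added for illustration: the question's rows are already ordered, so sorting only matters if the real data is not.

```python
import pandas as pd

# Frame rebuilt from the question's table (construction is illustrative).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "1990-01-01 02:00:00", "1990-01-01 03:00:00", "1990-01-01 04:00:00",
        "1990-01-01 05:00:00", "1990-01-01 06:00:00", "1990-01-01 07:00:00",
    ]),
    "Type": [1, 1, 1, 2, 2, 2],
})

# Assumption: order rows by time within each Type so diff() compares
# chronological neighbours; this is a no-op for the data shown here.
df = df.sort_values(["Type", "Date"])

# Per-group difference; each group's first row is NaT, filled with 0.
df["dt"] = (
    df.groupby("Type")["Date"].diff()
      .fillna(pd.Timedelta(0))
      .dt.total_seconds()
)
print(df)
```

Each Type now starts at 0.0 and every later row in the group shows 3600.0.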