I have the following DataFrame, df:
                 Date  Type
0 1990-01-01 02:00:00     1
1 1990-01-01 03:00:00     1
2 1990-01-01 04:00:00     1
3 1990-01-01 05:00:00     2
4 1990-01-01 06:00:00     2
5 1990-01-01 07:00:00     2
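How can I compute, in a new column df['dt'], the time difference in seconds between consecutive rows within each level of the df['Type'] column? The following works, but it ignores the levels: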
df['dt'] = (df['Date'] - df['Date'].shift(1)).astype('timedelta64[s]')
How can I apply this per level? Ideally, the first row of each new Type should have 0 as its time difference.
Use:
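import numpy as np
import pandas as pd

# Sample data: ten hourly timestamps split into two Types
df = pd.DataFrame()
df['HH'] = np.arange(0, 10)
start_date = '1990-01-01 00:00:00'
df['Date'] = pd.to_datetime(df['HH'], unit='h', origin=start_date)
df['Type'] = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
# Previous timestamp within each Type (NaT on the first row of each group)
s = df.groupby("Type")['Date'].shift()
# Difference in seconds; the NaT rows become 0
df['dt2'] = df['Date'].sub(s).fillna(pd.Timedelta(0)).dt.total_seconds()
print(df)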
   HH                Date  Type     dt2
0   0 1990-01-01 00:00:00     1     0.0
1   1 1990-01-01 01:00:00     1  3600.0
2   2 1990-01-01 02:00:00     1  3600.0
3   3 1990-01-01 03:00:00     1  3600.0
4   4 1990-01-01 04:00:00     1  3600.0
5   5 1990-01-01 05:00:00     2     0.0
6   6 1990-01-01 06:00:00     2  3600.0
7   7 1990-01-01 07:00:00     2  3600.0
8   8 1990-01-01 08:00:00     2  3600.0
9   9 1990-01-01 09:00:00     2  3600.0
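To see where the zeros come from, here is a minimal sketch on the six-row frame from the question (rebuilt by hand here, so the construction is an assumption rather than the asker's code). It inspects the intermediate shifted Series: the first row of each Type has no predecessor inside its group, so it holds NaT, and that is the value fillna turns into a zero Timedelta.

```python
import pandas as pd

# The six-row frame from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "1990-01-01 02:00:00", "1990-01-01 03:00:00", "1990-01-01 04:00:00",
        "1990-01-01 05:00:00", "1990-01-01 06:00:00", "1990-01-01 07:00:00",
    ]),
    "Type": [1, 1, 1, 2, 2, 2],
})

# Shifting within each Type leaves NaT on the first row of every group.
s = df.groupby("Type")["Date"].shift()
print(s.isna().tolist())
# [True, False, False, True, False, False]

# Date minus NaT is NaT; fillna replaces it with a zero Timedelta.
df["dt"] = df["Date"].sub(s).fillna(pd.Timedelta(0)).dt.total_seconds()
print(df["dt"].tolist())
# [0.0, 3600.0, 3600.0, 0.0, 3600.0, 3600.0]
```

Row 3 is the first row of Type 2, so it gets 0.0 here, whereas the ungrouped shift from the question would compare it against the last row of Type 1.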
Or:
df['dt2'] = df.groupby("Type")['Date'].diff().fillna(pd.Timedelta(0)).dt.total_seconds()
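Both variants take differences in row order, so they only give meaningful gaps if the rows of each Type are already chronological. The sketch below assumes that may not be the case: it uses made-up, unsorted sample data, sorts by Type and Date first, and then checks that the shift and diff variants agree. The names via_shift and via_diff are purely illustrative.

```python
import pandas as pd

# Hypothetical unsorted sample data, for illustration only.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "1990-01-01 04:00:00", "1990-01-01 02:00:00", "1990-01-01 03:00:00",
        "1990-01-01 07:00:00", "1990-01-01 05:00:00",
    ]),
    "Type": [1, 1, 1, 2, 2],
})

# Sort so that consecutive rows within a Type are in chronological order.
df = df.sort_values(["Type", "Date"]).reset_index(drop=True)

grouped = df.groupby("Type")["Date"]
zero = pd.Timedelta(0)

# Variant 1: subtract the previous timestamp within each group.
via_shift = df["Date"].sub(grouped.shift()).fillna(zero).dt.total_seconds()
# Variant 2: let diff() do the subtraction per group.
via_diff = grouped.diff().fillna(zero).dt.total_seconds()

print(via_shift.equals(via_diff))
# True
print(via_diff.tolist())
# [0.0, 3600.0, 3600.0, 0.0, 7200.0]
```

If the original row order has to be kept, one option is to skip reset_index and assign the result back, since the computed Series keeps the original index labels.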