Here is a sample DataFrame:
datetime temp T1 T2 T3 T4 T5
115 2020-01-04 02:53:00+00:00 58 0 0 0 0 0
116 2020-01-04 03:53:00+00:00 51 0 0 0 0 0
117 2020-01-04 04:53:00+00:00 49 0 0 0 0 0
118 2020-01-04 05:53:00+00:00 48 0 0 0 0 0
119 2020-01-04 06:00:00+00:00 48 0 0 0 0 0
120 2020-01-04 06:53:00+00:00 47 0 0 0 0 0
This is the output I want:
datetime temp T1 T2 T3 T4 T5
115 2020-01-04 02:53:00+00:00 58 0 0 0 0 0
116 2020-01-04 03:53:00+00:00 51 58 0 0 0 0
117 2020-01-04 04:53:00+00:00 49 51 58 0 0 0
118 2020-01-04 05:53:00+00:00 48 49 51 58 0 0
119 2020-01-04 06:00:00+00:00 48 48 49 51 58 0
120 2020-01-04 06:53:00+00:00 47 48 48 49 51 58
Use Series.shift:
for col in df.columns[df.columns.str.contains('T')]:
    # the last character of the column name is the lag: T1 -> 1, T2 -> 2, ...
    df[col] = df['temp'].shift(int(col[-1]), fill_value=0)
print(df)
We can also select the columns with pd.Index.difference:
for col in df.columns.difference(['datetime', 'temp']):
    df[col] = df['temp'].shift(int(col[-1]), fill_value=0)
Output (with a recent pandas version, where shift accepts fill_value):
datetime temp T1 T2 T3 T4 T5
115 2020-01-04 02:53:00+00:00 58 0 0 0 0 0
116 2020-01-04 03:53:00+00:00 51 58 0 0 0 0
117 2020-01-04 04:53:00+00:00 49 51 58 0 0 0
118 2020-01-04 05:53:00+00:00 48 49 51 58 0 0
119 2020-01-04 06:00:00+00:00 48 48 49 51 58 0
120 2020-01-04 06:53:00+00:00 47 48 48 49 51 58
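Another option is to loop over the lag number directly:
for i in range(1, len(df)):
    # shift(i) leaves NaN in the first i rows; fill them with 0 and cast back to int
    df[f'T{i}'] = df['temp'].shift(i).fillna(0).astype(int)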
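For anyone who wants to try the shift approach end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and fills the lag columns. The helper name `add_lags` and its `source` and `prefix` parameters are my own, made up for illustration and not part of pandas. It also assumes the lag columns are named `T` followed by digits, and it parses the whole numeric suffix so a column such as `T10` would work too.

```python
import pandas as pd

# Rebuild the sample frame from the question (index 115..120).
df = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            [
                "2020-01-04 02:53:00+00:00",
                "2020-01-04 03:53:00+00:00",
                "2020-01-04 04:53:00+00:00",
                "2020-01-04 05:53:00+00:00",
                "2020-01-04 06:00:00+00:00",
                "2020-01-04 06:53:00+00:00",
            ],
            utc=True,
        ),
        "temp": [58, 51, 49, 48, 48, 47],
    },
    index=range(115, 121),
)
for name in ["T1", "T2", "T3", "T4", "T5"]:
    df[name] = 0


def add_lags(frame, source="temp", prefix="T"):
    """Hypothetical helper: fill each '<prefix><n>' column with `source` shifted by n rows."""
    out = frame.copy()
    # Keep only columns that are the prefix followed by digits (T1, T2, ...).
    lag_cols = [
        c for c in out.columns
        if c.startswith(prefix) and c[len(prefix):].isdigit()
    ]
    for col in lag_cols:
        lag = int(col[len(prefix):])
        # fill_value=0 puts 0 in the rows that have no earlier value.
        out[col] = out[source].shift(lag, fill_value=0)
    return out


result = add_lags(df)
print(result)
```

The helper works on a copy, so the original `df` stays unchanged; drop the `copy()` call if you would rather modify the frame in place.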