I'm running the code below and, I think because one of the dates is earlier than 1677, I run into an
OutOfBoundsDatetime
error.
My code is
import pandas as pd
df = pd.DataFrame({'datetime_str': ['2011-01-17 23:20:00', '0031-01-17 23:20:00']})
df['datetime_str'] = (pd.to_datetime(df['datetime_str']).astype(int) / 10 ** 6).astype(int)
Now I want to fall back to a minimum date whenever this error occurs. This is the code I'm using to do that:
import pandas as pd
import numpy as np
df = pd.DataFrame({'datetime_str': ['2011-01-17 23:20:00', '0031-01-17 23:20:00']})
# convert each datetime string to epoch time (seconds)
epoch_time = []
for dt_str in df['datetime_str']:
    try:
        epoch_time.append(int(pd.to_datetime(dt_str).timestamp()))
    except pd.errors.OutOfBoundsDatetime:
        # fall back to a fixed timestamp when the date cannot be represented
        epoch_time.append(int(pd.Timestamp('1970-09-21 00:12:43.145225').timestamp()))
df['epoch_time'] = epoch_time
print(df['epoch_time'])
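This does what I want, but I don't think it's the best way to do it in pandas, because it loops over every row. I'd also like to store the epoch in milliseconds. Is there a better way?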
I think this is what you're looking for:
import pandas as pd
df = pd.DataFrame({'datetime_str': ['2011-01-17 23:20:00', '0031-01-17 23:20:00']})
# errors="coerce" turns unparseable or out-of-bounds dates into NaT instead of raising
df['epoch_time'] = pd.to_datetime(df['datetime_str'], errors="coerce")
# replace the NaT values with the fallback timestamp (assign back rather than using inplace=True)
df['epoch_time'] = df['epoch_time'].fillna(pd.Timestamp('1970-09-21 00:12:43.145225'))
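The snippet above leaves the column as datetimes, while the question asks for the epoch in milliseconds. Here is a minimal sketch of one way to finish the conversion without a loop. It assumes a pandas version that stores parsed datetimes at nanosecond resolution, so that the year-0031 row is coerced to NaT. The names `fallback` and `epoch_ms` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'datetime_str': ['2011-01-17 23:20:00', '0031-01-17 23:20:00']})

# Illustrative fallback for dates that cannot be represented (value taken from the question)
fallback = pd.Timestamp('1970-09-21 00:12:43.145225')

# Out-of-bounds dates become NaT, then get replaced by the fallback
parsed = pd.to_datetime(df['datetime_str'], errors='coerce').fillna(fallback)

# Subtract the Unix epoch and floor-divide by one millisecond to get integer milliseconds
df['epoch_ms'] = (parsed - pd.Timestamp('1970-01-01')) // pd.Timedelta(milliseconds=1)

print(df['epoch_ms'])  # first row: 1295306400000
```

Floor-dividing a timedelta column by `pd.Timedelta(milliseconds=1)` gives integers directly, so there is no float round-trip as there is with `astype(int) / 10 ** 6`.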