I have two datetime arrays:
index
0 2009-07-03
1 2009-07-03
2 2009-07-03
...
216426 2003-02-07
216427 2004-04-09
216428 NaT
index
0 NaT
1 NaT
2 2015-04-12
...
216426 2013-09-17
216427 2014-02-19
216428 NaT
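How can I compute the average decimal year of the two arrays? If one of the two values at an index is NaT, only the non-NaT value at that index should be used. If both values at an index are NaT, the result should be NaT.
One possible solution is as follows: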
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'index': ['2009-07-03', '2009-07-03', '2009-07-03']})
df2 = pd.DataFrame({'index': ['NaT', '2010-07-03', '2020-07-03']})
df1['index'] = pd.to_datetime(df1['index'])
df2['index'] = pd.to_datetime(df2['index'])
# Convert each date to a decimal year
df1['year'] = df1['index'].dt.year + df1['index'].dt.dayofyear / 365.25
df2['year'] = df2['index'].dt.year + df2['index'].dt.dayofyear / 365.25
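# Put both decimal-year columns side by side, one row per index
df_years = pd.concat([df1['year'], df2['year']], axis=1)
# Row-wise mean; skipna=True uses only the non-missing value when one date is NaT
df_years['average'] = df_years.mean(axis=1, skipna=True)
# Rows where both dates are NaT average to NaN; mark them as NaT
df_years['average'] = df_years['average'].replace({np.nan: pd.NaT})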
print(df_years)
As you can see, the NaT values from the original data frames show up as NaN in the year columns.
The output of the code above is:
          year         year      average
0  2009.503765          NaN  2009.503765
1  2009.503765  2010.503765  2010.003765
2  2009.503765  2020.506502  2015.005133
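The code above divides the day of the year by 365.25, which is an approximation. As a variation, here is a minimal sketch that divides by the actual length of each year (365 or 366 days) and counts days from zero, so that January 1 maps exactly to the start of the year. The helper names `to_decimal_year` and `mean_decimal_year` are hypothetical and only used for illustration. The sample dates are taken from the question, with the last row missing in both arrays. Because the result is a float column, a row where both dates are missing comes out as NaN, the floating-point counterpart of NaT.

```python
import numpy as np
import pandas as pd


def to_decimal_year(dates: pd.Series) -> pd.Series:
    """Convert a datetime Series to decimal years; NaT becomes NaN."""
    # Use the real number of days in each year instead of 365.25
    days_in_year = np.where(dates.dt.is_leap_year, 366, 365)
    # dayofyear starts at 1, so subtract 1 to make January 1 equal to year.0
    return dates.dt.year + (dates.dt.dayofyear - 1) / days_in_year


def mean_decimal_year(a: pd.Series, b: pd.Series) -> pd.Series:
    """Row-wise mean of two datetime Series, expressed in decimal years."""
    years = pd.concat([to_decimal_year(a), to_decimal_year(b)], axis=1)
    # skipna=True keeps the single valid value when the other one is missing;
    # rows where both values are missing stay NaN
    return years.mean(axis=1, skipna=True)


first = pd.to_datetime(pd.Series(["2009-07-03", "2003-02-07", None]))
second = pd.to_datetime(pd.Series([None, "2013-09-17", None]))

average = mean_decimal_year(first, second)
print(average)
```

If a datetime result is needed rather than a float, the decimal year would have to be converted back to a timestamp in a separate step, and only then could the missing rows be represented as NaT.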