Python: Find the average decimal year of two dates while ignoring NaT values

Question (votes: 0, answers: 1)

I have two datetime arrays:

index
0 2009-07-03
1 2009-07-03
2 2009-07-03
    ...
216426 2003-02-07
216427 2004-04-09
216428 NaT

index
0 NaT
1 NaT
2 2015-04-12
    ...
216426 2013-09-17
216427 2014-02-19
216428 NaT

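How can I compute the average decimal year of the two arrays? If exactly one of the two arrays is NaT at a given index, take only the non-NaT entry at that index. If both arrays are NaT at an index, return NaT.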

python datetime
1 Answer
0 votes

One possible solution is the following:

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'index': ['2009-07-03', '2009-07-03', '2009-07-03']})
df2 = pd.DataFrame({'index': ['NaT', '2010-07-03', '2020-07-03']})

df1['index'] = pd.to_datetime(df1['index'])
df2['index'] = pd.to_datetime(df2['index'])

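# Convert each date to an approximate decimal year (day of year / 365.25).
# NaT dates become NaN in the resulting float column.
df1['year'] = df1['index'].dt.year + df1['index'].dt.dayofyear / 365.25
df2['year'] = df2['index'].dt.year + df2['index'].dt.dayofyear / 365.25

# Put the two decimal-year columns side by side
df_years = pd.concat([df1['year'], df2['year']], axis=1)

# Row-wise mean; skipna=True uses only the value that is present
df_years['average'] = df_years.mean(axis=1, skipna=True)

# If both values are missing the mean is NaN; swap it for NaT as the question asks
df_years['average'] = df_years['average'].replace({np.nan: pd.NaT})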

print(df_years)

Note that the NaT values from the original data frames show up as NaN in the decimal-year columns.

The output of the code above is:

          year         year      average
0  2009.503765          NaN  2009.503765
1  2009.503765  2010.503765  2010.003765
2  2009.503765  2020.506502  2015.005133
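
The answer divides the day of the year by 365.25, which is an approximation: 1 January does not map to exactly `.0`, and leap years are not treated separately. If that matters, a minimal sketch of a leap-year-aware variant might look like the following. The helper name `decimal_year` and the sample series `a` and `b` are hypothetical and chosen only to cover the three cases from the question (one side missing, both present, both missing).

```python
import numpy as np
import pandas as pd


def decimal_year(dates: pd.Series) -> pd.Series:
    """Convert datetimes to decimal years; NaT becomes NaN."""
    # Days in each date's own year: 366 for leap years, otherwise 365.
    days_in_year = np.where(dates.dt.is_leap_year, 366, 365)
    # Subtract 1 so that 1 January maps to exactly YYYY.0.
    return dates.dt.year + (dates.dt.dayofyear - 1) / days_in_year


# Hypothetical sample data: one side missing, both present, both missing.
a = pd.Series(pd.to_datetime(['2009-07-03', '2009-01-01', pd.NaT]))
b = pd.Series(pd.to_datetime([pd.NaT, '2011-01-01', pd.NaT]))

years = pd.concat([decimal_year(a), decimal_year(b)], axis=1, keys=['a', 'b'])

# skipna=True uses whichever value is present; a row with no values yields NaN.
average = years.mean(axis=1, skipna=True)
print(average)
```

Because decimal years are floats, this sketch leaves the "both missing" case as NaN rather than NaT; the `replace` step from the answer can be applied afterwards if NaT is required.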