pandas: how to create a boolean column based on other boolean columns in a df

Question (votes: 1, answers: 1)

I have the following df

inv_date        inv_id
2017-10-01      100117
2018-04-02      040218
2018-05-06      060518

where inv_date has datetime dtype and inv_id is str; I want to convert inv_id to datetime based on the following formats

from datetime import datetime
from itertools import count

s = df['inv_id']  # the string column to parse
formats = {'%m%d%y': 6, '%d%m%y': 6}
L = [pd.to_datetime(s.str[:v], format=k, errors='coerce') for k, v in formats.items()]
df1 = pd.concat(L, axis=1, keys=[s.name + '_' + str(i) for i, s in zip(count(1), L)])
df1 = df1.apply(lambda x: x.where(x.between('2000-01-01', datetime.now())))

I want to create a boolean column dummy_inv_id that is set to True if any of the non-NaT converted datetimes is within +/- 180 days of inv_date

df1 = df1.assign(inv_date=df['inv_date'])
df1['inv_id_1'].between(df1['inv_date'] - pd.Timedelta(180, unit='d'), df1['inv_date'] + pd.Timedelta(180, unit='d'))
df1['inv_id_2'].between(df1['inv_date'] - pd.Timedelta(180, unit='d'), df1['inv_date'] + pd.Timedelta(180, unit='d'))

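I would like to know how to consider all the datetime columns in df1 (inv_id_1, inv_id_2) together, so that if any of them falls within inv_date +/- 180 days, True is assigned to the corresponding row in df;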

so the resulting df looks like,

inv_date        inv_id    dummy_inv_id
2017-10-01      100117    true
2018-04-02      040218    true
2018-05-06      060518    true
python python-3.x pandas dataframe
1 answer

1 vote

You can use np.logical_or.reduce

a = df1['inv_id_1'].between(df1['inv_date'] - pd.Timedelta(180, unit='d'), df1['inv_date'] + pd.Timedelta(180, unit='d'))
b = df1['inv_id_2'].between(df1['inv_date'] - pd.Timedelta(180, unit='d'), df1['inv_date'] + pd.Timedelta(180, unit='d'))

c = [a,b]
df['dummy_inv_id'] = np.logical_or.reduce(c)
print (df)
    inv_date  inv_id  dummy_inv_id
0 2017-10-01  100117          True
1 2018-04-02   40218          True
2 2018-05-06   60518          True
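
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea that works for any number of candidate formats. The sample frame is rebuilt from the question; the names `parsed`, `window`, `lower`, `upper` and `masks` are just illustrative and do not come from the original post, and the 2000-01-01 to now filter from the question is left out for brevity.

```python
import numpy as np
import pandas as pd

# Sample data from the question; inv_id is kept as str to preserve leading zeros.
df = pd.DataFrame({
    "inv_date": pd.to_datetime(["2017-10-01", "2018-04-02", "2018-05-06"]),
    "inv_id": ["100117", "040218", "060518"],
})

# Candidate formats, as in the question (each uses the first 6 characters).
formats = ["%m%d%y", "%d%m%y"]

# One parsed column per format; values that do not parse become NaT.
parsed = {
    f"inv_id_{i}": pd.to_datetime(df["inv_id"].str[:6], format=fmt, errors="coerce")
    for i, fmt in enumerate(formats, start=1)
}

# Bounds of the +/- 180 day window around inv_date.
window = pd.Timedelta(180, unit="d")
lower, upper = df["inv_date"] - window, df["inv_date"] + window

# between() evaluates to False for NaT, so unparsable values never count as a match.
masks = [col.between(lower, upper) for col in parsed.values()]

# Element-wise OR across all masks: True if any parsed date is inside the window.
df["dummy_inv_id"] = np.logical_or.reduce(masks)
print(df)
```

If you would rather stay inside pandas, `pd.concat(masks, axis=1).any(axis=1)` should give the same result; which one reads better is mostly a matter of taste.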