I have a pandas DataFrame like this:
ColA    ColB                 ColC
Apple   2019-03-02 18:00:00  Saturday
Orange  2019-03-03 10:00:00  Sunday
Mango   2019-03-04 09:00:00  Monday
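I am trying to delete rows from the DataFrame based on certain conditions.
The expected output should not contain the Mango row.
This seems to be harder than I thought.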
s1 = df.ColB.dt.hour.between(9, 17, inclusive='neither')  # pandas >= 1.3; older versions used inclusive=False
df.loc[s1 | df.ColC.isin(['Saturday', 'Sunday'])]
     ColA                ColB      ColC
0   Apple 2019-03-02 18:00:00  Saturday
1  Orange 2019-03-03 10:00:00    Sunday
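For anyone who wants to run the approach above directly, here is a self-contained sketch. It assumes pandas 1.3 or later, where `Series.between` takes a string such as `"neither"` for `inclusive`. The names `in_hours`, `is_weekend`, and `result` are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "ColA": ["Apple", "Orange", "Mango"],
    "ColB": pd.to_datetime([
        "2019-03-02 18:00:00",
        "2019-03-03 10:00:00",
        "2019-03-04 09:00:00",
    ]),
    "ColC": ["Saturday", "Sunday", "Monday"],
})

# True where the hour is strictly between 9 and 17, i.e. hours 10 through 16
in_hours = df["ColB"].dt.hour.between(9, 17, inclusive="neither")
# True for weekend rows, based on the day name stored in ColC
is_weekend = df["ColC"].isin(["Saturday", "Sunday"])

# Keep rows that satisfy either condition; everything else is dropped
result = df.loc[in_hours | is_weekend]
print(result)
```

Note that the comparison is on the hour only, so minutes and seconds play no role in deciding whether a row is kept.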
Or use:
s1 = pd.Index(df.ColB).indexer_between_time('09:00:00', '17:00:00', include_start=False, include_end=False)
s1 = df.index.isin(s1)  # s1 holds integer positions, so this assumes the default 0..n-1 index
df.loc[s1 | df.ColC.isin(['Saturday', 'Sunday'])]
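One thing to watch with this variant: `indexer_between_time` returns integer positions, not index labels, so `df.index.isin(...)` only lines up when the DataFrame has the default 0..n-1 index. The sketch below builds the mask by position instead. It assumes NumPy is available and that the `include_start` / `include_end` keywords used above exist in your pandas version. The string index labels and the names `pos`, `in_hours`, and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "ColA": ["Apple", "Orange", "Mango"],
        "ColB": pd.to_datetime([
            "2019-03-02 18:00:00",
            "2019-03-03 10:00:00",
            "2019-03-04 09:00:00",
        ]),
        "ColC": ["Saturday", "Sunday", "Monday"],
    },
    index=["a", "b", "c"],  # non-default labels, chosen only for illustration
)

# Integer positions of rows whose time of day is strictly between 09:00 and 17:00
pos = pd.DatetimeIndex(df["ColB"]).indexer_between_time(
    "09:00:00", "17:00:00", include_start=False, include_end=False
)

# Turn the positions into a boolean mask that does not depend on index labels
in_hours = np.zeros(len(df), dtype=bool)
in_hours[pos] = True

is_weekend = df["ColC"].isin(["Saturday", "Sunday"]).to_numpy()

# Keep rows that are inside the time window or fall on a weekend
result = df.loc[in_hours | is_weekend]
print(result)
```

Unlike the hour-based check, this compares the full time of day, so 09:00:00 exactly is excluded while 09:00:01 would be included.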
As another option, you could write it like this:
cond1 = df.ColB.dt.hour >= 9    # 09:00 or later
cond2 = df.ColB.dt.hour <= 15   # before 16:00
cond3 = df.ColB.dt.weekday < 5  # Mon-Fri
df = df[~(cond1 & cond2 & cond3)]  # conditions mark the rows to drop, hence ~
Full example:
import pandas as pd

df = pd.DataFrame({
    'ColA': ['Apple', 'Orange', 'Mango'],
    'ColB': pd.to_datetime([
        '2019-03-02 18:00:00',
        '2019-03-03 10:00:00',
        '2019-03-04 09:00:00'
    ]),
    'ColC': ['Saturday', 'Sunday', 'Monday']
})

cond1 = df.ColB.dt.hour >= 9    # 09:00 or later
cond2 = df.ColB.dt.hour <= 15   # before 16:00
cond3 = df.ColB.dt.weekday < 5  # Mon-Fri
df = df[~(cond1 & cond2 & cond3)]  # conditions mark the rows to drop, hence ~
print(df)
Output:
     ColA                ColB      ColC
0   Apple 2019-03-02 18:00:00  Saturday
1  Orange 2019-03-03 10:00:00    Sunday
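If the same filter is needed in more than one place, the conditions can be wrapped in a small function. This is only a sketch: `drop_weekday_hours` and its parameters are hypothetical names, the weekday is derived from `ColB` itself rather than from the `ColC` strings, and the hour window is treated as half-open (`start_hour` included, `end_hour` excluded), which matches the 9 to 15 hour check above.

```python
import pandas as pd


def drop_weekday_hours(df, ts_col="ColB", start_hour=9, end_hour=16):
    """Return df without rows that fall on Mon-Fri with start_hour <= hour < end_hour.

    Hypothetical helper for illustration; the column name and defaults are assumptions.
    """
    ts = df[ts_col]
    in_hours = (ts.dt.hour >= start_hour) & (ts.dt.hour < end_hour)
    is_weekday = ts.dt.weekday < 5  # Monday=0 ... Friday=4
    # Rows matching both conditions are the ones to drop, hence ~
    return df[~(in_hours & is_weekday)]


df = pd.DataFrame({
    "ColA": ["Apple", "Orange", "Mango"],
    "ColB": pd.to_datetime([
        "2019-03-02 18:00:00",
        "2019-03-03 10:00:00",
        "2019-03-04 09:00:00",
    ]),
    "ColC": ["Saturday", "Sunday", "Monday"],
})

result = drop_weekday_hours(df)
print(result)
```

Deriving the weekday from the timestamp avoids depending on `ColC` being spelled consistently, at the cost of ignoring `ColC` if it ever disagrees with `ColB`.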