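I'm new to Python and pandas, so please forgive what may seem like a simple question. I'm working with a local file that contains employee data. I want to use TimeFrame to filter specific columns by a given date range. Instead of manually updating the date range in all four .between() calls, how can I create a function or variable so that I can write something like:
date = ("6/1/2023, 6/30/2023") and then pass it in as .between(date).
I feel like this should be simple, but I'm doing something wrong, because it keeps giving me errors.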
TimeFrame = (
    WFRL["EOD (HIRE_DT)"].between("6/1/2023", "6/30/2023")
    | WFRL["Grade Entry Date"].between("6/1/2023", "6/30/2023")
    | WFRL["Step Date"].between("6/1/2023", "6/30/2023")
    | WFRL["WGI Due Dt"].between("6/1/2024", "6/30/2024")
)
I must be making some kind of basic mistake, because none of my attempts have worked.
How about this?
import pandas as pd

# Mock data.
df = pd.DataFrame(
    data={
        "EOD (HIRE_DT)": pd.date_range(start="2020-01-01", end="today"),
        "Grade Entry Date": pd.date_range(start="2020-01-01", end="today"),
        "Step Date": pd.date_range(start="2020-01-01", end="today"),
        "WGI Due Dt": pd.date_range(start="2020-01-01", end="today"),
    }
)

# Specify start and end date in single variable.
date = ("2023-06-01", "2023-06-30")

# Iterate over each datetime series and check if values are between those in `date`.
keepers = (
    df.apply(lambda ser: ser.between(*date))
    # `any` will check if any of the values are True (same as using `|`).
    .any(axis="columns")
)

# Apply filter
out = df.loc[keepers]
print(out.head())
      EOD (HIRE_DT)  Grade Entry Date   Step Date  WGI Due Dt
1247     2023-06-01        2023-06-01  2023-06-01  2023-06-01
1248     2023-06-02        2023-06-02  2023-06-02  2023-06-02
1249     2023-06-03        2023-06-03  2023-06-03  2023-06-03
1250     2023-06-04        2023-06-04  2023-06-04  2023-06-04
1251     2023-06-05        2023-06-05  2023-06-05  2023-06-05
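Since the question asks for a function, the same idea could be wrapped up roughly as below. This is only a sketch under a few assumptions: the helper name in_date_range, the demo frame, and its "Name" column are made up for illustration, and the real file may contain non-date columns, which is why the date columns are passed in explicitly rather than applying .between() to the whole frame. Note that the two dates are kept as separate tuple elements and unpacked with *; a single string such as "6/1/2023, 6/30/2023" cannot be unpacked into the two arguments that .between() expects.

```python
import pandas as pd


def in_date_range(df, columns, start, end):
    """Return a boolean mask: True where any of `columns` lies in [start, end].

    Hypothetical helper, not part of pandas.
    """
    start, end = pd.to_datetime(start), pd.to_datetime(end)
    # Convert each column to datetime in case the file was read as strings.
    # Values that cannot be parsed become NaT, which is never "between".
    dates = df[columns].apply(pd.to_datetime, errors="coerce")
    # One boolean column per date column, then combine row-wise
    # (same effect as chaining the conditions with `|`).
    return dates.apply(lambda ser: ser.between(start, end)).any(axis="columns")


# Made-up demo data; "Name" stands in for any non-date column in the file.
demo = pd.DataFrame(
    {
        "Name": ["A", "B", "C"],
        "EOD (HIRE_DT)": ["2023-06-15", "2021-01-04", "2019-03-01"],
        "Step Date": ["2022-02-01", "2023-06-30", "2020-07-07"],
    }
)

# Start and end date stored once, as two separate strings.
date = ("6/1/2023", "6/30/2023")
mask = in_date_range(demo, ["EOD (HIRE_DT)", "Step Date"], *date)
result = demo.loc[mask]
print(result)
```

With the original data this would presumably be called as in_date_range(WFRL, [...the four date columns...], *date), so changing the range means editing only the date tuple. If one column needs a different range, as "WGI Due Dt" does in the question, that column's mask can be computed with a separate call and combined with |.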