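I am trying to adjust a date by adding a specified number of business days, while skipping weekends. However, the days that count as the weekend can vary from record to record. So if my dataset looks like this: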
┌────────────┬────────┬──────────┬──────────┐
│ DT         ┆ N_DAYS ┆ WKND1    ┆ WKND2    │
│ ---        ┆ ---    ┆ ---      ┆ ---      │
│ date       ┆ i64    ┆ str      ┆ str      │
╞════════════╪════════╪══════════╪══════════╡
│ 2025-01-02 ┆ 2      ┆ Saturday ┆ Sunday   │
│ 2025-01-09 ┆ 2      ┆ Friday   ┆ Saturday │
│ 2025-01-10 ┆ 2      ┆ Saturday ┆ null     │
│ 2025-01-15 ┆ 1      ┆ Saturday ┆ Sunday   │
└────────────┴────────┴──────────┴──────────┘
I can apply:
df = df.with_columns(pl.col('DT').dt.add_business_days(pl.col('N_DAYS')).alias('NEW_DT'))
┌────────────┬────────┬──────────┬──────────┬────────────┐
│ DT         ┆ N_DAYS ┆ WKND1    ┆ WKND2    ┆ NEW_DT     │
│ ---        ┆ ---    ┆ ---      ┆ ---      ┆ ---        │
│ date       ┆ i64    ┆ str      ┆ str      ┆ date       │
╞════════════╪════════╪══════════╪══════════╪════════════╡
│ 2025-01-02 ┆ 2      ┆ Saturday ┆ Sunday   ┆ 2025-01-06 │
│ 2025-01-09 ┆ 2      ┆ Friday   ┆ Saturday ┆ 2025-01-13 │
│ 2025-01-10 ┆ 2      ┆ Saturday ┆ null     ┆ 2025-01-14 │
│ 2025-01-15 ┆ 1      ┆ Saturday ┆ Sunday   ┆ 2025-01-16 │
└────────────┴────────┴──────────┴──────────┴────────────┘
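However, I have been trying to generate a week_mask tuple for each record based on columns WKND1 and WKND2, and apply it as part of my transformation. For the first record, the tuple should be: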
(True, True, True, True, True, False, False)
For the second record it should be:
(True, True, True, True, False, False, True)
and so on.
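As a minimal sketch of the mapping described above, the mask for one record could be derived from its weekend names in plain Python. The helper name build_week_mask and the WEEKDAYS constant are illustrative and not part of Polars. The tuple is Monday-first, matching the two tuples shown above, and a null WKND2 is assumed to arrive as None.

```python
# Weekday names in Monday-first order, the same order as the tuples above.
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")


def build_week_mask(*weekend_days):
    """Return a 7-tuple of booleans: True for business days, False for weekend days.

    Illustrative helper (not a Polars function). None entries, such as a null
    WKND2, are effectively ignored because they never match a weekday name.
    """
    weekend = set(weekend_days)
    return tuple(day not in weekend for day in WEEKDAYS)


print(build_week_mask("Saturday", "Sunday"))
# (True, True, True, True, True, False, False)
print(build_week_mask("Friday", "Saturday"))
# (True, True, True, True, False, False, True)
print(build_week_mask("Saturday", None))
# (True, True, True, True, True, False, True)
```

The third call shows the single-day weekend case, where only Saturday is masked out.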
For the sample data above, the expected result is:
┌────────────┬────────┬──────────┬──────────┬────────────┐
│ DT         ┆ N_DAYS ┆ WKND1    ┆ WKND2    ┆ NEW_DT     │
│ ---        ┆ ---    ┆ ---      ┆ ---      ┆ ---        │
│ date       ┆ i64    ┆ str      ┆ str      ┆ date       │
╞════════════╪════════╪══════════╪══════════╪════════════╡
│ 2025-01-02 ┆ 2      ┆ Saturday ┆ Sunday   ┆ 2025-01-06 │
│ 2025-01-09 ┆ 2      ┆ Friday   ┆ Saturday ┆ 2025-01-14 │
│ 2025-01-10 ┆ 2      ┆ Saturday ┆ null     ┆ 2025-01-13 │
│ 2025-01-15 ┆ 1      ┆ Saturday ┆ Sunday   ┆ 2025-01-16 │
└────────────┴────────┴──────────┴──────────┴────────────┘
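How can I generate the tuple from the column values and apply it dynamically?
I tried creating a new column containing a list and using something like this: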
df = df.with_columns(pl.col('DT').dt.add_business_days(pl.col('N_DAYS'), week_mask=pl.col('W_MASK')).alias('NEW_DT'))
but got:
TypeError: argument 'week_mask': 'Expr' object cannot be converted to 'Sequence'
week_mask should be an Iterable, so it looks like you can't pass an expression there.
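You can iterate over the different weekend combinations, though:
- pl.DataFrame.partition_by() to split the DataFrame into a dictionary of DataFrames, one per (WKND1, WKND2) pair.
- Build a week_mask from each dictionary key.
- pl.concat() to concatenate the resulting DataFrames back together.
# Weekday names in the Monday-first order that week_mask expects
weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
pl.concat([
    v.with_columns(
        pl.col('DT').dt.add_business_days(
            pl.col('N_DAYS'),
            # k is the (WKND1, WKND2) key; the days it names are non-business days
            week_mask=[x not in k for x in weekdays]
        ).alias('NEW_DT')
    ) for k, v in df.partition_by("WKND1", "WKND2", as_dict=True).items()
])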
shape: (4, 5)
┌────────────┬────────┬──────────┬──────────┬────────────┐
│ DT         ┆ N_DAYS ┆ WKND1    ┆ WKND2    ┆ NEW_DT     │
│ ---        ┆ ---    ┆ ---      ┆ ---      ┆ ---        │
│ date       ┆ i64    ┆ str      ┆ str      ┆ date       │
╞════════════╪════════╪══════════╪══════════╪════════════╡
│ 2025-01-02 ┆ 2      ┆ Saturday ┆ Sunday   ┆ 2025-01-06 │
│ 2025-01-15 ┆ 1      ┆ Saturday ┆ Sunday   ┆ 2025-01-16 │
│ 2025-01-09 ┆ 2      ┆ Friday   ┆ Saturday ┆ 2025-01-13 │
│ 2025-01-10 ┆ 2      ┆ Saturday ┆ null     ┆ 2025-01-13 │
└────────────┴────────┴──────────┴──────────┴────────────┘
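To sanity-check individual rows without Polars, the same arithmetic can be reproduced with the standard library. This is a hedged reference sketch, not how Polars implements it: add_business_days_py and WEEKDAYS are hypothetical names, only offsets of zero or more are handled, and a start date that falls on a non-business day raises an error, which is assumed to mirror the default roll behaviour of add_business_days.

```python
from datetime import date, timedelta

# Monday-first weekday names; date.weekday() also counts Monday as 0.
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")


def add_business_days_py(start, n, weekend_days):
    """Step forward n business days from start, skipping the given weekend days."""
    # None (a null WKND2) never matches a weekday name, so it masks nothing.
    mask = [day not in weekend_days for day in WEEKDAYS]
    if n < 0:
        raise ValueError("this sketch only handles n >= 0")
    if not any(mask):
        raise ValueError("at least one weekday must be a business day")
    if not mask[start.weekday()]:
        raise ValueError(f"{start} is not a business day for this mask")
    current = start
    remaining = n
    while remaining > 0:
        current += timedelta(days=1)
        if mask[current.weekday()]:  # count only business days
            remaining -= 1
    return current


# The four sample records: (DT, N_DAYS, (WKND1, WKND2))
rows = [
    (date(2025, 1, 2), 2, ("Saturday", "Sunday")),
    (date(2025, 1, 9), 2, ("Friday", "Saturday")),
    (date(2025, 1, 10), 2, ("Saturday", None)),
    (date(2025, 1, 15), 1, ("Saturday", "Sunday")),
]
results = [add_business_days_py(dt, n, wk) for dt, n, wk in rows]
for (dt, n, wk), new_dt in zip(rows, results):
    print(dt, n, wk, "->", new_dt)
```

For the four sample rows this yields 2025-01-06, 2025-01-13, 2025-01-13 and 2025-01-16, which agrees with the NEW_DT column in the output above (the rows there are grouped by weekend pair, so they appear in a different order).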