How to select V1 rows at 3-, 6-, and 12-month intervals

Suppose I have a table like the one below.

month | V1 |
---|---|
202307 | 10 |
202306 | 20 |
202305 | 30 |
202304 | 40 |
202303 | 50 |
202302 | 60 |
202301 | 70 |
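I want to produce a result like the table below. V2 is the V1 value from 3 months back and V3 is the V1 value from 6 months back. The latest month counts as month 1, so V2 comes from 202305 and V3 from 202302.

month | V1 | V2 | V3 |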
---|---|---|---|
202307 | 10 | 30 | 60 |
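
Try the following. It computes each row's inclusive month offset from the latest month, then picks the V1 values at offsets 1, 3, and 6.
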
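    import org.apache.spark.sql.functions.expr

    df
      // Latest month in the table, parsed as a date (202307 -> 2023-07-01).
      .withColumn(
        "first_date",
        expr("""FIRST(TO_DATE(month, 'yyyyMM')) OVER (ORDER BY month DESC)""")
      )
      // Inclusive offset from the latest month: 202307 -> 1, 202305 -> 3, 202302 -> 6.
      .withColumn(
        "diff",
        expr("""CAST(MONTHS_BETWEEN(first_date, TO_DATE(month, 'yyyyMM')) AS INT) + 1""")
      )
      // Collapse everything into one row, keeping only offset 1 and multiples of 3.
      .selectExpr("FILTER(collect_list(struct(first_date, v1, diff)), e -> e.diff == 1 OR e.diff % 3 == 0) AS list")
      // Pull out one column per offset; a missing offset yields NULL.
      .selectExpr(
        // Format the date back to yyyyMM so the output matches the expected table.
        "DATE_FORMAT(list.first_date[0], 'yyyyMM') AS month",
        "FILTER(list, e -> e.diff == 1).v1[0] AS V1",
        "FILTER(list, e -> e.diff == 3).v1[0] AS V2",
        "FILTER(list, e -> e.diff == 6).v1[0] AS V3"
      )
      .show(false)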
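
To make the offset arithmetic easier to check without a Spark session, here is a minimal plain-Scala sketch of the same idea. It is not part of the Spark answer: the `MonthOffsets` object and its `diff` and `pick` helpers are hypothetical names introduced only for illustration, and it assumes months are `yyyyMM` strings as in the question.

```scala
import java.time.YearMonth
import java.time.temporal.ChronoUnit

// Hypothetical helper, for illustration only.
object MonthOffsets {
  // Parse a yyyyMM string such as "202307" into a YearMonth.
  private def parse(s: String): YearMonth =
    YearMonth.of(s.take(4).toInt, s.drop(4).toInt)

  // Inclusive month offset: the latest month itself is 1, the month before is 2, ...
  def diff(latest: String, month: String): Int =
    ChronoUnit.MONTHS.between(parse(month), parse(latest)).toInt + 1

  // Return the latest month and the V1 value at each requested inclusive offset.
  // An offset with no matching row yields None (the analogue of NULL in Spark).
  def pick(rows: Seq[(String, Int)], offsets: Seq[Int]): (String, Seq[Option[Int]]) = {
    val latest = rows.map(_._1).max // yyyyMM strings sort chronologically
    val byDiff = rows.map { case (m, v) => diff(latest, m) -> v }.toMap
    (latest, offsets.map(byDiff.get))
  }
}
```

With the seven rows from the question, `MonthOffsets.pick(rows, Seq(1, 3, 6, 12))` gives the latest month `"202307"` and the values `Some(10)`, `Some(30)`, `Some(60)`, `None`. The last entry is empty because the sample data covers only seven months, so a 12-month column would need at least twelve months of rows.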