Last In First Out (LIFO) pairing of transaction rows using SQL

Question

I have a transactions table and I am trying to map outflows to inflows on a LIFO basis.

Input dataset

Id	Date	Type	Amount
1	2024-01-26	Inflow	519
2	2024-01-26	Outflow	100
3	2024-01-26	Outflow	139
4	2024-01-26	Outflow	122
5	2024-01-29	Outflow	42
6	2024-01-29	Inflow	713
7	2024-01-29	Inflow	887
8	2024-01-29	Outflow	92
9	2024-01-29	Outflow	1593
10	2024-01-29	Outflow	25

Desired output mapping dataset

Inflow_Id	Inflow_Date	Outflow_Id	Outflow_Date	Inflow_Amount	Outflow_Amount	Mapped_Amount	Inflow_Left	Outflow_Left
1	2024-01-26	2	2024-01-26	519	100	100	419	0
1	2024-01-26	3	2024-01-26	519	139	139	280	0
1	2024-01-26	4	2024-01-26	519	122	122	158	0
1	2024-01-26	5	2024-01-29	519	42	42	116	0
7	2024-01-29	8	2024-01-29	887	92	92	795	0
7	2024-01-29	9	2024-01-29	887	1593	795	0	798
6	2024-01-29	9	2024-01-29	713	1593	713	0	85
1	2024-01-26	9	2024-01-29	519	1593	85	31	0
1	2024-01-26	10	2024-01-29	519	25	25	6	0

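If you look at the last four rows of the output dataset, outflows 9 and 10 end up mapped to inflow 1 because inflows 6 and 7 are exhausted first.

Between 6 and 7, which are both inflows, the subsequent outflows are mapped to 7 before 6 (last in, so first out).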

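Assumptions

Think of bank account transactions:

SUM(outflows) is never greater than SUM(inflows). It is technically possible for an account to go into overdraft (negative), but let's ignore that case, or simply leave that outflow unallocated.
The first transaction will usually be an inflow.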

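Logic to implement:

For each inflow, the outflows that follow it should be mapped to it first.
If the outflows following an inflow exceed the inflow amount, the excess outflow amount should be attributed to earlier inflows.
If an inflow has not been attributed yet, it needs to be shown as an inflow with no outflow mapped (it can be mapped later, when new outflows are added to the table).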
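
The three rules above behave like a stack of open inflows. As a reference for checking results, here is a minimal procedural sketch in Python (not the BigQuery solution being asked for). The function name `lifo_map`, the `Mapping` tuple and the `"in"`/`"out"` labels are assumptions made for this illustration, and amounts are assumed to be positive.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    inflow_id: int
    outflow_id: int
    mapped: int
    inflow_left: int
    outflow_left: int


def lifo_map(transactions):
    """Pair outflows with inflows, most recent open inflow first.

    `transactions` is an iterable of (id, kind, amount) in posting order,
    where kind is "in" or "out" (labels assumed for this sketch).
    """
    stack = []      # open inflows as [inflow_id, remaining]; top = most recent
    mappings = []
    for tx_id, kind, amount in transactions:
        if kind == "in":
            stack.append([tx_id, amount])
            continue
        outstanding = amount
        while outstanding > 0 and stack:
            top = stack[-1]
            take = min(outstanding, top[1])
            top[1] -= take
            outstanding -= take
            mappings.append(Mapping(top[0], tx_id, take, top[1], outstanding))
            if top[1] == 0:
                stack.pop()   # inflow exhausted; fall back to the previous one
        # Anything still outstanding here is an overdraft and stays unallocated.
    # Inflows that still have an unallocated balance, oldest first.
    open_inflows = [(inflow_id, left) for inflow_id, left in stack]
    return mappings, open_inflows


rows = [
    (1, "in", 519), (2, "out", 100), (3, "out", 139), (4, "out", 122),
    (5, "out", 42), (6, "in", 713), (7, "in", 887), (8, "out", 92),
    (9, "out", 1593), (10, "out", 25),
]
mappings, open_inflows = lifo_map(rows)
for m in mappings:
    print(*m, sep="\t")
```

On the input dataset this yields the nine inflow/outflow pairs of the desired output, with inflow 1 left holding an unallocated balance of 6.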

The FIFO logic from the following link seems straightforward, using running totals and previous totals: Row pairing based on FIFO logic in SQL Server 2018
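
For contrast, the running-total idea can be sketched as follows. This is only one reading of "running totals and previous totals", not the code from the linked answer: each row occupies the range from its previous total to its running total, and an inflow and an outflow are paired wherever their ranges overlap. The name `fifo_map` is hypothetical, and the sketch assumes every outflow is covered by inflows posted before it.

```python
from itertools import accumulate


def fifo_map(inflows, outflows):
    """Pair (id, amount) lists first-in-first-out via cumulative ranges."""
    def ranges(items):
        ends = list(accumulate(amount for _, amount in items))   # running totals
        starts = [0] + ends[:-1]                                 # previous totals
        return [(item_id, s, e) for (item_id, _), s, e in zip(items, starts, ends)]

    pairs = []
    for in_id, in_start, in_end in ranges(inflows):
        for out_id, out_start, out_end in ranges(outflows):
            # Width of the overlap between the two cumulative ranges.
            overlap = min(in_end, out_end) - max(in_start, out_start)
            if overlap > 0:
                pairs.append((in_id, out_id, overlap))
    return pairs


inflows = [(1, 519), (6, 713), (7, 887)]
outflows = [(2, 100), (3, 139), (4, 122), (5, 42), (8, 92), (9, 1593), (10, 25)]
pairs = fifo_map(inflows, outflows)
for pair in pairs:
    print(pair)
```

Under FIFO, outflow 8 is taken from inflow 1 and outflow 10 from inflow 7, which is exactly where the LIFO result differs.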

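LIFO, however, seems more complex. It feels like some kind of recursion is needed, because the running total has to be reset after each inflow and outflows have to be allocated to earlier inflows. I would actually prefer a solution without recursion, but I don't know whether that is possible.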

I am trying to implement this on GCP (BigQuery), on a dataset with 100 transactions per customer and roughly 4 million customers.

Answer 1

LIFO works for discrete values:

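# Toy data: +1 is one unit of inflow, -1 is one unit of outflow.
WITH raw AS (
  SELECT "u" AS user, id + 1 AS id, x
  FROM UNNEST([1,1,1,1,1,-1,-1,1,-1,-1,1,1,-1,-1,-1]) AS x WITH OFFSET id
),
tbl1 AS (
  # Running balance per user after each row.
  SELECT *, SUM(x) OVER win1 AS sums
  FROM raw
  WINDOW win1 AS (PARTITION BY user ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
)
# An outflow is keyed by the balance before it (sums + 1), an inflow by the balance after it;
# each outflow takes the id of the latest earlier row with the same key.
SELECT *,
  IF(x < 0, LAST_VALUE(id) OVER win1, NULL) AS take_last_id
FROM tbl1
WINDOW win1 AS (PARTITION BY user, (IF(x < 0, sums + 1, sums)) ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
ORDER BY 1, 2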

So the best approach is to unnest the dataset into one row per unit of value.
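
To see why partitioning by the running balance pairs the rows correctly, the same idea can be sketched procedurally in Python. This is an illustration, not a translation of the query: `match_units` and its variable names are hypothetical. Each unit inflow raises the balance to some level and each unit outflow leaves a level, so an outflow pairs with the latest inflow that reached the level it leaves.

```python
def match_units(flows):
    """flows: list of +1 (inflow unit) / -1 (outflow unit); ids are 1-based positions.

    Returns {outflow_id: inflow_id}.
    """
    last_inflow_at_level = {}   # balance level -> id of the latest inflow that reached it
    balance = 0
    taken = {}
    for row_id, x in enumerate(flows, start=1):
        if x > 0:
            balance += 1
            last_inflow_at_level[balance] = row_id
        else:
            # The outflow leaves level `balance` (the query keys it as sums + 1).
            taken[row_id] = last_inflow_at_level.get(balance)
            balance -= 1
    return taken


result = match_units([1, 1, 1, 1, 1, -1, -1, 1, -1, -1, 1, 1, -1, -1, -1])
print(result)
```

On the toy array, outflow 6 pairs with inflow 5, outflow 7 with inflow 4, outflow 9 with inflow 8, and so on, which is the role `take_last_id` plays in the query.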

For continuous values, the partition by only works if the value is rounded down to an entry that already exists. This is done in a subselect.

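# Sample data from the question: positive = inflow, negative = outflow.
WITH raw AS (
  SELECT "u" AS user, id + 1 AS id, x
  FROM UNNEST([519,-100,-139,-122,-42,713,887,-92,-1593,-25])
  #UNNEST([519,-100,-139,-122,-42,713,887,-92,-500,-500,-500,-25,5000,-200])
  AS x WITH OFFSET id
),
tbl1 AS (
  # Running balance per user after each row.
  SELECT *, SUM(x) OVER win1 AS sums
  FROM raw
  WINDOW win1 AS (PARTITION BY user ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
),
tbl2 AS (
  # Collect every row so far (running balance and amount) into an array.
  SELECT *,
    ARRAY_AGG(STRUCT(sums AS sumsv, x AS xv)) OVER win1 AS flow
  FROM tbl1
  WINDOW win1 AS (PARTITION BY user ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
),
tbl3 AS (
  # Subselect: among strictly earlier inflows whose running balance is at least
  # the balance before this row (sums - x), take the running balance of the earliest one.
  SELECT *,
    (
      SELECT ANY_VALUE(y.sumsv HAVING MIN out_ok)
      FROM (
        SELECT y, IF(y.sumsv >= sums - x AND id > in1 + 1 AND y.xv > 0, in1, NULL) AS out_ok
        FROM UNNEST(flow) AS y WITH OFFSET in1
      )
    ) AS sum_find_prev
  FROM tbl2
),
tbl4 AS (
  # Outflows are partitioned by the balance found above, inflows by their own balance;
  # each outflow then takes the id of the latest earlier inflow in its partition.
  SELECT *,
    IF(x < 0, LAST_VALUE(IF(x > 0, id, NULL) IGNORE NULLS) OVER win2, NULL) AS take_last_id
  FROM tbl3
  WINDOW win2 AS (PARTITION BY user, (IF(x < 0, sum_find_prev, sums)) ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
)
SELECT * EXCEPT(flow)
FROM tbl4
ORDER BY 1, 2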
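
One way to read "rounded down to an entry that already exists" is to cut the balance axis at every running balance that is ever reached, so that each row spans whole segments, and then pair segments the same way single units are paired in the discrete case. The Python sketch below illustrates that reading under those assumptions. It is not a translation of the query above, and `lifo_by_levels` and its variable names are hypothetical. Unlike `take_last_id`, it splits one outflow across several inflows.

```python
from itertools import accumulate


def lifo_by_levels(amounts):
    """amounts: signed values in posting order (+ inflow, - outflow); ids are 1-based.

    Returns a list of (inflow_id, outflow_id, mapped_amount).
    """
    after = list(accumulate(amounts))            # running balance after each row
    before = [0] + after[:-1]                    # running balance before each row
    cuts = sorted(set(before) | set(after))      # every balance level ever reached
    index = {level: i for i, level in enumerate(cuts)}
    owner = {}                                   # segment start -> latest inflow covering it
    pairs = []
    for row_id, (b, a) in enumerate(zip(before, after), start=1):
        if a > b:
            # Inflow: claims every segment between the old and the new balance.
            for i in range(index[b], index[a]):
                owner[cuts[i]] = row_id
            continue
        # Outflow: releases segments from the top of the balance downwards.
        for i in range(index[b] - 1, index[a] - 1, -1):
            inflow_id = owner.get(cuts[i])
            if inflow_id is None:
                continue                         # overdraft: left unallocated
            size = cuts[i + 1] - cuts[i]
            if pairs and pairs[-1][0] == inflow_id and pairs[-1][1] == row_id:
                pairs[-1][2] += size             # same pair, adjacent segment: merge
            else:
                pairs.append([inflow_id, row_id, size])
    return [tuple(p) for p in pairs]


result = lifo_by_levels([519, -100, -139, -122, -42, 713, 887, -92, -1593, -25])
for pair in result:
    print(pair)
```

On the question's data this splits outflow 9 into 795 from inflow 7, 713 from inflow 6 and 85 from inflow 1, in line with the desired output.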
