BigQuery: window function that uses the current row's value


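I want to compute the mean and STDDEV of normalized finishing times for a race. First I want to compute each racer's pace as finishing time / distance. Next, I want the average of those paces, normalized to the current row's distance.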

For example, to compute the average for row 5, excluding row 5 itself: AVG(Pace * current_distance) = (Pace_1 * distance_5 + Pace_2 * distance_5 + Pace_3 * distance_5 + Pace_4 * distance_5) / 4

Pace_1 = finishing_time_1 / distance_1

I tried this in two different ways, but neither works:

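  • AVG(Pace * distance) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS avg_normalized_finishing_time1: this does not multiply each pace by the "row 5" distance; each pace is multiplied by its own row's distance.

  • (AVG(finishing_time / distance) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)) * distance AS avg_normalized_finishing_time2: this takes the average first and then multiplies by the distance. I think this may be correct, but I don't think it would work for STDDEV.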

sql google-bigquery aggregate-functions
1 answer

The task is to predict the expected finishing time for a sports event from the paces of earlier races.

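Let's start with some random data in the CTE sample. Each of the tbl CTEs then computes additional columns. Use the window clause partition by player to keep the players separate. win1 covers all races of that player, win2 only the previous races. Since the task is to predict the finishing time, we sum all finishing times and subtract the finishing time of the current race. Dividing by the number of races minus one then gives the average. For STDDEV, use the formula \sqrt{\sum_i (x_i - \text{average})^2 / N}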
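As a worked instance of the arithmetic just described (a sketch only; the symbols x_i, N and \bar{x}_{-k} are notation introduced here for illustration, not taken from the answer), the average that leaves out race k, evaluated on the sample finishing times 100, 200, 330, 400, 500 with the third race left out:

```latex
% Leave-one-out average: sum of all finishing times minus the current one,
% divided by the number of races minus one.
\bar{x}_{-k} = \frac{\left(\sum_{i=1}^{N} x_i\right) - x_k}{N - 1}

% Sample data, leaving out k = 3 (x_3 = 330):
\bar{x}_{-3} = \frac{(100 + 200 + 330 + 400 + 500) - 330}{5 - 1}
             = \frac{1200}{4} = 300
```

Note that this leaves out only the current race, so later races still contribute; win2 restricts the window to earlier races instead.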

-- Sample data: one racer, five races; the distance grows by 1000 per race.
WITH sample AS (
  SELECT "racer1" AS player, offset AS date, finish_time, (offset + 1) * 1000 AS distance
  FROM UNNEST([100, 200, 330, 400, 500]) AS finish_time WITH OFFSET
),
-- Pace of each race.
tbl1 AS (
  SELECT *,
    finish_time / distance AS pace
  FROM sample
),
tbl2 AS (
  SELECT *,
    AVG(pace * distance) OVER win1 AS avg_normalized_finishing,
    AVG(pace * distance) OVER win2 AS avg_normalized_finishing_previous_ones,
    -- Sum over all of the player's races, minus the current race.
    SUM(pace * distance) OVER win1 - pace * distance AS sum_finishing_times,
    -- Average over the player's other races.
    (SUM(pace * distance) OVER win1 - pace * distance) / (COUNT(pace * distance) OVER win1 - 1) AS avg_finishing_times,
    COUNT(pace * distance) OVER win1 AS races_done
  FROM tbl1
  WINDOW win1 AS (PARTITION BY player),
    win2 AS (PARTITION BY player ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
),
tbl3 AS (
  SELECT *,
    -- Sum of squared deviations over all races, minus the current race's term,
    -- divided by races_done - 1.
    SQRT((SUM(POW(pace * distance - avg_finishing_times, 2)) OVER win1 - POW(pace * distance - avg_finishing_times, 2)) / (races_done - 1)) AS STDDEV
  FROM tbl2
  WINDOW win1 AS (PARTITION BY player)
)

SELECT * FROM tbl3
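For the normalization the question asks about (each earlier pace multiplied by the current row's distance), a shorter route may be possible. This is a minimal sketch, not the method of the query above. It relies on the current distance d being a constant inside the window, so AVG(pace_i * d) = d * AVG(pace_i), and for d >= 0 also STDDEV(pace_i * d) = d * STDDEV(pace_i). The CTE name paces, the window name prev and the output column names are made up for this illustration. STDDEV_SAMP is used as the sample standard deviation; STDDEV_POP would give the population version.

```sql
-- Sketch: statistics over previous races, scaled to the current row's distance.
WITH sample AS (
  SELECT "racer1" AS player, offset AS date, finish_time, (offset + 1) * 1000 AS distance
  FROM UNNEST([100, 200, 330, 400, 500]) AS finish_time WITH OFFSET
),
paces AS (
  SELECT *, finish_time / distance AS pace
  FROM sample
)
SELECT
  *,
  -- Average of earlier paces, then scaled by this row's distance.
  (AVG(pace) OVER prev) * distance AS avg_normalized_finishing_time,
  -- Standard deviation of earlier paces, scaled the same way (distance >= 0 assumed).
  (STDDEV_SAMP(pace) OVER prev) * distance AS stddev_normalized_finishing_time
FROM paces
WINDOW prev AS (
  PARTITION BY player
  ORDER BY date
  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
ORDER BY player, date
```

On the sample data the last row has earlier paces 0.1, 0.1, 0.11 and 0.1, so its average should come out at about 0.1025 * 5000 = 512.5. The first row has no earlier races, so both columns are NULL there, and the standard deviation is also NULL when only one earlier race exists.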