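I want to calculate the average and STDDEV of normalized finishing times for a race. First, I want to calculate each racer's pace as finishing_time / distance. Next, I want the average of those paces, normalized to the current row's distance.
For example, the average for row 5, excluding row 5 itself, would be: AVG(Pace * current_distance) = (Pace_1 * distance_5 + Pace_2 * distance_5 + Pace_3 * distance_5 + Pace_4 * distance_5) / 4
where Pace_1 = finishing_time_1 / distance_1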
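I tried this in two different ways, but I can't get it to work:
AVG(Pace * distance) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS avg_normalized_finishing_time1, which multiplies each pace by its own row's distance rather than by the "row 5" distance.
(AVG(finishing_time / distance) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)) * distance AS avg_normalized_finishing_time2, which takes the average first and then multiplies by the distance. I think this might be correct, but I don't think the same approach works for STDDEV.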
The task is to calculate the predicted finishing time for a sporting event from the paces of the previous races.
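One way to write that prediction down, as a sketch in notation that is assumed here rather than taken from the post: let $p_i = t_i / d_i$ be the pace in race $i$ (finishing time over distance), $d_k > 0$ the distance of the current race $k$, and $\bar{p}_{k-1}$ the mean pace over races $1, \dots, k-1$. Because $d_k$ is the same constant for every earlier race in the frame, it factors out of both the mean and the sample standard deviation (the latter needs $k \ge 3$):

```latex
\begin{aligned}
\hat{\mu}_k &= \frac{1}{k-1}\sum_{i=1}^{k-1} p_i\, d_k
             = d_k \cdot \frac{1}{k-1}\sum_{i=1}^{k-1} p_i
             = d_k\, \bar{p}_{k-1}, \\
\hat{\sigma}_k &= \sqrt{\frac{1}{k-2}\sum_{i=1}^{k-1}\bigl(p_i\, d_k - \hat{\mu}_k\bigr)^2}
                = d_k \sqrt{\frac{1}{k-2}\sum_{i=1}^{k-1}\bigl(p_i - \bar{p}_{k-1}\bigr)^2}.
\end{aligned}
```

So averaging the paces first and multiplying by the current distance gives the same number as averaging Pace * distance_k directly, and the same holds for the standard deviation as long as the distance is positive.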
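Let's start with some sample data in the CTE sample. The additional columns are then calculated in each tbl CTE. Use the window function with partition by player to keep the players apart. win1 covers all races of that player, win2 only the previous races. Since the task is to predict the finishing time, we sum all finishing times and subtract the finishing time of the current race, then divide by the number of races minus one to get the average. For STDDEV, the formula is applied in tbl3: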
-- Sample data: one racer, five races; the distance grows by 1000 per race
WITH sample AS (
  SELECT "racer1" AS player, offset AS date, finish_time, (offset + 1) * 1000 AS distance
  FROM UNNEST([100, 200, 330, 400, 500]) AS finish_time WITH OFFSET
),
-- Pace per race: finishing time divided by distance
tbl1 AS (
  SELECT *,
    finish_time / distance AS pace
  FROM sample
),
-- Averages over all races (win1) and over earlier races only (win2)
tbl2 AS (
  SELECT *,
    AVG(pace * distance) OVER win1 AS avg_normalized_finishing,
    AVG(pace * distance) OVER win2 AS avg_normalized_finishing_previous_ones,
    -- Sum over all of the player's races, minus the current race
    SUM(pace * distance) OVER win1 - pace * distance AS sum_finishing_times,
    -- Leave-one-out mean: that sum divided by (number of races - 1)
    (SUM(pace * distance) OVER win1 - pace * distance) / (COUNT(pace * distance) OVER win1 - 1) AS avg_finishing_times,
    COUNT(pace * distance) OVER win1 AS races_done
  FROM tbl1
  WINDOW win1 AS (PARTITION BY player),
    win2 AS (PARTITION BY player ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
),
-- Square root of: squared deviations from avg_finishing_times summed over the
-- player's races, minus the current race's term, divided by (races_done - 1)
tbl3 AS (
  SELECT *,
    SQRT(
      (SUM(POW(pace * distance - avg_finishing_times, 2)) OVER win1
        - POW(pace * distance - avg_finishing_times, 2))
      / (races_done - 1)
    ) AS STDDEV
  FROM tbl2
  WINDOW win1 AS (PARTITION BY player)
)
SELECT * FROM tbl3
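
If the goal is the question's original one, with every earlier pace scaled to the current race's distance, a more direct sketch is possible. It assumes BigQuery Standard SQL and reuses the sample data from above; the window name prev and the output columns predicted_finish_time and predicted_finish_stddev are illustrative names, not taken from the original post. Because the current distance is constant within each row's frame, it can be multiplied in after the aggregate, for STDDEV_SAMP as well as for AVG:

```sql
WITH sample AS (
  SELECT 'racer1' AS player, offset AS date, finish_time, (offset + 1) * 1000 AS distance
  FROM UNNEST([100, 200, 330, 400, 500]) AS finish_time WITH OFFSET
)
SELECT
  *,
  -- Mean pace of the earlier races, scaled to the current race's distance
  (AVG(finish_time / distance) OVER prev) * distance AS predicted_finish_time,
  -- Sample standard deviation of the earlier paces, scaled the same way
  (STDDEV_SAMP(finish_time / distance) OVER prev) * distance AS predicted_finish_stddev
FROM sample
-- Frame: this player's races strictly before the current one
WINDOW prev AS (
  PARTITION BY player
  ORDER BY date
  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
ORDER BY date
```

For the last sample row the earlier paces are 0.1, 0.1, 0.11 and 0.1, so at a distance of 5000 this should give a predicted time of about 512.5 with a standard deviation of about 25. The first row has no earlier races and the second has only one, so the prediction is NULL for the first row and the standard deviation is NULL for the first two.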