如何找到 family = poisson(link = "log") 模型预测指定值的日期?

问题描述 投票:0回答:1

我的数据由日期值组成,此处表示为时间戳:

json <- structure(list(creation_date = c(1689356840L, 1689366017L, 1689376446L, 
1689504809L, 1690366268L, 1690472012L, 1690478723L, 1690701972L, 
1690701973L, 1690746389L, 1690746392L, 1690831235L, 1691022236L, 
1692097787L, 1692625463L, 1692699453L, 1692828527L, 1692828527L, 
1692828550L, 1692829734L, 1692878775L, 1692879861L, 1693058927L, 
1693082719L, 1693106370L, 1693115869L, 1693120632L, 1693121481L, 
1693137142L, 1693164782L, 1693172783L, 1693175185L, 1693208767L, 
1693909005L, 1693914013L, 1693921364L, 1693928549L, 1693930560L, 
1693999563L, 1694001298L, 1694003421L, 1694003758L, 1694005624L, 
1694015620L, 1694015743L, 1694074163L, 1694153769L, 1694181062L, 
1694194729L, 1694194740L, 1694194740L, 1694194744L, 1694209208L, 
1694594207L, 1694798845L, 1694798846L, 1694809497L, 1694883734L, 
1694901062L, 1695124304L, 1695221997L, 1695222830L, 1695259750L, 
1695259766L, 1695260229L, 1695283945L, 1695329985L, 1695330735L, 
1695363048L, 1695399875L, 1695403486L, 1695409526L, 1695458043L, 
1695464399L, 1695469820L, 1695472988L, 1695479007L, 1695483407L, 
1695485270L, 1695485763L, 1695485763L, 1695493324L, 1695505234L, 
1695505235L, 1695509011L, 1695517370L, 1695527643L, 1695556326L, 
1695567795L, 1695567798L, 1695575998L, 1695576929L, 1695589633L, 
1695596154L, 1695607375L, 1695633830L, 1695657647L, 1695658242L, 
1695710762L, 1695719101L), cumulative_reputation = c(100L, 110L, 
120L, 130L, 140L, 150L, 160L, 170L, 185L, 200L, 185L, 195L, 205L, 
215L, 213L, 228L, 230L, 240L, 230L, 240L, 250L, 252L, 262L, 272L, 
282L, 292L, 302L, 312L, 322L, 332L, 342L, 352L, 362L, 372L, 382L, 
392L, 402L, 412L, 422L, 432L, 442L, 452L, 462L, 472L, 482L, 492L, 
502L, 512L, 522L, 512L, 510L, 512L, 522L, 532L, 547L, 557L, 567L, 
577L, 587L, 597L, 607L, 617L, 632L, 617L, 627L, 637L, 647L, 657L, 
672L, 682L, 692L, 707L, 717L, 727L, 737L, 747L, 745L, 755L, 757L, 
747L, 745L, 755L, 753L, 755L, 765L, 775L, 785L, 795L, 805L, 795L, 
805L, 815L, 825L, 835L, 845L, 847L, 857L, 867L, 877L, 887L)), row.names = c(670L, 
669L, 668L, 667L, 666L, 665L, 664L, 663L, 662L, 661L, 660L, 659L, 
658L, 657L, 656L, 655L, 653L, 654L, 652L, 651L, 650L, 649L, 648L, 
647L, 646L, 645L, 644L, 643L, 642L, 641L, 640L, 639L, 638L, 637L, 
636L, 635L, 634L, 633L, 632L, 631L, 630L, 629L, 628L, 627L, 626L, 
625L, 624L, 623L, 622L, 620L, 621L, 619L, 618L, 617L, 616L, 615L, 
614L, 613L, 612L, 611L, 610L, 609L, 608L, 607L, 606L, 605L, 604L, 
603L, 602L, 601L, 600L, 599L, 598L, 597L, 596L, 595L, 594L, 593L, 
592L, 590L, 591L, 589L, 588L, 587L, 586L, 585L, 584L, 583L, 582L, 
581L, 580L, 579L, 578L, 577L, 576L, 575L, 574L, 573L, 572L, 571L
), class = "data.frame")

我已经为该数据拟合了一个模型,现在可以计算预测值:

model <- glm(cumulative_reputation ~ creation_date, data = json, family = poisson(link = "log"))
prediction <- predict(model, json)

如何找到模型预测特定值的日期(或时间戳)?

为了找到该日期,我计算了从最后一次测量日期 (1695719101) 到未来一年 (1695719101 + 365 * 24 * 60 * 60) 的值,然后在结果值中搜索最接近我的值想要(cumulative_reputation = 10000)。

有没有办法直接计算这个日期(或时间戳)?


注:

这个问题是这个问题的后续问题。

r regression prediction data-fitting
1个回答
0
投票

该模型为您提供了曲线方程,然后您可以使用该方程针对特定值求根。例如:

## This is essentially what 'predict(model, type="response")' does
COEF <- coef(model)
pred <- function(t) exp(COEF[1] + t*COEF[2])

## Find the root for a certain target response
optfn <- function(t, target=10000) pred(t) - target
uniroot(optfn, c(1E9, 5E9))$root |> as.POSIXct(tz="UTC")

#> "2023-12-20 12:07:04 UTC"

只是为了确认这个时间戳确实是解决方案:

predict(model, list(creation_date=1703074025), type="response")
#> 10000
© www.soinside.com 2019 - 2024. All rights reserved.