为什么“segmented”包中的“segmented”和“selgmented”函数给出不同的输出?

问题描述 投票:0回答:1

我使用

selgmented
包中的
segmented
函数来选择时间序列回归中的断点数量。该函数的行为符合预期并提供了一个
segmented
对象。但是,如果我使用
segmented
函数运行相同的数据集,并将
npsi
设置为由
selgmented
确定的断点数量,有时会返回不同的结果(即不同的段斜率和不同的断点)。我尝试使用
seg.control
修改一些参数,但通常无法在两个函数之间获得相同的结果。我想知道这些函数之间有什么不同以及为什么它们会产生不同的结果。通常这些差异很小,但我想了解它们发生的原因。

我提供了一个使用分段包中包含的植物器官数据集的示例。

library(segmented) 

data(plant)
plant_RKV <- plant[which(plant$group=="RKV"), ]

out.lm <- lm(y ~ time, data=plant_RKV)

os <- selgmented(out.lm, type="bic", Kmax=5)
o <- segmented(out.lm, npsi=4)

> summary(os)

        ***Regression Model with Segmented Relationship(s)***

Call: 
segmented.lm(obj = olm, seg.Z = seg.Z, psi = startpsi[[i - 1]], 
    control = control1)

Estimated Break-Point(s):
              Est. St.Err
psi1.time 314.495 13.909
psi2.time 471.338 15.939
psi3.time 542.613  9.392
psi4.time 565.911  7.278

Meaningful coefficients of the linear terms:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.0694477  0.0281773   2.465    0.022 *  
time         0.0024848  0.0001249  19.895 1.49e-15 ***
U1.time     -0.0015098  0.0002187  -6.903       NA    
U2.time     -0.0030369  0.0011887  -2.555       NA    
U3.time      0.0056947  0.0022973   2.479       NA    
U4.time     -0.0043988  0.0019883  -2.212       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.02418 on 22 degrees of freedom
Multiple R-Squared: 0.9856,  Adjusted R-squared: 0.9798 

Convergence attained in 3 iterations (rel. change 2.3914e-10)

> summary(o)

        ***Regression Model with Segmented Relationship(s)***

Call: 
segmented.lm(obj = out.lm, npsi = 4)

Estimated Break-Point(s):
              Est. St.Err
psi1.time 295.723 16.654
psi2.time 458.156 11.044
psi3.time 544.410  8.446
psi4.time 566.120  7.659

Meaningful coefficients of the linear terms:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.0497251  0.0359206   1.384     0.18    
time         0.0025961  0.0001788  14.522  9.4e-13 ***
U1.time     -0.0014525  0.0002439  -5.956       NA    
U2.time     -0.0027260  0.0006178  -4.412       NA    
U3.time      0.0050112  0.0020535   2.440       NA    
U4.time     -0.0041930  0.0019796  -2.118       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.02407 on 22 degrees of freedom
Multiple R-Squared: 0.9858,  Adjusted R-squared: 0.9799 

Boot restarting based on 6 samples. Last fit:
Convergence *not* attained in 1 iterations (rel. change 0.062135)

如果我设置引导样本的数量和最大迭代次数,那么我可以获得对齐的输出。

out.lm <- lm(y ~ time, data=data_plant)

os <- selgmented(out.lm, type="bic", Kmax=5, control=seg.control(n.boot=1000, it.max = 5))
o <- segmented(out.lm, npsi=4, control=seg.control(n.boot=1000, it.max = 5))

> summary(os)

        ***Regression Model with Segmented Relationship(s)***

Call: 
segmented.lm(obj = olm, seg.Z = seg.Z, psi = startpsi[[i - 1]], 
    control = control1)

Estimated Break-Point(s):
              Est. St.Err
psi1.time 300.220 14.015
psi2.time 469.728 15.680
psi3.time 542.613  9.245
psi4.time 565.911  7.164

Meaningful coefficients of the linear terms:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.0519970  0.0307320   1.692    0.105    
time         0.0025830  0.0001437  17.971 1.23e-14 ***
U1.time     -0.0015229  0.0002118  -7.190       NA    
U2.time     -0.0031220  0.0011670  -2.675       NA    
U3.time      0.0056947  0.0022612   2.518       NA    
U4.time     -0.0043988  0.0019571  -2.248       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0238 on 22 degrees of freedom
Multiple R-Squared: 0.9861,  Adjusted R-squared: 0.9804 

Convergence *not* attained in 5 iterations (rel. change 0.00043136) 

> summary(o)

        ***Regression Model with Segmented Relationship(s)***

Call: 
segmented.lm(obj = out.lm, npsi = 4, control = seg.control(n.boot = 1000, 
    it.max = 5))

Estimated Break-Point(s):
              Est. St.Err
psi1.time 300.220 14.015
psi2.time 469.728 15.680
psi3.time 542.613  9.245
psi4.time 565.911  7.164

Meaningful coefficients of the linear terms:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.0519970  0.0307320   1.692    0.105    
time         0.0025830  0.0001437  17.971 1.23e-14 ***
U1.time     -0.0015229  0.0002118  -7.190       NA    
U2.time     -0.0031220  0.0011670  -2.675       NA    
U3.time      0.0056947  0.0022612   2.518       NA    
U4.time     -0.0043988  0.0019571  -2.248       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0238 on 22 degrees of freedom
Multiple R-Squared: 0.9861,  Adjusted R-squared: 0.9804 

Boot restarting based on 12 samples. Last fit:
Convergence attained in 4 iterations (rel. change 2.2993e-12)

但是,使用

seg.control
更改参数似乎并不适用于所有示例。部分原因可能是由于我不熟悉这些参数的具体情况。

这是我自己的数据集(时间序列)中的第二个示例,其中两个函数之间的断点明显不同。我无法修改参数以使输出对齐。

x <- c(0, 43, 156, 238, 254, 323, 674, 870, 1193, 1555, 1926, 2279, 2660, 3103, 3479, 3832, 4214, 4929, 5665, 6401, 7122, 7850, 9314, 10062, 11151, 11869)
y <- c(7.31322039, 7.32974969, 6.85224257, 7.11882625, 6.56526497, 6.13122649, 6.30991828, 5.88610403, 4.78749174, 3.76120012, 4.02535169, 4.09434456, 2.23216263, 1.97685495, 2.86789890, 2.80336038, 1.28093385, 0.53062825, 0.83290912, 1.02961942, 0.64185389, 0.18232156, 0.09531018, 0.09531018, 0.33647224, 0.51282363)

out.lm2 <- lm(y ~ x)

os2 <- selgmented(out.lm2, type="bic", Kmax=5)
o2 <- segmented(out.lm2, npsi=2)

par(mfrow=c(1,2))
plot.segmented(os2)
plot.segmented(os2, res=TRUE, add=TRUE, conf.level=0.95, shade=TRUE)
plot.segmented(o2)
plot.segmented(o2, res=TRUE, add=TRUE, conf.level=0.95, shade=TRUE) 

comparison of segmented and selgmented piecewise regressions

我很感激任何有关

segmented
selgmented
函数的输出为何不同以及如何修改(如果可能)的反馈。

r regression segment piecewise
1个回答
0
投票

您最终对此有任何见解吗? 我有同样的问题。我通过分段获得了更好的结果(也就是说,它们更接近我通过查看和了解数据所期望的结果),但我无法证明它比分段使用更合理。

© www.soinside.com 2019 - 2024. All rights reserved.