caret package: train() crashes when using gamLoess with degree = 2

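I am trying to run 5-fold cross-validation on a generalized additive model with LOESS (gamLoess, from the gam package) using the caret package. I want to test all possible degree options (degree = 0, 1 and 2). The problem is that R crashes when I use degree = 2. I have seen a similar question before ("R crashes when training using caret and method = gamLoess"), but I do not understand how to solve it. It looks like a bug in the gam package. Is there a way around this?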

My code is:

#The data

turning_rate_4954

# A tibble: 100 x 2
     Time_s turn_rate_dgs_s
      <dbl>           <dbl>
  1 0                   0  
  2 0.00416           115. 
  3 0.00832           559. 
  4 0.0125            935. 
  5 0.0166            986. 
  6 0.0208           1606. 
  7 0.0250           1578. 
  8 0.0291           2195. 
  9 0.0333           1178. 
 10 0.0374           1699. 
 11 0.0416           1875. 
 12 0.0458           1648. 
 13 0.0499           1597. 
 14 0.0541           2239. 
 15 0.0582           2221. 
 16 0.0624           2278. 
 17 0.0666           1783. 
 18 0.0707           1678. 
 19 0.0749           1747. 
 20 0.0790           1479. 
 21 0.0832           2035. 
 22 0.0874           2378. 
 23 0.0915           1826. 
 24 0.0957           1659. 
 25 0.0998           2344. 
 26 0.104            1839. 
 27 0.108            1044. 
 28 0.112            1789. 
 29 0.116             721. 
 30 0.121             946. 
 31 0.125             143. 
 32 0.129             376. 
 33 0.133               0  
 34 0.137            -418. 
 35 0.141             127. 
 36 0.146           -1053. 
 37 0.150            -535. 
 38 0.154              87.4
 39 0.158            -437. 
 40 0.162            -730. 
 41 0.166            -441. 
 42 0.171            -553. 
 43 0.175            -893. 
 44 0.179            -694. 
 45 0.183            -847. 
 46 0.187             313. 
 47 0.191             581. 
 48 0.196            1121. 
 49 0.200            1753. 
 50 0.204            1504. 
 51 0.208            1185. 
 52 0.212            1659. 
 53 0.216             802. 
 54 0.220            1570. 
 55 0.225            1521. 
 56 0.229            1620. 
 57 0.233             732. 
 58 0.237            1263. 
 59 0.241            1590. 
 60 0.245            1279. 
 61 0.250            1133. 
 62 0.254            -187. 
 63 0.258             187. 
 64 0.262             165. 
 65 0.266             183. 
 66 0.270            -507. 
 67 0.275               0  
 68 0.279            -376. 
 69 0.283             376. 
 70 0.287            -492. 
 71 0.291            -147. 
 72 0.295            -468. 
 73 0.300            -322. 
 74 0.304            -122. 
 75 0.308            -273. 
 76 0.312             139. 
 77 0.316             615. 
 78 0.320             346. 
 79 0.324            1011. 
 80 0.329            1114. 
 81 0.333            1315. 
 82 0.337             737. 
 83 0.341             858. 
 84 0.345            1374. 
 85 0.349             816. 
 86 0.354             488. 
 87 0.358             979. 
 88 0.362              69.2
 89 0.366             304. 
 90 0.370             622. 
 91 0.374            -195. 
 92 0.379             497. 
 93 0.383            -199. 
 94 0.387             492. 
 95 0.391              40.6
 96 0.395             170. 
 97 0.399             -39.0
 98 0.404            -258. 
 99 0.408               0  
100 0.412             258. 



#Cross Validation
library(caret)
library(gam)

# 5-fold cross-validation: each fold holds out about 20% of the observations.
# Note: p is only used by method = "LGOCV", so it has no effect with method = "cv".
control <- trainControl(method = "cv",
                        number = 5,
                        p = 0.9,
                        savePredictions = TRUE)

grid <- expand.grid(span = seq(0.1, 0.65, length.out = 10),
                    degree = c(0, 1, 2))

train_loess <- train(turn_rate_dgs_s ~ Time_s,
                     method = "gamLoess",
                     tuneGrid = grid,
                     trControl = control,
                     data = turning_rate_4954)


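It ran once with degree = 2 included, but I could not save the results to show here. It has never worked again since. It only runs without crashing if I restrict the grid to degree = c(0, 1).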

I am using:

R version 3.6.2

caret version 6.0-86

gam version 1.16.1

macOS Mojave 10.14.6

Session info from sessionInfo():

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gam_1.16.1        foreach_1.5.0     caret_6.0-86      lattice_0.20-38   fANCOVA_0.5-1     readxl_1.3.1      patchwork_1.0.0   viridis_0.5.1    
 [9] viridisLite_0.3.0 forcats_0.5.0     stringr_1.4.0     purrr_0.3.4       readr_1.3.1       tidyr_1.1.0       tibble_3.0.1      ggplot2_3.3.1    
[17] tidyverse_1.3.0   dplyr_1.0.0       plyr_1.8.6       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4           lubridate_1.7.8      class_7.3-15         assertthat_0.2.1     ipred_0.9-9          utf8_1.1.4           R6_2.4.1            
 [8] cellranger_1.1.0     backports_1.1.7      stats4_3.6.2         reprex_0.3.0         httr_1.4.1           pillar_1.4.4         rlang_0.4.6         
[15] rematch_1.0.1        data.table_1.12.8    rstudioapi_0.11      blob_1.2.1           rpart_4.1-15         Matrix_1.2-18        gower_0.2.1         
[22] munsell_0.5.0        broom_0.5.6          compiler_3.6.2       modelr_0.1.8         pkgconfig_2.0.3      nnet_7.3-12          tidyselect_1.1.0    
[29] prodlim_2019.11.13   gridExtra_2.3        codetools_0.2-16     fansi_0.4.1          crayon_1.3.4         dbplyr_1.4.4         withr_2.2.0         
[36] ModelMetrics_1.2.2.2 MASS_7.3-51.4        recipes_0.1.12       grid_3.6.2           nlme_3.1-142         jsonlite_1.6.1       gtable_0.3.0        
[43] lifecycle_0.2.0      DBI_1.1.0            magrittr_1.5         pROC_1.16.2          scales_1.1.1         cli_2.0.2            stringi_1.4.6       
[50] reshape2_1.4.4       fs_1.4.1             timeDate_3043.102    xml2_1.3.2           ellipsis_0.3.1       generics_0.0.2       vctrs_0.3.0         
[57] lava_1.6.7           iterators_1.0.12     tools_3.6.2          glue_1.4.1           hms_0.5.3            survival_3.1-8       colorspace_1.4-1    
[64] rvest_0.3.5          haven_2.3.1         
> 
1 Answer

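This is not an answer to the original question but a response to Todd Burus's comment, specifically about the warnings (see the comments). I do not know whether any of this is related to the segfault (and the R crash).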

When using only degrees 0 and 1 (leaving out degree = 2 so that R does not crash), like this:

grid <- expand.grid(span = seq(0.1, 0.65, len = 10), 
                    degree = seq(0,1, len=2)                    ) 

train_loess <- train(turn_rate_dgs_s ~ Time_s,
                        method = "gamLoess",
                        tuneGrid = grid,
                        trControl= control,
                        data = turning_rate_4954)

the warning

In lo.wam(x, z, wz, fit$smooth, which, fit$smooth.frame,  ... :
  degree must be at least 1 for vertex influence matrix

goes away if degree = 0 is not used, because in gam.lo the degree is restricted to 1 and 2. This contrasts with stats::loess, where you can use 0, 1 or 2 (but see ?loess).
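
Because stats::loess accepts degree 0, 1 and 2, one possible workaround is to run the cross-validation by hand with loess() instead of going through caret and gam. The following is only a minimal sketch in base R. It uses synthetic stand-in data (not the original turning_rate_4954), and the helper cv_rmse is a hypothetical name introduced here. loess() is not the same fitting code as gam's lo(), so results may differ from what gamLoess would give.

```r
# Sketch: manual 5-fold CV over a span x degree grid using stats::loess.
set.seed(1)

# Synthetic stand-in data (assumption: NOT the original turning_rate_4954).
dat <- data.frame(Time_s = seq(0, 0.412, length.out = 100))
dat$turn_rate_dgs_s <- 1500 * sin(2 * pi * dat$Time_s / 0.2) +
  rnorm(100, sd = 300)

grid <- expand.grid(span = seq(0.1, 0.65, length.out = 10),
                    degree = c(0, 1, 2))

# Randomly assign each row to one of 5 folds of equal size.
folds <- sample(rep(1:5, length.out = nrow(dat)))

# Mean held-out RMSE across the 5 folds for one (span, degree) pair.
cv_rmse <- function(span, degree) {
  errs <- sapply(1:5, function(k) {
    train_set <- dat[folds != k, ]
    test_set  <- dat[folds == k, ]
    # surface = "direct" lets loess predict at held-out points that fall
    # outside the range of the training fold (e.g. the first or last row).
    fit <- loess(turn_rate_dgs_s ~ Time_s, data = train_set,
                 span = span, degree = degree,
                 control = loess.control(surface = "direct"))
    pred <- predict(fit, newdata = test_set)
    sqrt(mean((test_set$turn_rate_dgs_s - pred)^2))
  })
  mean(errs)
}

grid$RMSE <- mapply(cv_rmse, grid$span, grid$degree)
best <- grid[which.min(grid$RMSE), ]
print(best)
```

This avoids the gam code path entirely, so it sidesteps the crash rather than fixing it.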

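When only degree = 1 is used, there are still some warnings. I do not understand what these warnings mean, nor whether they have anything to do with the original problem, the segfault (a bug in gam).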

1: In model.matrix.default(mt, mf, contrasts) :
  non-list contrasts argument ignored
2: In gam.lo(data[["lo(Time_s, span = 0.1, degree = 1)"]],  ... : eval  0
3: In gam.lo(data[["lo(Time_s, span = 0.1, degree = 1)"]],  ... :
  lowerlimit  0.0021424
4: In gam.lo(data[["lo(Time_s, span = 0.1, degree = 1)"]],  ... :
  extrapolation not allowed with blending
5: In gam.lo(data[["lo(Time_s, span = 0.1, degree = 1)"]],  ... : eval  0.41184
6: In gam.lo(data[["lo(Time_s, span = 0.1, degree = 1)"]],  ... : upperlimit  0.4097
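
One plausible reading of warnings 2 to 6 (an interpretation, not confirmed against the gam source): in some folds the first or last observation is held out, so prediction is requested at Time_s = 0 or 0.41184, just outside the limits (0.0021424 to 0.4097) covered by that fold's training data, and the interpolating ("blending") loess surface does not extrapolate. The same behaviour can be seen with stats::loess, used here as an analogy on made-up data:

```r
# Sketch: loess does not extrapolate with the default interpolated surface.
set.seed(1)
x <- seq(0, 1, length.out = 50)
y <- sin(2 * pi * x) + rnorm(50, sd = 0.1)

new <- data.frame(x = 2)  # well outside the training range [0, 1]

# Default surface = "interpolate": prediction outside the data box is NA.
fit_interp <- loess(y ~ x, span = 0.5, degree = 1)
p_interp <- predict(fit_interp, newdata = new)

# surface = "direct": the local fit is evaluated exactly, so a value is returned.
fit_direct <- loess(y ~ x, span = 0.5, degree = 1,
                    control = loess.control(surface = "direct"))
p_direct <- predict(fit_direct, newdata = new)

print(p_interp)  # NA
print(p_direct)
```

If this reading is right, these warnings concern prediction at the edges of the data and need not be related to the degree = 2 crash.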