学习用户定义的函数进行 ANOVA 和 emmeans 成对比较

问题描述 投票:0回答:1

我正在尝试学习编写函数并探索制作一个函数来进行方差分析和后 F 测试。我已将其简化为获取 emmeans 并关联所有成对比较的问题。问题是成对比较不起作用???我对功能真的很陌生,并且尝试了很多方法来解决这个问题但无济于事......

我正在使用钻石数据集来提高再现性......

这是代码:

# libraries
library(tidyverse)
library(car)
library(emmeans)
library(readxl)
library(rlang)
library(broom)

dia <- diamonds
dia$cut <- as.factor(as.character(dia$cut))

results_LM <- function(data, iv, dv) {
  iv_name <- as.character(substitute(iv))
  dv_name <- as.character(substitute(dv))
  formula_str <- paste(dv_name, "~", iv_name)
  aov.model <- lm(formula_str, data = data)
  emm.means <- emmeans(aov.model,
                       specs = iv_name,
                       by = iv_name)
  emm.pairs <- pairs(emm.means)
  output <- list(
    aov_model_summary = summary(aov.model),
    emm.means,
    emm.pairs)
  return(output)
}

results_LM(dia, cut, price)

这个问题是 emmeans 输出看起来很奇怪并且缺少成对比较

[[3]]
cut = Fair:
 contrast  estimate SE df z.ratio p.value
 (nothing)   nonEst NA NA      NA      NA

cut = Good:
 contrast  estimate SE df z.ratio p.value
 (nothing)   nonEst NA NA      NA      NA

cut = Ideal:
 contrast  estimate SE df z.ratio p.value
 (nothing)   nonEst NA NA      NA      NA

cut = Premium:
 contrast  estimate SE df z.ratio p.value
 (nothing)   nonEst NA NA      NA      NA

cut = Very Good:
 contrast  estimate SE df z.ratio p.value
 (nothing)   nonEst NA NA      NA      NA```

Note that when I do the anova and emmeans post F test outside of the function it works fine? 

Any help is appreciated and pointers to learn to write functions like this on dataframes would be GREAT. Thanks Bill
r dataframe user-defined-functions rlang emmeans
1个回答
0
投票

你们真的很亲密。我认为唯一的事情是你不需要在调用

by
时使用
emmeans()
参数。我可能还建议使用
reformulate()
来制作公式。见下图:

library(tidyverse)
library(emmeans)

dia <- diamonds
dia$cut <- as.factor(as.character(dia$cut))

results_LM <- function(data, iv, dv) {
  iv_name <- as.character(substitute(iv))
  dv_name <- as.character(substitute(dv))
  formula_str <- reformulate(iv_name, response=dv_name)
  aov.model <- lm(formula_str, data = data)
  emm.means <- emmeans(aov.model,
                       specs = iv_name)
  emm.pairs <- pairs(emm.means)
  output <- list(
    aov_model_summary = summary(aov.model),
    emm.means,
    emm.pairs)
  names(output) <- c("Model", "Marginal Means", "Pairwise Comparisons")
  return(output)
}

results_LM(dia, cut, price)
#> $Model
#> 
#> Call:
#> lm(formula = formula_str, data = data)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#>  -4258  -2741  -1494   1360  15348 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   4358.76      98.79  44.122  < 2e-16 ***
#> cutGood       -429.89     113.85  -3.776 0.000160 ***
#> cutIdeal      -901.22     102.41  -8.800  < 2e-16 ***
#> cutPremium     225.50     104.40   2.160 0.030772 *  
#> cutVery Good  -377.00     105.16  -3.585 0.000338 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 3964 on 53935 degrees of freedom
#> Multiple R-squared:  0.01286,    Adjusted R-squared:  0.01279 
#> F-statistic: 175.7 on 4 and 53935 DF,  p-value: < 2.2e-16
#> 
#> 
#> $`Marginal Means`
#>  cut       emmean   SE    df lower.CL upper.CL
#>  Fair        4359 98.8 53935     4165     4552
#>  Good        3929 56.6 53935     3818     4040
#>  Ideal       3458 27.0 53935     3405     3510
#>  Premium     4584 33.8 53935     4518     4650
#>  Very Good   3982 36.1 53935     3911     4052
#> 
#> Confidence level used: 0.95 
#> 
#> $`Pairwise Comparisons`
#>  contrast            estimate    SE    df t.ratio p.value
#>  Fair - Good            429.9 113.8 53935   3.776  0.0015
#>  Fair - Ideal           901.2 102.4 53935   8.800  <.0001
#>  Fair - Premium        -225.5 104.4 53935  -2.160  0.1950
#>  Fair - Very Good       377.0 105.2 53935   3.585  0.0031
#>  Good - Ideal           471.3  62.7 53935   7.517  <.0001
#>  Good - Premium        -655.4  65.9 53935  -9.946  <.0001
#>  Good - Very Good       -52.9  67.1 53935  -0.788  0.9341
#>  Ideal - Premium      -1126.7  43.2 53935 -26.067  <.0001
#>  Ideal - Very Good     -524.2  45.1 53935 -11.636  <.0001
#>  Premium - Very Good    602.5  49.4 53935  12.198  <.0001
#> 
#> P value adjustment: tukey method for comparing a family of 5 estimates

创建于 2023-08-09,使用 reprex v2.0.2

© www.soinside.com 2019 - 2024. All rights reserved.