希望过滤和/或整理 TukeyHSD 成对比较以进行双向方差分析

问题描述 投票:0回答:1

我正在运行双向方差分析来比较两个变量如何影响服务价格。然而,当我运行事后成对比较时,我只想看到可比较的行(即当一个自变量匹配时)。这是一个显示问题的示例:

变量 1 - 种类:金枪鱼、大比目鱼、鲑鱼、比目鱼

变量 2 - 性别:男、女

因变量:质量

这是我用于双向方差分析和成对比较的内容:

> dat <- read.csv(<file location>)
> aov2 <- aov(mass ~ species * sex, data = dat)
> TukeyHSD(aov2, which = "species:sex")

当我使用 TukeyHSD 进行质量的成对比较时,我只想看到金枪鱼:雄性 - 金枪鱼:雌性或大比目鱼:雄性 - 比目鱼:雄性,但我不想看到比目鱼:雌性 - 大比目鱼:男性。基本上,我只是想看看其中一个变量具有相同值的比较。

老实说我什至不确定要尝试什么,或者这是否可能。我有一些 NA 值用于一些比较,我尝试使用:

TukeyHSD(aov2, which = "species:sex") %>% filter(is.na(diff))

但我得到以下信息:

Error in UseMethod("filter") : 
no applicable method for 'filter' applied to an object of class "c('TukeyHSD', 'multicomp')"

如果这是不可能的,是否有办法在数据帧进入单向方差分析之前过滤数据帧中的数据,例如通过其中一列中的某个值?然后我可以尝试运行多个单向方差分析来实现类似的目标。我宁愿不必创建单独的数据文件,因为我正在使用的实际数据集非常大,并且手动分离会非常耗时。

r statistics anova posthoc tukey
1个回答
0
投票

当前的

TukeyHSD
功能无法实现这一点。但是,通过对此函数进行一些修改,您可以获得所需的输出。下面是修改后的函数,它添加了一个名为“match”的新参数,默认值为 FALSE:

TukeyHSD2 <- function (x, which = seq_along(tabs), ordered = FALSE, conf.level = 0.95, 
          match=FALSE, ...) 
{
  mm <- model.tables(x, "means")
  if (is.null(mm$n)) 
    stop("no factors in the fitted model")
  tabs <- mm$tables
  if (names(tabs)[1L] == "Grand mean") 
    tabs <- tabs[-1L]
  tabs <- tabs[which]
  nn <- mm$n[names(tabs)]
  nn_na <- is.na(nn)
  if (all(nn_na)) 
    stop("'which' specified no factors")
  if (any(nn_na)) {
    warning("'which' specified some non-factors which will be dropped")
    tabs <- tabs[!nn_na]
    nn <- nn[!nn_na]
  }
  out <- setNames(vector("list", length(tabs)), names(tabs))
  MSE <- sum(x$residuals^2)/x$df.residual
  for (nm in names(tabs)) {
    tab <- tabs[[nm]]
    means <- as.vector(tab)
    nms <- if (length(dim(tab)) > 1L) {
      dn <- dimnames(tab)
      apply(do.call("expand.grid", dn), 1L, paste, collapse = ":")
    }
    else names(tab)
    n <- nn[[nm]]
    if (length(n) < length(means)) 
      n <- rep.int(n, length(means))
    if (as.logical(ordered)) {
      ord <- order(means)
      means <- means[ord]
      n <- n[ord]
      if (!is.null(nms)) 
        nms <- nms[ord]
    }
    
    center <- outer(means, means, `-`)
    keep <- lower.tri(center)
    
    dnames <- list(outer(nms, nms, paste, sep = "-"), c("diff", "lwr", "upr", "p adj"))
    dk1 <- dnames[[1L]][keep]
    
    if(match) {
      
      keep2 <- sapply(1:length(dk), function(x) {
        pattern <- '(.+):(.+)-(.+):(.+)'
        d1 <- sub(pattern, '\\1', dk1[x])
        d2 <- sub(pattern, '\\2', dk1[x])
        d3 <- sub(pattern, '\\3', dk1[x])
        d4 <- sub(pattern, '\\4', dk1[x])
        d1==d3 | d2==d4 } )
      
      dnames[[1L]] <- dk1[keep2]
    } else dnames[[1L]] <- dk1

    center <- center[keep]
    width <- qtukey(conf.level, length(means), x$df.residual) * 
      sqrt((MSE/2) * outer(1/n, 1/n, `+`))[keep]
    est <- center/(sqrt((MSE/2) * outer(1/n, 1/n, `+`))[keep])
    
    if(match) {
      center <- center[keep2]
      width <- width[keep2]
      est <- est[keep2]
    }
    
    pvals <- ptukey(abs(est), length(means), x$df.residual, 
                    lower.tail = FALSE)
    out[[nm]] <- array(c(center, center - width, center + 
                        width, pvals), c(length(width), 4L), dnames)
  }
  class(out) <- c("TukeyHSD", "multicomp")
  attr(out, "orig.call") <- x$call
  attr(out, "conf.level") <- conf.level
  attr(out, "ordered") <- ordered
  out
}

将上述函数发送到您的 R 控制台,然后在您的数据上进行测试。 这是使用“warpbreaks”示例数据集的输出。

fm1 <- aov(breaks ~ wool * tension, data = warpbreaks)

> TukeyHSD(fm1, which="wool:tension")

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = breaks ~ wool * tension, data = warpbreaks)

$`wool:tension`
               diff       lwr        upr     p adj
B:L-A:L -16.3333333 -31.63966  -1.027012 0.0302143
A:M-A:L -20.5555556 -35.86188  -5.249234 0.0029580
B:M-A:L -15.7777778 -31.08410  -0.471456 0.0398172
A:H-A:L -20.0000000 -35.30632  -4.693678 0.0040955
B:H-A:L -25.7777778 -41.08410 -10.471456 0.0001136
A:M-B:L  -4.2222222 -19.52854  11.084100 0.9626541
B:M-B:L   0.5555556 -14.75077  15.861877 0.9999978
A:H-B:L  -3.6666667 -18.97299  11.639655 0.9797123
B:H-B:L  -9.4444444 -24.75077   5.861877 0.4560950
B:M-A:M   4.7777778 -10.52854  20.084100 0.9377205
A:H-A:M   0.5555556 -14.75077  15.861877 0.9999978
B:H-A:M  -5.2222222 -20.52854  10.084100 0.9114780
A:H-B:M  -4.2222222 -19.52854  11.084100 0.9626541
B:H-B:M -10.0000000 -25.30632   5.306322 0.3918767
B:H-A:H  -5.7777778 -21.08410   9.528544 0.8705572

这是修改版本的输出,其中“match”设置为 TRUE。

> TukeyHSD2(fm1, which="wool:tension", match=TRUE)

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = breaks ~ wool * tension, data = warpbreaks)

$`wool:tension`
               diff       lwr       upr     p adj
B:L-A:L -16.3333333 -31.63966 -1.027012 0.0302143
A:M-A:L -20.5555556 -35.86188 -5.249234 0.0029580
A:H-A:L -20.0000000 -35.30632 -4.693678 0.0040955
B:M-B:L   0.5555556 -14.75077 15.861877 0.9999978
B:H-B:L  -9.4444444 -24.75077  5.861877 0.4560950
B:M-A:M   4.7777778 -10.52854 20.084100 0.9377205
A:H-A:M   0.5555556 -14.75077 15.861877 0.9999978
B:H-B:M -10.0000000 -25.30632  5.306322 0.3918767
B:H-A:H  -5.7777778 -21.08410  9.528544 0.8705572
© www.soinside.com 2019 - 2024. All rights reserved.