如何写一个for循环来计算几个问题的Cohen's kappa?

问题描述 投票:0回答:2

我对在 R 中循环很陌生,需要你的帮助。

说一个数据集如下:

dog <- c("Dog1","Dog1","Dog2","Dog2","Dog3","Dog3", "Dog4","Dog4")
rater <- c("A", "K", "A", "K","A", "T","A","M")
q1 <- c(0,0,0,0,0,0,1,0)
q2 <- c(0,1,1,0,0,0,0,0)
q3 <- c(0,0,1,1,1,1,0,0)
data <- data.frame(dog,rater, q1,q2,q3)
data
   dog rater q1 q2 q3
1 Dog1     A  0  0  0
2 Dog1     K  0  1  0
3 Dog2     A  0  1  1
4 Dog2     K  0  0  1
5 Dog3     A  0  0  1
6 Dog3     T  0  0  1
7 Dog4     A  1  0  0
8 Dog4     M  0  0  0
library(tidyverse)

data_wide<- data %>% 
  pivot_longer(cols = 3:5,
                names_to = "question",
                values_to = "response") %>% 
   pivot_wider(names_from = "rater",
               values_from = "response")
data_wide
# A tibble: 12 × 6
   dog   question     A     K     T     M
   <chr> <chr>    <dbl> <dbl> <dbl> <dbl>
 1 Dog1  q1           0     0    NA    NA
 2 Dog1  q2           0     1    NA    NA
 3 Dog1  q3           0     0    NA    NA
 4 Dog2  q1           0     0    NA    NA
 5 Dog2  q2           1     0    NA    NA
 6 Dog2  q3           1     1    NA    NA
 7 Dog3  q1           0    NA     0    NA
 8 Dog3  q2           0    NA     0    NA
 9 Dog3  q3           1    NA     1    NA
10 Dog4  q1           1    NA    NA     0
11 Dog4  q2           0    NA    NA     0
12 Dog4  q3           0    NA    NA     0

在上面的宽数据集中,每只狗有 3 个问题,4 个评分者的回答在

A
K
T
M
列中。

Rater

A
对所有狗进行评分,但其他评分者只对部分狗进行评分,但他们总共对所有狗进行评分。

我的目标是为

A
与所有其他评分者配对的配对计算Cohen的kappa。

这里有一个

for
循环来提高效率。

#define the raters and questions

raters <- c("A", "K", "T", "M")
questions <- c("q1", "q2", "q3")

# create an empty list to store the kappa values--

kappa_list <- list()

# loop through each question and pair of raters to calculate kappa
for (q in questions) {
   for (i in 1:length(raters)) {
     for (j in (i+1):length(raters)) {
       r1 <- raters[i]
       r2 <- raters[j]
       data_subset <- subset(data_wide, question == q)
       data_subset <- data_subset[, c("dog", r1, r2)]
       kappa_val <- kappa2(data_subset[,2:3])
       kappa_list[[paste0(q, "_", r1, "_", r2)]] <- kappa_val
     }
   }
 }

我有两个问题

  1. 当我打印 kappa 值列表时,它只显示一个问题 q1 的成对值。这段代码有什么问题?

  2. 为什么我得到这个错误,但仍然有 kappas 存储在

    kappa_list

Error in rep(0, nc - 1) : invalid 'times' argument

这是名单:

kappa_list
$q1_A_K
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 2 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

$q1_A_T
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

$q1_A_M
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = 0 

        z = NaN 
  p-value = NaN 

任何建议都会很有帮助!

r dataframe for-loop nested-loops nested-for-loop
2个回答
0
投票

我本身并没有回答你的问题,但如果可能的话,使用

base::lapply()
purrr::map()
是比 for 循环更干净的方法:

kappa_across <- function(x, q) {
  data_wide |> 
    filter(question == q) |> 
    select(all_of(c("A", x))) |> 
    kappa2()
}

raters <- c("K", "T", "M")
questions <- c("q1", "q2", "q3")

raters_questions <- expand_grid(raters, questions)

map2(raters_questions$raters, raters_questions$questions, ~ kappa_across(x = .x, q = .y))

[[1]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 2 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[2]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 2 
   Raters = 2 
    Kappa = -1 

        z = -1.41 
  p-value = 0.157 

[[3]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 2 
   Raters = 2 
    Kappa = 1 

        z = 1.41 
  p-value = 0.157 

[[4]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[5]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[6]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[7]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = 0 

        z = NaN 
  p-value = NaN 

[[8]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

[[9]]
 Cohen's Kappa for 2 Raters (Weights: unweighted)

 Subjects = 1 
   Raters = 2 
    Kappa = NaN 

        z = NaN 
  p-value = NaN 

0
投票

也许没有必要付出这样的努力。你可以尝试一下

by
方法。
by
本质上结合了
split
lapply
。如果您对结果进行子集
kappa2(.)[c('value', 'statistic', 'p.value')]
,您可以制作一张漂亮的桌子。

qcol <- grep('^q\\d+$', names(data))  ## find question columns

by(data, data$dog, \(x) {
  data.frame(raters=toString(unique(x$rater)), irr::kappa2(t(x[qcol]))[c('value', 'statistic', 'p.value')])
}) |> do.call(what=rbind)
#      raters value statistic    p.value
# Dog1   A, K   0.0       NaN        NaN
# Dog2   A, K   0.4 0.8660254 0.38647623
# Dog3   A, T   1.0 1.7320508 0.08326452
# Dog4   A, M   0.0       NaN        NaN
# Warning messages:
# 1: In sqrt(varkappa) : NaNs produced
# 2: In sqrt(varkappa) : NaNs produced

注意,

psych
包中还有一个 kappa 函数。

by(data, data$dog, \(x) {
  data.frame(raters=toString(unique(x$rater)), psych::cohen.kappa(t(x[qcol]))[['confid']][1,,drop=F])
}) |> do.call(what=rbind)
#      raters      lower estimate upper
# Dog1   A, K  0.0000000      0.0     0
# Dog2   A, K -0.3681459      0.4     1
# Dog3   A, T  1.0000000      1.0     1
# Dog4   A, M  0.0000000      0.0     0
# Warning message:
# In cohen.kappa1(x, w = w, n.obs = n.obs, alpha = alpha, levels = levels) :
#  upper or lower confidence interval exceed  abs(1)  and set to +/- 1. 
© www.soinside.com 2019 - 2024. All rights reserved.