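Let me start by saying that I am completely new to R and am trying to figure out how to run icc on my particular dataset, which may differ a bit from the usual case.
The dataset looks like this: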
+------------+------------------+--------------+--------------+--------------+
| date | measurement_type | measurement1 | measurement2 | measurement3 |
+------------+------------------+--------------+--------------+--------------+
| 25-04-2020 | 1 | 15.5 | 34.3 | 43.2 |
| 25-04-2020 | 2 | 21.2 | 12.3 | 2.2 |
| 25-04-2020 | 3 | 16.2 | 9.6 | 43.3 |
| 25-04-2020 | 4 | 27 | 1 | 6 |
+------------+------------------+--------------+--------------+--------------+
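Now I want to run icc over all of these rows, because each row represents a different rater. The date and measurement_type columns should be left out.
Could someone point me in the right direction? I have absolutely no idea how to do this.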
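-------EDIT-------
I exported the actual dataset, filled with some test data. It is available here.
The two important sheets are the first and the third. The first contains all participants in the study, and the third contains all 4 different reports for each participant. The code I have so far just binds each report to the correct participant: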
library("XLConnect")
library("sqldf")
library("irr")
library("dplyr")
library("tidyr")
# Load in Workbook
wb = loadWorkbook("Measuring.xlsx")
# Load in Worksheet
# Sheet 1 = Study Results
# Sheet 3 = Meetpunten
records = readWorksheet(wb, sheet=1)
reports = readWorksheet(wb, sheet=3)
for (record in 1:nrow(records)) {
  recordId = records[record, 'Record.Id']
  participantReports = sqldf(sprintf("select * from reports where `Record.Id` = '%s'", recordId))
  baselineReport = sqldf("select * from participantReports where measurement_type = '1'")
  drinkReport = sqldf("select * from participantReports where measurement_type = '2'")
  regularReport = sqldf("select * from participantReports where measurement_type = '3'")
  exerciseReport = sqldf("select * from participantReports where measurement_type = '4'")
}
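We can reshape to 'long' format with pivot_longer and then back to wide so that each rater (each original row) becomes a column. That is the layout icc from irr expects: subjects in rows, raters in columns. Selecting the columns with matches instead of starts_with keeps measurement_type out, since its name also starts with 'measurement'.
library(dplyr)
library(tidyr)
library(irr)
df1 %>%
  mutate(rn = paste0("rater", row_number())) %>%  # one rater per original row
  select(rn, matches("^measurement[0-9]+$")) %>%  # drops date and measurement_type
  pivot_longer(cols = -rn) %>%
  pivot_wider(names_from = rn, values_from = value) %>%
  select(-name) %>%
  icc()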
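For reference, here is a minimal self-contained sketch built from the four sample rows in the question. It relies on icc from irr taking a matrix with subjects in rows and raters in columns, so the measurement columns are transposed first. The name df1 for the data frame and the choice of model = "twoway", type = "agreement" and unit = "single" are assumptions for illustration only; the right settings depend on the study design.

```r
library(irr)  # provides icc()

# Sample rows copied from the question; each row is one rater.
df1 <- data.frame(
  date = rep("25-04-2020", 4),
  measurement_type = 1:4,
  measurement1 = c(15.5, 21.2, 16.2, 27),
  measurement2 = c(34.3, 12.3, 9.6, 1),
  measurement3 = c(43.2, 2.2, 43.3, 6)
)

# Keep only measurement1..measurement3; the pattern skips measurement_type.
ratings <- df1[, grepl("^measurement[0-9]+$", names(df1))]

# Transpose: rows become subjects (the 3 measurements),
# columns become raters (the 4 original rows).
ratings_t <- t(as.matrix(ratings))
colnames(ratings_t) <- paste0("rater", seq_len(ncol(ratings_t)))

# Assumed settings; adjust model/type/unit to the actual design.
result <- icc(ratings_t, model = "twoway", type = "agreement", unit = "single")
print(result)
```

The returned object is a list, so the estimate itself is available as result$value and the confidence bounds as result$lbound and result$ubound.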