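Good evening. I want to compute the Mahalanobis distance between the two vectors (O_Temp, O_Salin) and (D_Temp, D_Salin) shown in the attached image, for each observation.
In other words, for every observation (there are more than 3 million), the goal is to compute the Mahalanobis distance between the column vector (O_Temp, O_Salin) and the column vector (D_Temp, D_Salin).
I have already tried a pairwise Mahalanobis approach, but it does not work. Can you help?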
Use the mahalanobis() function. The original how-to comes from R-bloggers.
# example dataset
df <- data.frame(
  O_Temp = c(20.976270078546495, 24.30378732744839, 22.055267521432878, 20.897663659937937, 18.473095986778095),
  O_Salin = c(31.403935408357594, 28.772030269246407, 39.767476761184525, 22.04089621496056, 24.177535121896696),
  D_Temp = c(23.556330735924604, 15.400159463843297, 24.703880442451897, 29.243770902348764, 14.97506287039916),
  D_Salin = c(22.988966093159874, 37.36252114736428, 23.249858693527496, 32.311191285676884, 22.476399656988832)
)
# differences between the O_* and D_* measurements, one row per observation
observations <- data.frame(Temperature = df$O_Temp - df$D_Temp,
                           Salinity = df$O_Salin - df$D_Salin)
# center and covariance of the differences
mean_vector <- colMeans(observations)
cov_matrix <- cov(observations)
# note: mahalanobis() returns the squared distance
df$Mahalanobis_Distance <- apply(observations, 1, function(x) mahalanobis(x, center = mean_vector, cov = cov_matrix))
# upper-tail chi-squared p-value with df = 2 (two variables: temperature and salinity)
df$pvalue <- pchisq(df$Mahalanobis_Distance, df = 2, lower.tail = FALSE)
df
    O_Temp  O_Salin   D_Temp  D_Salin Mahalanobis_Distance    pvalue
1 20.97627 31.40394 23.55633 22.98897            0.4343278 0.8047981
2 24.30379 28.77203 15.40016 37.36252            2.3763608 0.3047753
3 22.05527 39.76748 24.70388 23.24986            1.7669273 0.4133487
4 20.89766 22.04090 29.24377 32.31119            3.0919451 0.2131045
5 18.47310 24.17754 14.97506 22.47640            0.3304390 0.8477076
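
With more than 3 million observations, the row-wise apply() call is likely to be slow, because mahalanobis() is called once per row. The function also accepts a whole matrix and returns one squared distance per row, so a vectorized call may be a better fit. The following is only a sketch on the same five example rows; the names diffs, d2_vectorized and d2_rowwise are introduced here for illustration and are not part of the original code.

```r
# Same example data as in the answer.
df <- data.frame(
  O_Temp  = c(20.976270078546495, 24.30378732744839, 22.055267521432878, 20.897663659937937, 18.473095986778095),
  O_Salin = c(31.403935408357594, 28.772030269246407, 39.767476761184525, 22.04089621496056, 24.177535121896696),
  D_Temp  = c(23.556330735924604, 15.400159463843297, 24.703880442451897, 29.243770902348764, 14.97506287039916),
  D_Salin = c(22.988966093159874, 37.36252114736428, 23.249858693527496, 32.311191285676884, 22.476399656988832)
)

# Differences between the O_* and D_* measurements, as a numeric matrix
# (illustrative name, not from the original answer).
diffs <- cbind(Temperature = df$O_Temp - df$D_Temp,
               Salinity    = df$O_Salin - df$D_Salin)

# One call for all rows: mahalanobis() returns the squared distance per row.
d2_vectorized <- mahalanobis(diffs, center = colMeans(diffs), cov = cov(diffs))

# Row-wise version, kept here only to compare against the vectorized call.
d2_rowwise <- apply(diffs, 1, function(x)
  mahalanobis(x, center = colMeans(diffs), cov = cov(diffs)))

# Both approaches should agree up to floating-point error.
stopifnot(isTRUE(all.equal(unname(d2_vectorized), unname(d2_rowwise))))

df$Mahalanobis_Distance <- d2_vectorized
df$pvalue <- pchisq(d2_vectorized, df = 2, lower.tail = FALSE)
```

The values are the same as in the output above; only the way they are computed changes. How much faster this is on the full dataset has not been measured here.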