分歧系数 - 如何更有效地编码循环?

问题描述 投票:1回答:1

我有一个1500行和20列的矩阵,想要计算列之间的发散系数(即站点)。每行是我每个站点的每小时观察。

两个站点之间的COD计算如https://www.scribd.com/document/364585909/Coefficient-of-Divergence-COD中所述

我创建了一个更简单的数据集来使用:

x1 <- c(10, 20, 30, 40, 50)
x2 <- c(11, 21, 31, 41, 51)
x3 <- c(12, 22, 32, 42, 52)
x4 <- c(13, 23, 33, 43, 53)
x5 <- c(14, 24, 34, 44, 54)
x6 <- c(15, 25, 35, 44, 55)

xmat <- cbind(x1,x2,x3,x4,x5,x6)

first COD

site1 <- xmat

for (i in 2:ncol(site1))   

     site1[,i] = (xmat[,i]-xmat[,1])/(xmat[,i]+xmat[,1])
     site1 <- site1[,2:6]
     site1 <- sapply(site1, function(x) x^2)
     COD-site1 <- data.frame(sqrt(colMeans(site1, na.rm=TRUE)))

第二个COD

site2 <- xmat

for (i in 3:ncol(site2))

    site2[,i] = (xmat[,i]-xmat[,2])/(xmat[,i]+xmat[,2])
    site2 <- site2[,3:6]
    site2 <- sapply(site2, function(x) x^2)
    COD-site2 <- data.frame(sqrt(colMeans(site2, na.rm=TRUE)))

等等....

我遇到了如何循环遍历列并为COD创建新数据框的问题。在我的简单数据集中,它超过6列,但在我的实际数据集中,我有20列,并不真正想要复制和粘贴20次。我是R的函数和循环的新手,并且非常感谢贷款。非常感谢

r function loops matrix
1个回答
1
投票

对于矩阵对象,可以在没有循环的情况下处理相同大小的元素化操作,例如forsapply,您可以在向量化过程中直接对整个对象运行算术:

网站1

site1 <- xmat
site1[,2:ncol(site1)] <- (xmat[,2:ncol(site1)]-xmat[,1])/(xmat[,2:ncol(site1)]+xmat[,1])
site1 <- site1[,2:6]^2
COD_site1 <- data.frame(result=sqrt(colMeans(site1, na.rm=TRUE)))
COD_site1

网站2

site2 <- xmat
site2[,3:ncol(site2)] <- (xmat[,3:ncol(site2)]-xmat[,2])/(xmat[,3:ncol(site2)]+xmat[,2])
site2 <- site2[,3:6]^2
COD_site2 <- data.frame(result=sqrt(colMeans(site2, na.rm=TRUE)))
COD_site2

要迭代,概括您的处理并在像lapply这样的循环中运行,以构建返回值列表。让循环迭代原始矩阵的列数减去第一项(即2,3,4,5,6):

result_list <- lapply(seq(ncol(xmat))[-1], function(i) {

  tryCatch({
     site <- xmat
     site[, i:ncol(site)] <- (xmat[,i:ncol(site)]-xmat[,i-1])/(xmat[,i:ncol(site)]+xmat[,i-1])
     site <- as.matrix(site[,i:ncol(xmat)]^2)
     COD_site <- data.frame(result=sqrt(colMeans(site, na.rm=TRUE)))

     # RETURN LIST OF BOTH ITEMS
     return(list(site, COD_site))

   }, error = function(e) list(NA, NA)
  )

})

甚至根据网站的名单元素:

names(df_list) <- paste0("COD_site", seq(ncol(xmat)-1))

甚至将列表提取到自己的类型中:

mat_lists <- lapply(result_list, `[[`, 1)

df_lists <- lapply(result_list, `[[`, 2)

mat_lists

mat_lists
# $COD_site1
#                x2           x3           x4          x5          x6
# [1,] 0.0022675737 0.0082644628 0.0170132325 0.027777778 0.040000000
# [2,] 0.0005948840 0.0022675737 0.0048674959 0.008264463 0.012345679
# [3,] 0.0002687450 0.0010405827 0.0022675737 0.003906250 0.005917160
# [4,] 0.0001524158 0.0005948840 0.0013064305 0.002267574 0.002267574
# [5,] 0.0000980296 0.0003844675 0.0008483363 0.001479290 0.002267574

# $COD_site2
#                x3           x4           x5           x6
# [1,] 8.218953e-05 5.787037e-04 1.728000e-03 3.641329e-03
# [2,] 1.257751e-05 9.391435e-05 2.962963e-04 6.575162e-04
# [3,] 3.999248e-06 3.051758e-05 9.831589e-05 2.226118e-04
# [4,] 1.748903e-06 1.349746e-05 4.396499e-05 4.396499e-05
# [5,] 9.151417e-07 7.111971e-06 2.332362e-05 5.373563e-05

# $COD_site3
#                x4           x5           x6
# [1,] 2.560000e-06 3.501278e-05 1.524158e-04
# [2,] 2.438653e-07 3.573458e-06 1.659945e-05
# [3,] 5.602045e-08 8.432265e-07 4.019627e-06
# [4,] 1.915686e-08 2.925002e-07 2.925002e-07
# [5,] 8.227025e-09 1.267350e-07 6.179451e-07

# $COD_site4
#                x5           x6
# [1,] 6.969172e-08 1.859344e-06
# [2,] 4.360243e-09 1.255867e-07
# [3,] 7.406721e-10 2.200926e-08
# [4,] 2.006336e-10 2.006336e-10
# [5,] 7.129862e-11 2.177866e-09

# $COD_site5
#              [,1]
# [1,] 1.681171e-09
# [2,] 7.224762e-11
# [3,] 9.266281e-12
# [4,] 0.000000e+00
# [5,] 5.962673e-13

df_lists

df_lists
# $COD_site1
#        result
# x2 0.02600634
# x3 0.05010383
# x4 0.07253009
# x5 0.09348300
# x6 0.11206961

# $COD_site2
#         result
# x3 0.004504006
# x4 0.012031168
# x5 0.020927975
# x6 0.030394597

# $COD_site3
#          result
# x4 0.0007599039
# x5 0.0028230728
# x6 0.0058982253

# $COD_site4
#          result
# x5 0.0001225272
# x6 0.0006339273

# $COD_site5
#         result
# 1 1.877915e-05
© www.soinside.com 2019 - 2024. All rights reserved.