How to calculate distances for multiple instances using dplyr in R

Question (votes: -1, answers: 1)

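I am trying to calculate the sum of distances between points for multiple iterations of similar xy coordinate data at each time point. Below is a smaller example of my data frame, where time identifies each instance in time. In this example, ref is the ID of the reference point, with 10 different points. x and y are the x and y coordinates; V5, V7, V9, etc. are slightly altered x coordinates, and V6, V8, V10, etc. are slightly altered y coordinates.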

   time ref     x     y    V5    V6    V7    V8    V9   V10   V11   V12   V13   V14
1     1   1 92.80 49.58 92.79 49.40 94.17 49.77 95.70 47.88 93.54 49.92 96.88 46.73
2     1   2 90.20 96.02 88.22 94.81 89.13 94.65 88.49 91.96 86.44 97.20 91.64 96.62
3     1   3 91.61 80.05 95.94 77.34 90.59 79.64 91.72 79.78 90.32 77.44 89.43 80.44
4     1   4 68.75 20.56 66.21 20.53 70.74 17.46 67.22 20.67 72.66 16.75 67.93 21.96
5     1   5  5.53 35.27  4.09 31.10  9.53 33.71  8.20 33.43  7.29 33.98  8.33 31.73
6     1   6 39.85 85.39 39.01 83.27 38.46 83.64 39.35 82.84 41.78 83.19 40.80 83.86
7     1   7 12.04 87.43  9.34 84.31 13.21 87.53 11.18 84.45  9.77 86.77 10.48 83.96
8     1   8 42.98 56.53 41.26 57.50 40.95 57.87 45.18 53.61 40.71 55.47 44.86 53.46
9     1   9 19.14 63.56 17.69 61.97 22.57 62.42 20.96 60.08 15.48 64.84 18.96 61.49
10    1  10 25.72  7.62 28.02  6.96 24.71  4.80 24.72  5.27 25.59  6.68 27.92  5.03
11    2   1 50.39  7.16 53.64  4.93 47.98  2.60 48.08  5.70 51.46  7.13 50.66  6.12
12    2   2 17.71  7.15 18.16  4.28 14.30  5.22 16.28  3.39 13.16  4.29 16.26  6.95
13    2   3 52.96 34.87 52.04 32.70 53.05 35.25 55.59 33.10 51.93 31.08 57.31 32.15
14    2   4 52.70 97.07 56.55 96.57 50.33 96.19 51.38 94.87 53.67 93.12 52.54 93.27
15    2   5 70.88 44.88 73.14 41.41 70.26 45.54 70.96 42.79 75.10 44.92 70.97 46.13
16    2   6 32.12 71.82 31.29 67.31 30.87 70.14 34.89 68.69 29.01 71.20 32.25 70.00
17    2   7 24.15 22.77 23.10 21.85 23.87 20.69 22.68 19.05 24.59 23.85 23.60 22.00
18    2   8 18.06 31.03 18.72 28.30 20.96 32.23 18.50 28.26 16.49 27.30 13.36 28.47
19    2   9 70.55 92.42 70.43 89.85 71.21 91.35 70.30 92.83 67.93 93.11 71.88 88.02
20    2  10 45.05 79.67 44.07 80.63 44.93 80.74 42.65 79.90 47.02 76.84 45.70 78.57
21    3   1 92.90 10.39 94.74 11.12 93.49  9.85 94.79  8.50 94.95  7.01 91.50 10.62
22    3   2 43.21 25.53 45.67 25.14 41.24 21.98 43.72 22.23 46.17 22.80 41.37 24.07
23    3   3  3.75 71.02  5.17 68.52  4.31 68.96  2.39 70.67  0.21 69.81  1.31 70.91
24    3   4  6.56 55.02 11.37 53.99  7.26 54.97  8.33 55.18  5.33 54.24  9.02 54.67
25    3   5 51.24 89.15 51.33 86.26 53.48 88.12 54.53 85.07 51.13 89.84 48.46 84.90
26    3   6 58.85 86.62 60.19 90.27 58.54 82.72 56.04 83.45 54.73 89.74 59.52 86.68
27    3   7 78.57 81.75 78.36 79.88 81.35 78.37 77.74 81.40 80.76 80.35 80.48 85.27
28    3   8 43.81 38.80 42.64 35.37 43.66 39.98 42.92 40.60 46.64 39.37 43.44 38.49
29    3   9 34.14 19.08 34.48 22.24 34.84 17.16 34.83 19.42 30.28 22.54 34.46 21.36
30    3  10 21.14 44.80 18.40 46.41 22.37 43.11 21.16 43.39 23.22 45.71 22.30 47.19

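I would like my code to output something like the table below. Each row is a different time, and each column is the sum of distances for one iteration (simulation) of the xy coordinates (x and y are xy1, V5 and V6 are xy2, and so on).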

time xy1     xy2     xy3     xy4     xy5     xy6
1    2706.59 2722.07 2693.55 2670.57 2738.82 2721.5
2    2275.05 2319.26 2312.06 2313.21 2318.96 2283.16
3    2379.67 2337.95 2372.9  2357.32 2456.4  2357.16

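To do this, I tried to loop over the following function. Each time, it renames the third and fourth columns to x and y, computes the distance between every pair of points, and then takes the sum of all distances divided by 2 (because each distance is measured twice). This code seems to work for a single pair of x and y columns.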

library(dplyr)

sumdist = function(data) {
  # columns 3 and 4 hold the coordinate pair being processed
  names(data)[3] <- "x"
  names(data)[4] <- "y"
  data = data %>%
    group_by(time) %>%
    # distN: distance from every point to reference point N within the same time
    mutate(dist1 = sqrt((x[which(ref == 1)] - x)^2 + (y[which(ref == 1)] - y)^2)) %>%
    mutate(dist2 = sqrt((x[which(ref == 2)] - x)^2 + (y[which(ref == 2)] - y)^2)) %>%
    mutate(dist3 = sqrt((x[which(ref == 3)] - x)^2 + (y[which(ref == 3)] - y)^2)) %>%
    mutate(dist4 = sqrt((x[which(ref == 4)] - x)^2 + (y[which(ref == 4)] - y)^2)) %>%
    mutate(dist5 = sqrt((x[which(ref == 5)] - x)^2 + (y[which(ref == 5)] - y)^2)) %>%
    mutate(dist6 = sqrt((x[which(ref == 6)] - x)^2 + (y[which(ref == 6)] - y)^2)) %>%
    mutate(dist7 = sqrt((x[which(ref == 7)] - x)^2 + (y[which(ref == 7)] - y)^2)) %>%
    mutate(dist8 = sqrt((x[which(ref == 8)] - x)^2 + (y[which(ref == 8)] - y)^2)) %>%
    mutate(dist9 = sqrt((x[which(ref == 9)] - x)^2 + (y[which(ref == 9)] - y)^2)) %>%
    mutate(dist10 = sqrt((x[which(ref == 10)] - x)^2 + (y[which(ref == 10)] - y)^2)) %>%
    # every pair is counted twice (A to B and B to A), hence the division by 2
    summarise(sumdistances = (sum(dist1,dist2,dist3,dist4,dist5,dist6,dist7,dist8,dist9,dist10))/2)
  # print() returns its argument, so the function returns one value per time
  print(data$sumdistances)
}
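
As a possible shortcut for the same quantity, base R's dist() returns each pairwise Euclidean distance exactly once, so summing its result needs no division by 2 and no hard-coded list of ref values. The sketch below is only an illustration: sum_pairwise is a hypothetical helper name, and the toy data is made up so the result can be checked by hand.

```r
# Sum of all pairwise Euclidean distances per time point, base R only.
# `sum_pairwise` is a hypothetical helper, not part of the original post.
sum_pairwise <- function(data, xcol = 3, ycol = 4) {
  # split the two coordinate columns by time, then sum dist() within each group
  sapply(split(data[, c(xcol, ycol)], data$time),
         function(xy) sum(dist(xy)))
}

# Made-up toy data: a 3-4-5 right triangle at time 1,
# two points 10 units apart at time 2.
toy <- data.frame(
  time = c(1, 1, 1, 2, 2),
  ref  = c(1, 2, 3, 1, 2),
  x    = c(0, 3, 0, 0, 10),
  y    = c(0, 0, 4, 0, 0)
)

res <- sum_pairwise(toy)
print(res)  # time 1: 3 + 4 + 5 = 12; time 2: 10
```

The xcol and ycol arguments are assumptions added for illustration; they let the same helper be pointed at any coordinate pair without renaming columns.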

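This is where it fails: when I try to loop this process over each pair of xy coordinate columns and recombine the results.

Set up the matrix: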

sumd = matrix(NA, nrow=3, ncol=6)


for(i in 1:6) {
  datas = df[,c(1,2,(1+(2*i)),(2+(2*i)))]
  sumd[i] = sumdist(datas)
}

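I seem to manage to get some data into the matrix, so I feel I am close. I have mostly been modifying the code by removing different parts of it, but I keep getting the following warning message: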

Warning messages:
1: In sumd[i] <- sumdist(datas) :
  number of items to replace is not a multiple of replacement length
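
The warning is consistent with sumdist() returning one value per time point (three here) while sumd[i] addresses a single matrix cell. A minimal sketch of one possible fix is to assign into a whole column with sumd[, i]. The data below is randomly generated, and sumdist_vec is a hypothetical base R stand-in for sumdist() so that the example runs without dplyr; only the column layout (time, ref, then x/y pairs) is taken from the question.

```r
# Hypothetical stand-in for sumdist(): one summed distance per time point.
sumdist_vec <- function(data) {
  as.numeric(sapply(split(data[, 3:4], data$time),
                    function(xy) sum(dist(xy))))
}

# Made-up data with the same layout: time, ref, then (x, y) column pairs.
set.seed(1)
df <- data.frame(time = rep(1:3, each = 4),
                 ref  = rep(1:4, times = 3),
                 matrix(round(runif(12 * 6, 0, 100), 2), nrow = 12))

n_pairs <- (ncol(df) - 2) / 2          # number of (x, y) column pairs
times   <- sort(unique(df$time))

# one row per time point, one column per coordinate pair
sumd <- matrix(NA_real_, nrow = length(times), ncol = n_pairs,
               dimnames = list(as.character(times),
                               paste0("xy", seq_len(n_pairs))))

for (i in seq_len(n_pairs)) {
  datas <- df[, c(1, 2, 1 + 2 * i, 2 + 2 * i)]
  sumd[, i] <- sumdist_vec(datas)      # fill column i, not a single cell
}
print(sumd)
```

With the data frame from the question, n_pairs would be 6 and the matrix would have the 3 x 6 shape of the desired output table.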
    


r for-loop dplyr simulation euclidean-distance
1 Answer
0 votes

In the sumdist function, your code does not work correctly:
