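My database covers 78 countries, but because I need to match its "code" column with the "code" column of the map (R's world map package), I had to include every country and write NA wherever there is no data.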
# A tibble: 241 x 12
code country sales2010 gdp gdppc population Export Import tradecost skills
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ABW Aruba NA NA NA NA NA NA NA NA
2 AFG Afghan… NA NA NA NA NA NA NA NA
3 AGO Angola NA NA NA NA NA NA NA NA
4 AIA Anguil… NA NA NA NA NA NA NA NA
5 ALB Albania NA NA NA NA NA NA NA NA
6 ALD Aland NA NA NA NA NA NA NA NA
7 AND Andorra NA NA NA NA NA NA NA NA
8 ARE United… NA NA NA NA NA NA NA NA
9 ARG Argent… 44287 4.24e11 10386. 40788453 8.11e10 6.88e10 2.83 9.48
10 ARM Armenia 4 9.26e 9 3218. 2877319 2.21e 9 4.54e 9 1.37 10.9
# … with 231 more rows, and 2 more variables: investmentcost <dbl>, distance <dbl>
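However, this has caused me a lot of problems. I created a weights matrix with zero.policy=TRUE to account for countries that have no neighbours, and I ran the Moran test successfully using na.omit(). However, when I run the moran.plot command, it returns an error saying that na.omit(database$sales2010) is not a vector.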
> moran.test(na.omit(database$sales2010), PPV3.w, zero.policy=TRUE)
Moran I test under randomisation
data: na.omit(database$sales2010)
weights: PPV3.w
omitted: 1, 2, 3, 4, 5, 6, 7, 8, 11, 12, 13, 14, 15, 19, 21, 22, 23, 25, 27, 28, 30, 31, 34, 35, 36, 38, 44, 46, 47, 49, 50, 51, 52, 53, 54, 55, 59, 60, 61, 63, 64, 66, 69, 71, 72, 74, 75, 78, 79, 81, 83, 84, 85, 86, 87, 89, 90, 91, 92, 98, 100, 101, 103, 105, 109, 112, 113, 115, 116, 117, 118, 120, 122, 123, 126, 127, 128, 129, 133, 134, 136, 137, 138, 139, 141, 142, 143, 145, 146, 147, 148, 149, 150, 151, 152, 155, 156, 157, 158, 159, 160, 161, 164, 165, 167, 168, 170, 173, 174, 176, 177, 180, 181, 182, 183, 185, 186, 189, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 207, 208, 209, 210, 211, 212, 213, 215, 216, 217, 218, 220, 222, 223, 224, 225, 226, 227, 228, 229, 230, 232, 233, 235, 236, 237, 240 n reduced by no-neighbour observations
Moran I statistic standard deviate = 2.205, p-value = 0.01373
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.25571193 -0.01612903 0.01519935
> ### Moran diagram
> moran.plot(na.omit(database$sales2010), PPV3.w, zero.policy = TRUE)
Error in moran.plot(na.omit(database$sales2010), PPV3.w, :
is.vector(x) is not TRUE
This is what na.omit(database$sales2010) looks like:
[1] 44287 4 185329 20222 2019 130775 1123 6584 38 994 187351 312 600161 275761 32645 303281 1275 1642
[19] 24004 1781 18636 363995 5960 14952 100793 446 10643 211677 556 658153 1159 2 4762 5057 556 291
[37] 19039 34144 65621 271794 209 18539 131316 987 756 303618 658 107154 6531 1537 4311 316 28923 209
[55] 1955 227473 1411 469 57286 242155 51486 13592 10907 13722 21190 40124 13935 1790 41474 26593 1 152
[73] 32332 57996 7505 40122 23803 3069 940 36411 168
attr(,"na.action")
[1] 1 2 3 4 5 6 7 8 11 12 13 14 15 19 21 22 23 25 27 28 30 31 34 35 36 38 44 46 47 49 50 51
[33] 52 53 54 55 59 60 61 63 64 66 69 71 72 74 75 78 79 81 83 84 85 86 87 89 90 91 92 98 100 101 103 105
[65] 109 112 113 115 116 117 118 120 122 123 126 127 128 129 133 134 136 137 138 139 141 142 143 145 146 147 148 149 150 151 152 155
[97] 156 157 158 159 160 161 164 165 167 168 170 173 174 176 177 180 181 182 183 185 186 189 191 192 193 194 195 196 197 198 199 200
[129] 201 202 203 204 205 207 208 209 210 211 212 213 215 216 217 218 220 222 223 224 225 226 227 228 229 230 232 233 235 236 237 240
attr(,"class")
[1] "omit"
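If I use a database that already excludes the NAs, it complains that x is not the same length as the weights matrix. How should I solve this?
Thank you.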
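The fact that moran.plot requires is.vector is perhaps unnecessary, and there are other ways to deal with it. That said, you need to use something other than na.omit to handle the missing values, because its result is no longer a plain vector as soon as it actually drops something: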
is.vector(na.omit(1))
# [1] TRUE
na.omit(1)
# [1] 1
is.vector(na.omit(c(1,NA)))
# [1] FALSE
na.omit(c(1,NA))
# [1] 1
# attr(,"na.action")
# [1] 2
# attr(,"class")
# [1] "omit"
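As far as I can tell, the reason is that is.vector only returns TRUE for objects that carry no attributes other than names, and na.omit attaches an "na.action" attribute whenever it removes an element. A minimal sketch in base R of that check, with as.vector shown as one possible way to strip the attribute (this alternative is my own suggestion, not something moran.plot documents):

```r
x <- na.omit(c(1, NA))

# na.omit() records which positions were dropped in an attribute
has_na_action <- !is.null(attr(x, "na.action"))

# as.vector() on an atomic vector returns it without its attributes
y <- as.vector(x)

is.vector(x)  # FALSE, because of the extra attribute
is.vector(y)  # TRUE, the attribute is gone
```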
Perhaps:
my.na.omit <- function(z) z[!is.na(z)]
is.vector(my.na.omit(1))
# [1] TRUE
is.vector(my.na.omit(c(1,NA)))
# [1] TRUE
my.na.omit(c(1,NA))
# [1] 1
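On the length mismatch mentioned in the question: once the NAs are removed from the data, the weights have to be reduced to the same observations, otherwise the two objects no longer line up. The sketch below illustrates the idea in base R with a plain matrix. The sales values and the contiguity matrix W are made up for illustration and are not the asker's data. With spdep, the analogous step would presumably be to subset the listw object with the same logical mask (the package has a subset method for listw objects), but check the package documentation for the exact call.

```r
# Toy data: 4 regions, two of them with missing sales (hypothetical values)
sales <- c(10, NA, 30, NA)

# Hypothetical binary contiguity matrix for the same 4 regions
W <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 1,
              1, 1, 0, 1,
              0, 1, 1, 0), nrow = 4, byrow = TRUE)

# One logical mask, applied to both the data and the weights
keep <- !is.na(sales)

sales_ok <- sales[keep]               # plain numeric vector, no extra attributes
W_ok <- W[keep, keep, drop = FALSE]   # drop the same rows and columns

# Both objects now describe the same set of observations
stopifnot(is.vector(sales_ok), length(sales_ok) == nrow(W_ok))

# Spatial lag of the retained observations (unstandardised weights)
lag_ok <- as.vector(W_ok %*% sales_ok)
```

Note that dropping regions can leave some of the remaining ones without neighbours, which is where zero.policy = TRUE matters again.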