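I have to classify many crops according to three conditions computed on a grid of 1e6 points. I am trying to optimise the function below (hopefully without moving to C or Rust). Any ideas?
The input data can be reformatted if necessary. I have already tried data.table, but performance was worse.
Here is my best shot: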
# Possible values for each of the three conditions
condtion1 <- letters[1:8]
condtion2 <- letters[9:15]
condtion3 <- letters[16:24]

# Named 0/1 lookup: does the crop tolerate each condition value?
crop <- sample(0:1, 24, replace = TRUE)
names(crop) <- letters[1:24]

# One condition value per grid point
n <- 1e6
condtions1 <- sample(condtion1, n, replace = TRUE)
condtions2 <- sample(condtion2, n, replace = TRUE)
condtions3 <- sample(condtion3, n, replace = TRUE)

get_suitability <- function(){
  result <- character(n)
  for (i in seq_along(result)) {
    if (crop[[condtions1[[i]]]] == 0 || crop[[condtions2[[i]]]] == 0) result[[i]] <- "not suitable"
    else if (crop[[condtions1[[i]]]] == 1 && crop[[condtions2[[i]]]] == 1 && crop[[condtions3[[i]]]] == 1) result[[i]] <- "suitable"
    else if (crop[[condtions1[[i]]]] == 1 && crop[[condtions2[[i]]]] == 1 && crop[[condtions3[[i]]]] == 0) result[[i]] <- "suitable with irrigation"
  }
  result
}

microbenchmark::microbenchmark(
  get_suitability(),
  times = 5
)
#> Unit: seconds
#> expr min lq mean median uq max neval
#> get_suitability() 2.402434 2.408322 2.568981 2.641211 2.667943 2.724993 5
Created on 2024-03-24 with reprex v2.1.0
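This looks like a good case for ifelse(), which evaluates all 1e6 points in one vectorised call instead of looping over them. For example, this function is much faster than yours: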
get_suitability2 <- function(){
  result <- ifelse(crop[condtions1] == 0 |
                     crop[condtions2] == 0, "not suitable",
            ifelse(crop[condtions1] == 1 &
                     crop[condtions2] == 1 &
                     crop[condtions3] == 1, "suitable",
            ifelse(crop[condtions1] == 1 &
                     crop[condtions2] == 1 &
                     crop[condtions3] == 0, "suitable with irrigation", "")))
  # ifelse() keeps the names of the looked-up values; drop them
  names(result) <- NULL
  result
}
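To see why this works, it may help to look at the vectorised lookup on a tiny example. This is only an illustrative sketch: toy_crop, toy_cond and the "ok" / "not ok" labels are made-up names and values, not part of the question's data.

```r
# Hypothetical toy data, for illustration only
toy_crop <- c(a = 1, b = 0, c = 1)   # named 0/1 lookup vector
toy_cond <- c("a", "c", "b", "a")    # one condition value per "grid point"

# Indexing a named vector by a character vector looks up all elements at once
looked_up <- toy_crop[toy_cond]

# ifelse() then tests every element in a single call; the result inherits
# the names of the test, which is why the answer resets names(result)
toy_labels <- ifelse(looked_up == 1, "ok", "not ok")

names(toy_labels)    # names carried over from the lookup
unname(toy_labels)   # the plain character vector
```

The loop in the question performs the same lookup one element at a time with [[ ]], which is where most of its time is likely spent.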
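Further improvements are possible. Some of your tests are redundant, so you can remove them. Once the first test has established that "not suitable" is not the answer, you don't need to look at condtions1 or condtions2 again: the crop values for both are known to be 1. And since crop only contains 0 and 1, the final test is guaranteed to be true whenever it is reached. So you can simplify to: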
get_suitability3 <- function(){
  result <- ifelse(crop[condtions1] == 0 |
                     crop[condtions2] == 0, "not suitable",
            ifelse(crop[condtions3] == 1, "suitable",
                   "suitable with irrigation"))
  names(result) <- NULL
  result
}
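Before relying on the simplified version, it is worth checking that it gives the same answers as the original loop. The following is a minimal sketch of such a check, assuming the same data layout as in the question; the seed, the smaller n, and the function names loop_version and vectorised_version are choices made here for illustration.

```r
set.seed(1)  # assumption: fixed only so the check is reproducible
crop <- sample(0:1, 24, replace = TRUE)
names(crop) <- letters[1:24]

n <- 1e4     # smaller than the question's 1e6 so the loop finishes quickly
condtions1 <- sample(letters[1:8], n, replace = TRUE)
condtions2 <- sample(letters[9:15], n, replace = TRUE)
condtions3 <- sample(letters[16:24], n, replace = TRUE)

# Reference: element-by-element loop with the same logic as the question
loop_version <- function() {
  result <- character(n)
  for (i in seq_len(n)) {
    c1 <- crop[[condtions1[[i]]]]
    c2 <- crop[[condtions2[[i]]]]
    c3 <- crop[[condtions3[[i]]]]
    result[[i]] <- if (c1 == 0 || c2 == 0) {
      "not suitable"
    } else if (c3 == 1) {
      "suitable"
    } else {
      "suitable with irrigation"
    }
  }
  result
}

# Simplified nested ifelse(), as in the answer
vectorised_version <- function() {
  unname(ifelse(crop[condtions1] == 0 | crop[condtions2] == 0,
                "not suitable",
                ifelse(crop[condtions3] == 1, "suitable",
                       "suitable with irrigation")))
}

# TRUE if both approaches label every grid point identically
same <- identical(loop_version(), vectorised_version())
same
```

Once the results agree, the same microbenchmark::microbenchmark() call used in the question can be applied to each version to compare timings on your own machine.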