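When I try to use %dofuture% inside a nested function for parallel computation, I run into a variable scoping problem.
This is the error message I get:
Error in eval(quote({ : object 'opt' not found
Here is my code (the object opt has been replaced by local_var1):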
myFunction1 = function(x){
  y = x + 1
  return(y)
}
myFunction2 = function(x, y){
  z = x + y
  return(z)
}
myFunction3 = function(function_var4, function_var5, function_var6){
  # Claim some local variables
  local_var1 = vector("list", length = ncol(function_var4))
  local_var2 = vector("list", length = ncol(function_var4))
  local_var3 = function_var5 %>% pull(function_var6)
  local_var4 = data.frame(Var = seq(min(local_var3), max(local_var3), length.out = 10000))
  # Do some parallel calculation
  plan(multisession, workers = parallel::detectCores() - 2)
  foreach (i = 1:ncol(function_var4)) %dofuture% {
    data_glm = data.frame(Var = local_var3,
                          PreAbs = function_var4[,i])
    mod_glm = glm(PreAbs ~ poly(Var, 3), family = binomial, data = data_glm)
    # Result of the calculation
    local_local_var1 = predict(mod_glm, newdata = local_var4, se = F, type = "response")
    # Some simple calculation using local_local_var1
    # Save the result
    local_var1[[i]] = mean(local_local_var1) # <---- I guess this cause the error
    local_var2[[i]] = myFunction1(local_local_var1)
  }
  # Close multisession workers by switching plan
  plan(sequential)
  local_var5 = myFunction2(local_local_var1, function_var5)
  return(list(opt = unlist(local_var1),
              nw = unlist(local_var2),
              miv = unlist(local_var5)))
}
I get the error message when I run the code like this:
library(foreach)
library(doFuture)
library(dplyr)
global_var1 = matrix(sample(c(0, 1), size = 10000, replace = T), ncol = 100) %>%
  as.data.frame()
global_var2 = data.frame(C1 = rnorm(100))
global_var3 = "C1"
result = myFunction3(function_var4 = global_var1, function_var5 = global_var2, function_var6 = global_var3)
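My guess is that this error occurs because %dofuture% can only pick up variables and settings from the global environment, not from a local environment. Is there any way to solve this?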
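To me, this looks like a bug in doFuture. The %dofuture% operator does record the environment it was called from and passes it to doFuture:::doFuture2, which actually does all the work, but somewhere along the way that environment is not used. Henrik Bengtsson (the author of doFuture) knows what he is doing, so if you report this to him as an issue at https://github.com/HenrikBengtsson/doFuture/issues, he will probably fix it. However, it may be a design flaw in foreach that doFuture cannot work around.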
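I tried to put together a workaround, and here is a way to run the code without the error message you saw. Change myFunction3 as follows: put the code executed in the foreach loop into a local function. You also need to use "super-assignment" (i.e. <<-) inside that function to assign to the variables of myFunction3: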
myFunction3 = function(function_var4, function_var5, function_var6){
  # Claim some local variables
  local_var1 = vector("list", length = ncol(function_var4))
  local_var2 = vector("list", length = ncol(function_var4))
  local_var3 = function_var5 %>% pull(function_var6)
  local_var4 = data.frame(Var = seq(min(local_var3), max(local_var3), length.out = 10000))
  # Do some parallel calculation
  plan(multisession, workers = parallel::detectCores() - 2)
  local_local_var1 <- NULL
  loopcode <- function(i) {
    data_glm = data.frame(Var = local_var3,
                          PreAbs = function_var4[,i])
    mod_glm = glm(PreAbs ~ poly(Var, 3), family = binomial, data = data_glm)
    # Result of the calculation
    local_local_var1 <<- predict(mod_glm, newdata = local_var4, se = F, type = "response")
    # Some simple calculation using local_local_var1
    # Save the result
    local_var1[[i]] <<- mean(local_local_var1) # <---- I guess this cause the error
    local_var2[[i]] <<- myFunction1(local_local_var1)
  }
  foreach (i = 1:ncol(function_var4)) %dofuture% loopcode(i)
  # Close multisession workers by switching plan
  plan(sequential)
  local_var5 = myFunction2(local_local_var1, function_var5)
  return(list(opt = unlist(local_var1),
              nw = unlist(local_var2),
              miv = unlist(local_var5)))
}
However, there is still a problem, because the function always returns empty results:
$opt
NULL
$nw
NULL
$miv
numeric(0)
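One plausible reason for the empty result, offered as an assumption rather than something checked against doFuture's internals: under plan(multisession) each iteration runs in a separate R process, so <<- updates a copy of the enclosing environment on the worker, not the one in the calling session. The sketch below illustrates that behaviour with the future package alone; demo_super_assign, count and bump are hypothetical names used only for this illustration.

```r
library(future)

# Hypothetical demo: a closure that super-assigns into its enclosing frame,
# evaluated on a multisession worker.
demo_super_assign <- function() {
  count <- 0
  bump <- function() {
    count <<- count + 1   # modifies the copy of `count` that lives on the worker
    count
  }
  f <- future(bump())     # `bump` and its environment are shipped to the worker
  worker_value <- value(f)
  # Return what the worker saw and what the calling session still holds
  list(worker = worker_value, parent = count)
}

plan(multisession, workers = 2)
res <- demo_super_assign()
plan(sequential)

print(res)
```

Comparing res$worker with res$parent shows whether the assignment made on the worker reached the calling session; with separate worker processes it is not expected to.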
I don't know the foreach or doFuture packages well enough to say whether this can be fixed.
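A commonly used pattern that avoids assigning to outer variables altogether is to let each iteration return its results and have foreach collect them. The following is a minimal sketch of that idea, not a verified fix for the code above. It makes several assumptions: myFunction3_sketch is a hypothetical name, function_var5[[function_var6]] stands in for pull(), the prediction grid is shortened to 100 points, the miv element is left out, and the choice of plan() is left to the caller.

```r
library(foreach)
library(doFuture)

myFunction1 = function(x){
  y = x + 1
  return(y)
}

# Hypothetical variant of myFunction3: nothing is assigned to outer variables
# inside the loop; each iteration returns a small list instead.
myFunction3_sketch = function(function_var4, function_var5, function_var6){
  local_var3 = function_var5[[function_var6]]   # base R stand-in for pull()
  local_var4 = data.frame(Var = seq(min(local_var3), max(local_var3), length.out = 100))

  # foreach returns a list with one element per iteration
  res = foreach(i = seq_len(ncol(function_var4))) %dofuture% {
    data_glm = data.frame(Var = local_var3,
                          PreAbs = function_var4[, i])
    mod_glm = glm(PreAbs ~ poly(Var, 3), family = binomial, data = data_glm)
    pred = predict(mod_glm, newdata = local_var4, se.fit = FALSE, type = "response")
    # The value of the last expression is what the iteration hands back
    list(opt = mean(pred), nw = myFunction1(pred))
  }

  # Assemble the per-iteration results in the calling session
  list(opt = vapply(res, function(r) r$opt, numeric(1)),
       nw = unlist(lapply(res, function(r) r$nw)))
}

# Usage with data shaped like the question's example (smaller, for speed)
set.seed(1)
global_var1 = as.data.frame(matrix(sample(c(0, 1), size = 1000, replace = TRUE), ncol = 10))
global_var2 = data.frame(C1 = rnorm(100))
result = myFunction3_sketch(global_var1, global_var2, "C1")
str(result)
```

Calling plan(multisession) before myFunction3_sketch() would run the iterations on parallel workers; without it, the same code runs sequentially, which is convenient for checking the logic first.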