Hi, I'm trying to learn vectorization in R.
I have the following code:
library(truncnorm)  # provides rtruncnorm()
set.seed(23)
obs_num <- 100
Observation <- seq_len(obs_num)
Location_Type1 <- sample(1:2, obs_num, replace = TRUE)
Location_Type2 <- sample(1:2, obs_num, replace = TRUE)
# The above does not lead to any errors
# Location_Type2 <- sample(1, obs_num, replace = TRUE)
# An error occurs when I use this line instead.
low_bound <- runif(obs_num, 0, 1)
mean <- runif(obs_num, 10, 15)
df1 <- data.frame(Observation, Location_Type1, Location_Type2, mean, low_bound)
Vectorized_function <- function(data) {
  # Create groups: one logical index per combination of the two location types
  i1 <- data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 1
  i2 <- data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 1
  i3 <- data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 2
  i4 <- data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 2
  # Draw values from a truncated normal, one draw per row in each group
  data[i1, "draw_value"] <- rtruncnorm(sum(i1), a = data[i1, "low_bound"], mean = data[i1, "mean"])
  data[i2, "draw_value"] <- rtruncnorm(sum(i2), a = data[i2, "low_bound"], mean = data[i2, "mean"])
  data[i3, "draw_value"] <- rtruncnorm(sum(i3), a = data[i3, "low_bound"], mean = data[i3, "mean"])
  data[i4, "draw_value"] <- rtruncnorm(sum(i4), a = data[i4, "low_bound"], mean = data[i4, "mean"])
  data
}
getvalue <- Vectorized_function(data = df1)
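In df1, the two columns Location_Type1 and Location_Type2 can each take the value 1 or 2. The code above works when all four combinations are present:
a) Location_Type1 = 1 & Location_Type2 = 1
b) Location_Type1 = 1 & Location_Type2 = 2
c) Location_Type1 = 2 & Location_Type2 = 1
d) Location_Type1 = 2 & Location_Type2 = 2
I'm trying to draw from a truncated normal distribution for each of these four conditions. In my actual data, not all four combinations will always occur.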
To reproduce this situation, suppose we change the following line in the code above:
Location_Type2 <- sample(1, obs_num, replace = TRUE)  # Location_Type2 now takes only one value
In this case, I get the following error message:
Error in rtruncnorm(sum(i3), a = data[i3, "low_bound"], mean = data[i3,  : length(a) > 0 is not TRUE
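I can see what is happening. Essentially, no observations satisfy conditions i3 and i4 (that is, sum(i3) and sum(i4) are both 0). In that case the lower-bound argument ("a" in the code) has length zero, which causes the problem.
Can anyone suggest how to handle these cases in my code? I would like the vectorized function to keep working when any of the conditions matches no rows.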
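One way to handle an empty group, sketched here under some assumptions, is to check the group size before calling rtruncnorm() and skip the draw when it is zero. The helper names draw_for_group and guarded_function and the demo data frame df_demo are hypothetical and introduced only for illustration; the column names are taken from the code above.

```r
library(truncnorm)  # provides rtruncnorm()

# Hypothetical helper: draw only for the rows selected by `idx`;
# an empty group is skipped, so rtruncnorm() never sees a zero-length `a`.
draw_for_group <- function(data, idx) {
  n <- sum(idx)
  if (n > 0) {
    data[idx, "draw_value"] <- rtruncnorm(
      n,
      a = data[idx, "low_bound"],
      mean = data[idx, "mean"]
    )
  }
  data
}

# Hypothetical wrapper: loop over the four type combinations.
guarded_function <- function(data) {
  data[["draw_value"]] <- NA_real_  # create the column up front
  for (t1 in 1:2) {
    for (t2 in 1:2) {
      idx <- data[["Location_Type1"]] == t1 & data[["Location_Type2"]] == t2
      data <- draw_for_group(data, idx)
    }
  }
  data
}

# Demo data in which Location_Type2 only ever takes the value 1,
# so the two groups with Location_Type2 == 2 are empty.
set.seed(23)
obs_num <- 100
df_demo <- data.frame(
  Location_Type1 = sample(1:2, obs_num, replace = TRUE),
  Location_Type2 = rep(1, obs_num),
  low_bound = runif(obs_num, 0, 1),
  mean = runif(obs_num, 10, 15)
)
result <- guarded_function(df_demo)
```

Unlike wrapping each call in try(), an explicit size check only skips the empty-group case, so any other error raised by rtruncnorm() would still surface.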
Following @akrun's comment, I adjusted the function as follows:
Vectorized_function <- function(data) {
  # Create groups: one logical index per combination of the two location types
  i1 <- data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 1
  i2 <- data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 1
  i3 <- data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 2
  i4 <- data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 2
  # Draw values; try() suppresses the error raised for an empty group
  data[i1, "draw_value"] <- try(rtruncnorm(sum(i1), a = data[i1, "low_bound"], mean = data[i1, "mean"]), silent = TRUE)
  data[i2, "draw_value"] <- try(rtruncnorm(sum(i2), a = data[i2, "low_bound"], mean = data[i2, "mean"]), silent = TRUE)
  data[i3, "draw_value"] <- try(rtruncnorm(sum(i3), a = data[i3, "low_bound"], mean = data[i3, "mean"]), silent = TRUE)
  data[i4, "draw_value"] <- try(rtruncnorm(sum(i4), a = data[i4, "low_bound"], mean = data[i4, "mean"]), silent = TRUE)
  data
}
This seems to work for now, and it takes care of the errors that arise when a group has no observations.
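Since all four branches in the function above call rtruncnorm() with the same arguments, it may be possible to avoid the grouping altogether. The following is a minimal sketch that assumes the groups really do share the same draw, and that rtruncnorm() accepts per-row vectors for "a" and "mean" as the grouped calls already rely on. The names draw_all_rows and df_single are hypothetical.

```r
library(truncnorm)  # provides rtruncnorm()

# Hypothetical alternative: one draw per row in a single call, no grouping.
# The nrow() check covers the case where the whole data frame is empty.
draw_all_rows <- function(data) {
  n <- nrow(data)
  data[["draw_value"]] <- if (n > 0) {
    rtruncnorm(n, a = data[["low_bound"]], mean = data[["mean"]])
  } else {
    numeric(0)
  }
  data
}

# Demo data with only one value of Location_Type2, as in the failing case.
set.seed(23)
df_single <- data.frame(
  Location_Type1 = sample(1:2, 100, replace = TRUE),
  Location_Type2 = rep(1, 100),
  low_bound = runif(100, 0, 1),
  mean = runif(100, 10, 15)
)
drawn <- draw_all_rows(df_single)
```

If the real data needs different distribution parameters per group, this shortcut would not apply directly, and a per-group size check would be the safer route.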