用于在R中使用for循环计算梯度的函数

问题描述 投票:0回答:1

我正在尝试编写一个函数来计算R中的梯度。该函数必须特别使用for循环来执行此操作。我编写此函数是为了说明在尝试计算梯度时for循环如何不如矢量化编程有效。

该函数接受设计矩阵X和系数β的矢量,以计算成本函数的梯度(设计矩阵是第一列上带有协变量的协变量矩阵。)

成本函数是MSE,

enter image description here

。我正在使用解析函数计算梯度,并采用损失函数的偏导数。这使我们的偏导数为,

enter image description here

对于第一个系数$ \ beta_ {0} $,然后类似地对于所有其他系数$ \ beta_ {j} $,

enter image description here

我设法使用矢量化编程作为下面的代码行来计算答案(实现上述步骤。

-2*t(X)%*%(y-X%*%beta)

我尝试使用for循环,没有用,但是看起来像这样,

  # Initialize the matrix to hold gradient
  gradient = matrix(0, nrow = nrow(beta))

  for(i in 1:nrow(beta)){

    if(i == 1){
      gradient[i] = -2*sum(y - X%*%beta) # first value
    }else if(i>1){
      gradient[i] = -2*sum( t(X)%*%(y - X%*%beta) ) * apply(X[,-1],2,sum)[i-1]  
    } }

下面是生成要使用的数据的代码,以及我在R中尝试过的两个实现。该代码生成数据,并且可以用于测试是否将for循环复制到R中。

# Values
# Random data Generated
n = 4  
p = 3

beta_0  = 2
beta = matrix(c(beta_0,3,1,4), nrow= (p+1) ) # coefficients
X = runif(n=(n*p), min=-5,max= 5) # covariates
X  = matrix(X, nrow = n, ncol = p)
X = cbind(1, X) # make a design matrix
y = apply(X[,-1],1,mean)  # Response (Some function) # 

# Print all to show all initial values 
print(list("Initial Values:"="","n:"=n," p:" = p, "beta_0:" = beta_0," beta:"=beta,
           " X:"=X," y:" = y))



#             Function 1
# Find the gradient (using a for loop)
# The partial derivative of the loss function
df_ols = function(beta,X,y){

  begin.time <- proc.time() 
  # Initialize the matrix to hold gradient
  gradient = matrix(0, nrow = nrow(beta))

  for(i in 1:nrow(beta)){

    if(i == 1){
      gradient[i] = -2*sum(y - X%*%beta) # first value
    }else if(i>1){
      gradient[i] = -2*sum( t(X)%*%(y - X%*%beta) ) * apply(X[,-1],2,sum)[i-1]  
    } }

  end.time <- proc.time()
  time <- (end.time-begin.time)[3]

  print(list("gradient 1"=gradient,"time"=time))
}

df_ols(beta,X,y)

# Function 2
# Find the gradient Approach 2 using vectorized programming
gradient_3 <- function(X, beta){

  begin.time <- proc.time()

  # Finding the gradient
  grad_3 <- -2*t(X)%*%(y-X%*%beta)
  grad_3 <- matrix(grad_3, ncol = 1,nrow = ncol(X)) # Turn into a column matrix
  end.time <- proc.time()

  time <- (end.time-begin.time)[3]

  print(list("gradient 3"= grad_3 ,"time"=time))

}

gradient_3(X, beta) # Assumed Correct

如果我不太罗word,我深表歉意。任何帮助,将不胜感激。

r for-loop math optimization statistics
1个回答
0
投票

我设法使for循环起作用,下面您将看到答案。

# Find the gradient (using a for loop) explicit, element-wise formulations 
# The partial derivative of the loss function
df_ols = function(beta,X,y){

  begin.time <- proc.time() 
  # Initialize the matrix to hold gradient
  gradient = matrix(0, nrow = nrow(beta))

  for(i in 1:nrow(beta)){

    if(i == 1){
      gradient[i] = -2*sum(y - X%*%beta) # first value
    }else if(i>1){
      gradient[i]   =  -2*t(X[,i])%*%(y-X%*%beta)
      #gradient[i] = -2*sum( t(X)%*%(y - X%*%beta) ) * apply(X[,-1],2,sum)[i-1]  
    } }

  end.time <- proc.time()
  time <- (end.time-begin.time)[3]

  print(list("gradient 1"=gradient,"time"=time))
}

df_ols(beta,X,y)

,运行矢量化和元素化地层,我们发现矢量化过程要快得多。我们将在下面执行完整的过程。

# Random data Generated
n = 100000*12  
p = 3

beta_0  = 2
beta = matrix(c(beta_0,3,1,4), nrow= (p+1) ) # coefficients
X = runif(n=(n*p), min=-5,max= 5) # covariates
X  = matrix(X, nrow = n, ncol = p)
X = cbind(1, X) # make a design matrix
y = apply(X[,-1],1,mean)  # Response (Some function) 

# Print parameters. To show all initial values 
print(list("Initial Values:" = '',"n:"=n," p:" = p, "beta_0:" = beta_0," beta:"=beta,
           " X:"=round(X,digits=2)," y:" = round(y,digits=2)))


# Function 3
# Find the gradient Approach 3, using vectorized programming
gradient_3 <- function(X, beta,y){

  begin.time <- proc.time()

  # Find the partial derivative of the rest (j'th)
  grad_3 <- -2*t(X)%*%(y-X%*%beta)
  grad_3 <- matrix(grad_3, ncol = 1,nrow = ncol(X)) # Turn into a column matrix
  end.time <- proc.time()

  time <- (end.time-begin.time)[3]

  print(list("gradient 3"= grad_3 ,"time"=time))
}

显示结果

> df_ols(beta,X,y) # Elementwise
$`gradient 1`
         [,1]
[1,]  4804829
[2,] 53311879
[3,] 13471077
[4,] 73259191

$time
elapsed 
   3.59 
> gradient_3(X, beta,y) # Vectorized Programming solution
$`gradient 3`
         [,1]
[1,]  4804829
[2,] 53311879
[3,] 13471077
[4,] 73259191

$time
elapsed 
   0.89 
© www.soinside.com 2019 - 2024. All rights reserved.