How do I un-normalize a variable after using a recipe in R?


I am training a neuralnet model with caret's train function and pre-processing the data with recipes.

Is there a function that makes predictions from the model and then rescales them back to the original range, which in my case is [1, 100]?

library(caret)
library(recipes)
library(neuralnet)

# Create the dataset - times table 
tt <- data.frame(multiplier = rep(1:10, times = 10), multiplicand = rep(1:10, each = 10))
tt <- cbind(tt, data.frame(product = tt$multiplier * tt$multiplicand))

# Splitting 
indexes <- createDataPartition(tt$product,
                              times = 1,
                              p = 0.7,
                              list = FALSE)
tt.train <- tt[indexes,]
tt.test <- tt[-indexes,]

# Recipe to pre-process our data
rec_reg <- recipe(product ~ ., data = tt.train) %>%
  step_center(all_predictors()) %>% step_scale(all_outcomes()) %>%
  step_center(all_outcomes()) %>% step_scale(all_predictors())

# Train
train.control <- trainControl(method = "repeatedcv",
                              number = 10,
                              repeats = 3,
                              savePredictions = TRUE)

tune.grid <- expand.grid(layer1 = 8,
                         layer2 = 0,
                         layer3 = 0)

# Setting seed for reproducibility
set.seed(12)
tt.cv <- train(rec_reg,
               data = tt.train,
               method = 'neuralnet',
               tuneGrid = tune.grid,
               trControl = train.control,
               algorithm = 'backprop',
               learningrate = 0.005,
               lifesign = 'minimal')
1 Answer

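If you use step_normalize instead of step_scale and step_center, you can use the following functions to "un-normalize" based on the recipe. (If you prefer to normalize in two separate steps, you will need to adapt the unnormalize function.)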

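This function extracts an item (such as the stored means or standard deviations) from the relevant step of a prepped recipe.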

#' Extract step item
#'
#' Returns extracted step item from prepped recipe.
#'
#' @param recipe Prepped recipe object.
#' @param step Step from prepped recipe.
#' @param item Item from prepped recipe.
#' @param enframe Should the step item be enframed?
#'
#' @export
extract_step_item <- function(recipe, step, item, enframe = TRUE) {
  d <- recipe$steps[[which(purrr::map_chr(recipe$steps, ~ class(.)[1]) == step)]][[item]]
  if (enframe) {
    tibble::enframe(d) %>% tidyr::spread(key = 1, value = 2)
  } else {
    d
  }
}

This function reverses the normalization: it multiplies by the standard deviation and then adds the mean.

#' Unnormalize variable
#'
#' Unnormalizes variable using standard deviation and mean from a recipe object. See \code{?recipes}.
#'
#' @param x Numeric vector to unnormalize.
#' @param rec Recipe object.
#' @param var Variable name in the recipe object.
#'
#' @export
unnormalize <- function(x, rec, var) {
  var_sd <- extract_step_item(rec, "step_normalize", "sds") %>% dplyr::pull(var)
  var_mean <- extract_step_item(rec, "step_normalize", "means") %>% dplyr::pull(var)

  (x * var_sd) + var_mean
}
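
As a quick sanity check of the arithmetic that unnormalize relies on, here is a minimal base R sketch. The vector x and the variable names are made up for illustration; nothing here comes from the recipe itself.

```r
# Normalizing and un-normalizing are inverse operations.
# x is an arbitrary illustrative vector, not data from the question.
x <- c(1, 4, 9, 16, 100)

x_mean <- mean(x)
x_sd <- sd(x)

# Forward transform: centre on the mean, then divide by the standard deviation.
z <- (x - x_mean) / x_sd

# Inverse transform: multiply by the standard deviation, then add the mean.
x_back <- (z * x_sd) + x_mean

# Compare the round-tripped values with the originals (up to floating-point tolerance).
all.equal(x, x_back)
```

The same statistics must be used in both directions, which is why the function reads them from the prepped recipe rather than recomputing them from the predictions.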

So you should be able to generate predictions and then use:

unnormalize(predictions, prepped_recipe_obj, outcome_var_name)

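where predictions is the vector of predictions generated from the trained model, prepped_recipe_obj is rec_reg in your case (once it has been prepped with prep(), since the means and standard deviations are only stored after prepping), and outcome_var_name is product in your case.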
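
If you would rather keep separate step_center and step_scale steps, as in the question, one possible adaptation is to undo the steps in reverse order. The sketch below is only an illustration under stated assumptions: undo_center_scale is a hypothetical helper (not part of recipes), and it relies on prepped steps storing their statistics in the means and sds elements, the same elements the answer's functions read for step_normalize.

```r
library(recipes)

# Toy data mirroring the question: a 10 x 10 times table.
tt <- data.frame(multiplier = rep(1:10, times = 10),
                 multiplicand = rep(1:10, each = 10))
tt$product <- tt$multiplier * tt$multiplicand

# The outcome is scaled first and centred second, as in the question's recipe.
rec <- recipe(product ~ ., data = tt) %>%
  step_scale(all_outcomes()) %>%
  step_center(all_outcomes())

prepped <- prep(rec, training = tt)

# Hypothetical helper: walk the prepped steps backwards and invert each
# centring/scaling step that was applied to `var`.
undo_center_scale <- function(x, prepped, var) {
  for (step in rev(prepped$steps)) {
    if (inherits(step, "step_center") && var %in% names(step$means)) {
      x <- x + step$means[[var]]
    } else if (inherits(step, "step_scale") && var %in% names(step$sds)) {
      x <- x * step$sds[[var]]
    } else if (inherits(step, "step_normalize") && var %in% names(step$means)) {
      x <- (x * step$sds[[var]]) + step$means[[var]]
    }
  }
  x
}

# Apply the recipe, then undo it; the result should match the original outcome.
normalized <- bake(prepped, new_data = tt)$product
restored <- undo_center_scale(normalized, prepped, "product")
all.equal(restored, as.numeric(tt$product))
```

Reversing the step order matters here: because the outcome is scaled before it is centred, the stored mean is the mean of the already-scaled values, so it has to be added back before multiplying by the standard deviation.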
