pmap_dfr inside a larger function cannot find an object created in that function


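I have two functions. The primary function brings in data, manipulates it, passes it to a secondary function (using pmap_dfr) for more processing, collects those values, and returns the results. The only problem is that pmap_dfr doesn't seem to find the object I created inside the primary function. I'm guessing this is a scoping issue, but I don't know how to fix it without pushing the processed data into the global environment. Any suggestions?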

Reprex

library(dplyr); library(purrr); library(tidyr); library(rlang) # sym() comes from rlang

df1 <- data.frame(x = rnorm(100),
                  y = rnorm(100, 1, .3),
                  z = rnorm(100, 15, 3))

#secondary function
measures <- function(data_name = "df2", col_name_diff = "diff_xy"){
  dataf <- eval(sym(data_name))

  mean = (sum(dataf[col_name_diff], na.rm = T))/nrow(dataf)
  variance = sd(unlist(dataf[col_name_diff]), na.rm = T)^2
  
  x <- data.frame(mean = mean, 
                   variance = variance)  
  
  x
}

#primary function
aggregate_results <- function(dataset_name = "df1"){
  dataf <- eval(sym(dataset_name))
  
  df2 <- dataf %>% 
    mutate(diff_xy = x-y,
           diff_yx = y-x,
           diff_yz = y-z)
 
  data_name <- "df2"
  col_name_diff <- df2 %>% select(contains("diff")) %>% names
  params <- crossing(data_name, col_name_diff)
  
  results <- pmap_dfr(.f = measures, .l = params)
}

aggregate_results()

#get
Error: object 'df2' not found

#want
name       mean variance
1 diff_xy  -1.019687 1.164101
2 diff_yx   1.019687 1.164101
3 diff_yz -14.237093 9.755626
r function pmap
1 Answer

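Normally one passes the object itself rather than its name. If there is a good reason to pass the name, then also pass the environment that holds that name; otherwise, objects referenced in a function are looked up in the environment where the function was defined, not in the caller. In the revised version of the question's code, lines marked ## have been added or modified. The first line so marked is there to make the code reproducible.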
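The lookup rule described above can be seen in isolation with a small base R sketch. The helper names `lookup_default`, `lookup_in`, and `caller` are hypothetical and not part of the question's code; the sketch also assumes no object called `local_obj` exists in the global environment.

```r
# Hypothetical helper: looks a name up without an explicit environment.
lookup_default <- function(name) {
  # get() starts in this function's own frame, then follows the enclosing
  # environment, i.e. where the function was defined (here: global).
  get(name)
}

# Hypothetical helper: accepts the environment to search, defaulting to
# the frame of whoever called it.
lookup_in <- function(name, envir = parent.frame()) {
  get(name, envir = envir)
}

caller <- function() {
  local_obj <- 42  # exists only inside caller()'s frame
  list(
    # Fails: caller()'s frame is not on lookup_default()'s search path.
    default = tryCatch(lookup_default("local_obj"),
                       error = function(e) "not found"),
    # Works: the search starts in caller()'s frame.
    explicit = lookup_in("local_obj")
  )
}

out <- caller()
```

When the helper is reached through an intermediary such as pmap_dfr, `parent.frame()` is not guaranteed to be the frame that holds the object, which is presumably why the revised code passes `envir = environment()` explicitly instead of relying on the default.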

library(dplyr); library(purrr); library(tidyr); set.seed(12) ##

df1 <- data.frame(x = rnorm(100),
                  y = rnorm(100, 1, .3),
                  z = rnorm(100, 15, 3))

measures <- function(data_name = "df2", col_name_diff = "diff_xy", 
    envir = parent.frame()){ ##

  dataf <- get(data_name, envir) ##
  mean = (sum(dataf[col_name_diff], na.rm = T))/nrow(dataf)
  variance = sd(unlist(dataf[col_name_diff]), na.rm = T)^2
  
  x <- data.frame(mean = mean, 
                   variance = variance)
  x
}

aggregate_results <- function(dataset_name = "df1", envir = parent.frame()){ ##

  dataf <- get(dataset_name, envir) ##
  df2 <- dataf %>% 
    mutate(diff_xy = x-y,
           diff_yx = y-x,
           diff_yz = y-z)
  data_name <- "df2"
  col_name_diff <- df2 %>% select(contains("diff")) %>% names
  params <- crossing(data_name, col_name_diff)
  
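  # Return the value visibly; a trailing assignment would return it invisibly
  # and the top-level call would print nothing.
  pmap_dfr(.f = measures, .l = params, envir = environment()) ##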
}

aggregate_results()
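
For comparison, the approach the answer calls the normal one (passing the object itself rather than its name) might look like the sketch below. The names `measures_df` and `aggregate_results_df` are hypothetical rewrites, not the original poster's functions, and the `name` column is added only as a guess at the layout shown under `#want`.

```r
library(dplyr)
library(purrr)

set.seed(12)
df1 <- data.frame(x = rnorm(100),
                  y = rnorm(100, 1, .3),
                  z = rnorm(100, 15, 3))

# Hypothetical variant of measures(): receives the data frame directly,
# so no name lookup (and no environment argument) is needed.
measures_df <- function(col_name_diff, dataf) {
  values <- dataf[[col_name_diff]]
  data.frame(name = col_name_diff,
             mean = sum(values, na.rm = TRUE) / nrow(dataf),
             variance = var(values, na.rm = TRUE))  # same as sd()^2
}

# Hypothetical variant of aggregate_results(): takes the data frame itself.
aggregate_results_df <- function(dataf) {
  df2 <- dataf %>%
    mutate(diff_xy = x - y,
           diff_yx = y - x,
           diff_yz = y - z)
  col_names <- df2 %>% select(contains("diff")) %>% names()
  # df2 is forwarded to every call through map_dfr's `...`
  map_dfr(col_names, measures_df, dataf = df2)
}

res <- aggregate_results_df(df1)
res
```

Because `df2` travels as an ordinary argument, it never has to be visible from the helper's defining environment, and nothing is written to the global environment.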