Here is the situation: I have a list of data frames, and for each data frame I have a list of columns whose format I need to change. Setup:
df1 <- data.frame(a = c("2020-03-02", "2020-12-22", "2020-07-03"), b = c(4, 5, 6), c = c("2020-03-13", "2019-11-03", "2011-05-02"))
df2 <- data.frame(d = c(1, 2, 3), e = c("2020-05-21", "2014-08-31", "1999-01-21"), f = c(7, 8, 9))
datasets <- list("first" = df1, "second" = df2)
dates <- list("first" = c("a", "c"), "second" = c("e"))
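One way to do this is to (1) loop over the list of data frames and (2) for each data frame, loop over the list of columns to change and reassign them in place. Something like this: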
for (i in names(datasets)) {
  for (j in dates[[i]]) {
    # Assign the converted column back into the data frame stored in the list
    datasets[[i]][[j]] <- as.Date(datasets[[i]][[j]])
  }
}
This is ugly, so I wanted to try doing the same thing with purrr. I thought this would be a good idea:
library(purrr)
walk2(datasets, dates, ~ walk(.x[.y], ~ {.x <- as.Date(.x)}))
But after running this, the datasets remain unchanged. Why?
Here is a solution using purrr and dplyr:
library(purrr)
library(dplyr)
# imap() passes each data frame as .x and its list name as .y
datasets <- datasets %>%
  imap(~ {
    .x %>%
      mutate_at(vars(dates[[.y]]), as.Date)
  })
str(datasets)
#List of 2
#$ first :'data.frame': 3 obs. of 3 variables:
# ..$ a: Date[1:3], format: "2020-03-02" "2020-12-22" "2020-07-03"
# ..$ b: num [1:3] 4 5 6
# ..$ c: Date[1:3], format: "2020-03-13" "2019-11-03" "2011-05-02"
#$ second:'data.frame': 3 obs. of 3 variables:
# ..$ d: num [1:3] 1 2 3
# ..$ e: Date[1:3], format: "2020-05-21" "2014-08-31" "1999-01-21"
# ..$ f: num [1:3] 7 8 9
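As for the "why": `walk()` and `walk2()` call a function for its side effects and return their first argument invisibly, and the `.x <- as.Date(.x)` inside the lambda only rebinds a local variable that is discarded when the lambda returns. The following is a minimal sketch of that behaviour, rebuilding the same `df1`, `df2`, `datasets` and `dates` objects from the question. The names `unchanged` and `converted` are made up for illustration, and the `across(all_of(...))` form is an alternative to `mutate_at()` that assumes dplyr 1.0.0 or later.

```r
library(purrr)
library(dplyr)

# Same setup as in the question
df1 <- data.frame(a = c("2020-03-02", "2020-12-22", "2020-07-03"),
                  b = c(4, 5, 6),
                  c = c("2020-03-13", "2019-11-03", "2011-05-02"))
df2 <- data.frame(d = c(1, 2, 3),
                  e = c("2020-05-21", "2014-08-31", "1999-01-21"),
                  f = c(7, 8, 9))
datasets <- list(first = df1, second = df2)
dates <- list(first = c("a", "c"), second = "e")

# walk2() returns its first argument as-is; the assignment inside the
# inner lambda only changes a local copy of the column
unchanged <- walk2(datasets, dates, ~ walk(.x[.y], ~ { .x <- as.Date(.x) }))
inherits(unchanged$first$a, "Date")  # column was not converted

# map2() collects the value returned for each data frame, so the
# converted data frames can be assigned to a new (or the same) name
converted <- map2(datasets, dates, ~ mutate(.x, across(all_of(.y), as.Date)))
inherits(converted$first$a, "Date")  # column is now a Date
```

In short, the result has to be returned from the mapping function and assigned back (for example `datasets <- map2(...)`), which is also what the `imap()` solution above does.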