I have the following data frame:
set.seed(1)
df <- data.frame(X1 = sample(c(letters[1:5], NA), 10, replace = TRUE),
                 X2 = sample(c(letters[1:5], NA), 10, replace = TRUE),
                 X3 = sample(c(letters[1:5], NA), 10, replace = TRUE),
                 stringsAsFactors = FALSE)
X1 X2 X3
1 b b <NA>
2 c b b
3 d e d
4 <NA> c a
5 b e b
6 <NA> c c
7 <NA> e a
8 d <NA> c
9 d c <NA>
10 a e c
I want to replace a with 5, b with 4, c with 3, d with 2, and e with 1, using:
df %>% lapply(., plyr::mapvalues(, c("a","b","c","d","e"), c(5,4,3,2,1)))
But it doesn't work: I get a warning that the first argument of mapvalues() is missing. Does anyone know what I'm doing wrong?
A simple and direct approach:
lookup <- 5:1
names(lookup) <- c("a","b","c","d","e")
df[] <- lapply(df, function(x) lookup[x])
df
X1 X2 X3
1 4 4 NA
2 3 4 4
3 2 1 2
4 NA 3 5
5 4 1 4
6 NA 3 3
7 NA 1 5
8 2 NA 3
9 2 3 NA
10 5 1 3
Note that lookup is a simple named vector, i.e.
> lookup
a b c d e
5 4 3 2 1
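and df[] ensures that the data frame structure is preserved when the result of lapply is assigned back. Inside the lapply call, the values in each column are used only to look up entries by name in the lookup table. To illustrate, lookup["c"] returns the value 3.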
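To make the name-based lookup concrete, here is a minimal sketch in base R. The vector `x` is a made-up example column, not part of the original data. It shows that indexing a named vector with a character vector returns one element per input value, and that an `NA` input comes back as `NA`. The `unname()` call is optional and only drops the names carried over by the lookup.

```r
# Named lookup vector: names are the old values, elements are the new ones.
lookup <- c(a = 5L, b = 4L, c = 3L, d = 2L, e = 1L)

# Hypothetical example column, including a missing value.
x <- c("b", NA, "e", "a")

# Character indexing matches each value of x against names(lookup).
recoded <- lookup[x]

# The result keeps names; unname() strips them if they are not wanted.
recoded_plain <- unname(recoded)
print(recoded_plain)
```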
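With lapply, the syntax is slightly different: pass the function itself and supply its other arguments by name. Here is how it works: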
df %>% lapply(plyr::mapvalues, from = c("a","b","c","d","e"), to = c(5,4,3,2,1))
$X1
[1] "4" "3" "2" NA "4" NA NA "2" "2" "5"
$X2
[1] "4" "4" "1" "3" "1" "3" "1" NA "3" "1"
$X3
[1] NA "4" "2" "5" "4" "3" "5" "3" NA "3"
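The attempt in the question fails because `plyr::mapvalues(, ...)` calls the function immediately, with no first argument, instead of handing the function to `lapply`. The sketch below illustrates the forwarding mechanism using only base R. Here `recode_values` is a hypothetical stand-in for `plyr::mapvalues`, built on `match()`, and the small data frame `toy` is made up for the example.

```r
# Hypothetical stand-in for plyr::mapvalues, written with base R's match().
recode_values <- function(x, from, to) {
  idx <- match(x, from)          # position of each x in `from`, NA if absent
  found <- !is.na(idx)
  x[found] <- to[idx[found]]     # numeric `to` is coerced to character here
  x
}

# Made-up example data with a missing value.
toy <- data.frame(X1 = c("a", "b", NA), X2 = c("e", "c", "d"),
                  stringsAsFactors = FALSE)

# lapply receives the function itself; `from` and `to` are forwarded via `...`.
by_dots <- lapply(toy, recode_values, from = letters[1:5], to = 5:1)

# Equivalent form with an explicit anonymous function.
by_lambda <- lapply(toy, function(col) recode_values(col, letters[1:5], 5:1))

str(by_dots)
```

Both forms give the same list. The anonymous-function form is handy when the column is not the first argument of the function being applied.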
If you still want a data frame afterwards, it is better to use apply rather than lapply:
df %>% apply(2, plyr::mapvalues, from = c("a","b","c","d","e"), to = c(5,4,3,2,1)) %>%
  as.data.frame(stringsAsFactors = FALSE)
X1 X2 X3
1 4 4 <NA>
2 3 4 4
3 2 1 2
4 <NA> 3 5
5 4 1 4
6 <NA> 3 3
7 <NA> 1 5
8 2 <NA> 3
9 2 3 <NA>
10 5 1 3
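Note that the mapvalues-based results are character strings (the list output is quoted, and missing values print as <NA>), because the replacements are written into character columns. If numeric columns are wanted, one option is to recode with `match()` directly. The following base R sketch uses a made-up data frame `toy` and does not use plyr, so it is an alternative illustration rather than the method shown above.

```r
# Made-up example data; the same pattern should apply to the original df.
toy <- data.frame(X1 = c("b", NA, "d"), X2 = c("e", "c", "a"),
                  stringsAsFactors = FALSE)

from <- c("a", "b", "c", "d", "e")
to   <- c(5, 4, 3, 2, 1)

# match() gives the position of each value in `from`; indexing `to` with it
# yields numeric results directly, and NA inputs stay NA.
toy[] <- lapply(toy, function(col) to[match(col, from)])

str(toy)
```

Assigning into `toy[]` keeps the data frame structure, so no conversion back from a matrix or list is needed.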