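I'm sure there is a simple answer to this, but I have searched Stack Overflow and could not find a solution. It seems that a combination of the sapply and ifelse functions might do the job (but I'm not sure).
So I have a data frame made up of characters, except for one column that is numeric.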
#### Create dataframe which needs converting
df <- data.frame(Sample_1 = rep(letters[1:3], each = 3),
                 Sample_2 = rep("a", times = 9))
df$Number <- rep(seq(from = 1, to = 3, by = 1), times = 3)
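I want to convert the characters in this data frame to specific numbers. The number a character is converted to depends on the number in the last column. So the criteria are:
Here is a data frame that lays out this conversion: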
A <- c(30,20,10)
B <- c(35,25,15)
C <- c(40,30,20)
Conversion_df <- data.frame(A, B,C)
This is the desired output:
Final <- data.frame(Sample_1 = c(30,20,10,35,25,15,40,30,20),
                    Sample_2 = c(30,20,10,30,20,10,30,20,10))
Thanks in advance for any help.
First, we can create a function that uses nested ifelse calls to evaluate a sample:
valuate_sample <- function(x, y) {
  ifelse(y == 1, ifelse(x == 'a', 30, ifelse(x == 'b', 20, 10)),
  ifelse(y == 2, ifelse(x == 'a', 35, ifelse(x == 'b', 25, 15)),
  ifelse(y == 3, ifelse(x == 'a', 40, ifelse(x == 'b', 30, 20)), 0)))
}
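Because `ifelse` is vectorised, the function can be checked on plain vectors before it is used on the data frame. A minimal sketch in base R; the input vectors are made up for illustration:

```r
# Same nested-ifelse function as above, repeated so the snippet runs on its own
valuate_sample <- function(x, y) {
  ifelse(y == 1, ifelse(x == "a", 30, ifelse(x == "b", 20, 10)),
  ifelse(y == 2, ifelse(x == "a", 35, ifelse(x == "b", 25, 15)),
  ifelse(y == 3, ifelse(x == "a", 40, ifelse(x == "b", 30, 20)), 0)))
}

# Illustrative inputs: one letter paired with each Number value
checked <- valuate_sample(c("a", "b", "c"), c(1, 2, 3))
print(checked)  # 30 25 20
```

Two things follow from how the function is written: any letter other than "a" or "b" is treated like "c", and any `y` outside 1 to 3 returns 0.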
Then we just need to apply that function to your data frame:
library(dplyr)  # provides %>% and mutate()
df <- df %>%
  mutate(
    Sample_1 = valuate_sample(Sample_1, Number),
    Sample_2 = valuate_sample(Sample_2, Number)
  )
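Result: Sample_1 becomes 30, 35, 40, 20, 25, 30, 10, 15, 20, and Sample_2 becomes 30, 35, 40 repeated three times.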
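Note that the function above appears to read the conversion values with `Number` selecting the vector (`A`, `B` or `C`) and the letter selecting the position within it. The `Final` data frame in the question reads `Conversion_df` the other way round: the letter picks the column and `Number` picks the row. If that is the intended mapping, a lookup by matrix indexing may be simpler than nested `ifelse` calls. A sketch in base R, where `convert_sample` is a hypothetical helper name and not part of the original answer:

```r
# Data from the question
df <- data.frame(Sample_1 = rep(letters[1:3], each = 3),
                 Sample_2 = rep("a", times = 9))
df$Number <- rep(1:3, times = 3)
Conversion_df <- data.frame(A = c(30, 20, 10),
                            B = c(35, 25, 15),
                            C = c(40, 30, 20))

# Assumption: rows of the table correspond to Number,
# columns correspond to the upper-cased letter
conv <- as.matrix(Conversion_df)

# Hypothetical helper: look up one (row, column) pair per element
convert_sample <- function(x, n) {
  col_idx <- match(toupper(as.character(x)), colnames(conv))
  unname(conv[cbind(n, col_idx)])
}

result <- data.frame(Sample_1 = convert_sample(df$Sample_1, df$Number),
                     Sample_2 = convert_sample(df$Sample_2, df$Number))
print(result)
```

Under that assumption, `result` should match the `Final` data frame from the question, and new letters or `Number` values only require another column or row in `Conversion_df`.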
I also have a dplyr solution, but one that uses case_when, which may be a bit more transparent. The idea is taken from this answer: https://stackoverflow.com/a/24459900/5795592
library(dplyr)
df %>% mutate(
  # Sample_1
  Sample_1_conv = case_when(
    Number == 1 & Sample_1 == "a" ~ 30,
    Number == 1 & Sample_1 == "b" ~ 20,
    Number == 1 & Sample_1 == "c" ~ 10,
    Number == 2 & Sample_1 == "a" ~ 35,
    Number == 2 & Sample_1 == "b" ~ 25,
    Number == 2 & Sample_1 == "c" ~ 15,
    Number == 3 & Sample_1 == "a" ~ 40,
    Number == 3 & Sample_1 == "b" ~ 30,
    Number == 3 & Sample_1 == "c" ~ 20),
  # Sample_2
  Sample_2_conv = case_when(
    Number == 1 & Sample_2 == "a" ~ 30,
    Number == 1 & Sample_2 == "b" ~ 20,
    Number == 1 & Sample_2 == "c" ~ 10,
    Number == 2 & Sample_2 == "a" ~ 35,
    Number == 2 & Sample_2 == "b" ~ 25,
    Number == 2 & Sample_2 == "c" ~ 15,
    Number == 3 & Sample_2 == "a" ~ 40,
    Number == 3 & Sample_2 == "b" ~ 30,
    Number == 3 & Sample_2 == "c" ~ 20)
)
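Building on the code @skulden posted, you can also apply the valuate_sample function automatically to all the relevant columns (that is, the ones that hold the letters, which are encoded as factors in the data frame).
Here is the function from @skulden's answer above.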
valuate_sample <- function(x, y) {
  ifelse(y == 1, ifelse(x == 'a', 30, ifelse(x == 'b', 20, 10)),
  ifelse(y == 2, ifelse(x == 'a', 35, ifelse(x == 'b', 25, 15)),
  ifelse(y == 3, ifelse(x == 'a', 40, ifelse(x == 'b', 30, 20)), 0)))
}
And here is how to apply it to all of those columns.
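# Since R 4.0.0, data.frame() keeps strings as character, so check for both types
for (column in names(df)) {
  if (is.factor(df[, column]) || is.character(df[, column])) {
    df[, column] <- valuate_sample(df[, column], df[, 'Number'])
  }
}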
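The same idea can be written without an explicit loop, by flagging the text columns first and then converting them in one assignment. A sketch in base R, assuming the columns to convert are exactly those stored as character or factor; `is_text` is a name chosen here for illustration:

```r
# Function and data repeated so the snippet runs on its own
valuate_sample <- function(x, y) {
  ifelse(y == 1, ifelse(x == "a", 30, ifelse(x == "b", 20, 10)),
  ifelse(y == 2, ifelse(x == "a", 35, ifelse(x == "b", 25, 15)),
  ifelse(y == 3, ifelse(x == "a", 40, ifelse(x == "b", 30, 20)), 0)))
}
df <- data.frame(Sample_1 = rep(letters[1:3], each = 3),
                 Sample_2 = rep("a", times = 9))
df$Number <- rep(1:3, times = 3)

# Flag the columns that hold text (character in R >= 4.0.0, factor in older versions)
is_text <- vapply(df, function(col) is.character(col) || is.factor(col), logical(1))

# Convert every flagged column, passing Number as the second argument
df[is_text] <- lapply(df[is_text], valuate_sample, y = df$Number)
print(df)
```

The `Number` column is numeric, so it is not flagged and stays unchanged.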