Conditional replacement in a vector based on another character vector in a data frame

Question

I have a data frame with a "mutation" column. The entries can be SNPs such as "C > A", insertions such as "+TTTAAG", or deletions such as "-CTTGA". For example:

**position** **mutation**
1234           C > A
1452           +TTTAAG
2734           -CTTGA

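I want R to search the mutation column for specific characters (">", "+", or "-") and write "SNP", "insertion", or "deletion" respectively into a new column of the data frame, so I would like the following result: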

**position** **mutation**  **mutation_type**
1234           C > A             SNP
1452           +TTTAAG         insertion
2734           -CTTGA           deletion

I tried the following:

mutation_type <- rep(NA, length(df$position))
df$mutation_type <- mutation_type # create a new column filled with NAs

Trying this:

while(grep(pattern = "-", df$mutation)){
  df$mutation_type <- "deletion"
}

This loop simply overwrites every cell in the mutation_type column. Can you suggest how to solve this?

Answer 1

A solution using grepl and ifelse:

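# Example data with one SNP, one insertion, and one deletion
genotype <- data.frame(position = 1:3,
                       mutation = c("C > A", "+TGCA", "-ACGT"))
# grepl() returns one TRUE/FALSE per row; anything without "+" or "-" is labelled SNP
genotype$mutation_type <- 
    ifelse(grepl("\\+", genotype$mutation), "Insertion", 
           ifelse(grepl("\\-", genotype$mutation), "Deletion", "SNP"))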

  position mutation mutation_type
1        1    C > A           SNP
2        2    +TGCA     Insertion
3        3    -ACGT      Deletion
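
As a possible alternative that stays closer to the attempt in the question, the NA column can be filled with logical indexing instead of a loop. This is only a minimal sketch: the data frame `df` below is hypothetical example data rebuilt from the question's table, and it assumes each mutation string contains only one of the three marker characters.

```r
# Hypothetical example data mirroring the table in the question
df <- data.frame(position = c(1234, 1452, 2734),
                 mutation = c("C > A", "+TTTAAG", "-CTTGA"),
                 stringsAsFactors = FALSE)

# Start with a column of missing values, as in the question
df$mutation_type <- NA_character_

# fixed = TRUE matches the pattern as a literal string, so "+" needs no escaping.
# Each assignment only touches the rows where grepl() returned TRUE.
df$mutation_type[grepl(">", df$mutation, fixed = TRUE)] <- "SNP"
df$mutation_type[grepl("+", df$mutation, fixed = TRUE)] <- "insertion"
df$mutation_type[grepl("-", df$mutation, fixed = TRUE)] <- "deletion"

print(df)
```

Rows that match none of the three characters stay NA, which makes unexpected entries easy to spot. If a string ever contained more than one marker, the later assignment would win.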
