How can I add a custom string to a character variable depending on a group variable?
Here is a dummy input dataset:
library(tidyverse)
unit <- c(50, 50, 40, 40, 30, 30, 20, 20, 10, 10)
id <- c("A100", "A101", "A102", "A103", "A100", "A101", "A102", "A103", "A101", "A100")
variation <- c("aaa1", "aaa1", "bbb1", "aaa2", "b1","a3", "a1", "b1", "a1", "b1" )
result <- c("Way1", "Way1", "Way2", "Way2", "Way3","Way1", "Way2", "Way3", "Way4", "Way1" )
data <- data.frame(id, variation, result, unit)
head(data)
# id variation result unit
# 1 A100 aaa1 Way1 50
# 2 A101 aaa1 Way1 50
# 3 A102 bbb1 Way2 40
# 4 A103 aaa2 Way2 40
# 5 A100 b1 Way3 30
# 6 A101 a3 Way1 30
# 7 A102 a1 Way2 20
# 8 A103 b1 Way3 20
# 9 A101 a1 Way4 10
# 10 A100 b1 Way1 10
Is it possible to add a custom string prefix to the "variation" column based on the "unit" column?
Here is the expected output:
# id variation result unit
# 1 A100 A1.aaa1 Way1 50
# 2 A101 A1.aaa1 Way1 50
# 3 A102 A2.bbb1 Way2 40
# 4 A103 A2.aaa2 Way2 40
# 5 A100 A3.b1 Way3 30
# 6 A101 A3.a3 Way1 30
# 7 A102 A4.a1 Way2 20
# 8 A103 A4.b1 Way3 20
# 9 A101 A5.a1 Way4 10
# 10 A100 A5.b1 Way1 10
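As you can see, rows that share the same "unit" value get the same custom string added to their "variation" value.
Solutions using dplyr and base R functions are preferred.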
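# Number each consecutive run of equal unit values (1, 2, 3, ...),
# then prefix variation with "A<run number>."
data |>
  mutate(variation = paste0("A", cumsum(unit != lag(unit, default = first(unit))) + 1, ".", variation))
Output: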
id variation result unit
1 A100 A1.aaa1 Way1 50
2 A101 A1.aaa1 Way1 50
3 A102 A2.bbb1 Way2 40
4 A103 A2.aaa2 Way2 40
5 A100 A3.b1 Way3 30
6 A101 A3.a3 Way1 30
7 A102 A4.a1 Way2 20
8 A103 A4.b1 Way3 20
9 A101 A5.a1 Way4 10
10 A100 A5.b1 Way1 10
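Since base R was also requested, here is a minimal sketch of the same idea without dplyr. It is an illustration, not part of the original answer. The names changed, run_id, value_id, variation_run and variation_value are made up for this example. It shows two ways to build the group number: one counts consecutive runs of "unit" (what the cumsum/lag pipeline above does), the other numbers the distinct "unit" values in order of first appearance.

```r
# Same dummy data as in the question.
unit <- c(50, 50, 40, 40, 30, 30, 20, 20, 10, 10)
id <- c("A100", "A101", "A102", "A103", "A100", "A101", "A102", "A103", "A101", "A100")
variation <- c("aaa1", "aaa1", "bbb1", "aaa2", "b1", "a3", "a1", "b1", "a1", "b1")
result <- c("Way1", "Way1", "Way2", "Way2", "Way3", "Way1", "Way2", "Way3", "Way4", "Way1")
data <- data.frame(id, variation, result, unit)

# Run-based index: mark rows where unit differs from the previous row
# (the first row always starts a run), then accumulate the marks.
changed <- c(TRUE, data$unit[-1] != data$unit[-nrow(data)])
run_id <- cumsum(changed)

# Value-based index: position of each unit among the distinct values,
# in order of first appearance.
value_id <- match(data$unit, unique(data$unit))

# Hypothetical column names, kept separate so both variants can be compared.
data$variation_run <- paste0("A", run_id, ".", data$variation)
data$variation_value <- paste0("A", value_id, ".", data$variation)
```

On this dataset both indices agree, because equal "unit" values are already adjacent. They differ when a value reappears later: for unit values 50, 40, 50 the run-based index gives 1, 2, 3, while the value-based index gives 1, 2, 1. Which one is appropriate depends on whether "same unit" should mean the same value or the same consecutive block.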