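Let's consider a data frame with a column "before" stored as character.
How can I convert detection limits such as "<0.5" to a number with four decimals, "0.4999", and limits of linearity such as ">15.0" to "15.0001", while keeping text strings such as "Value > limit" intact?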
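Edit
Please note that my data frame contains thousands of rows, with dozens of different detection limits, linearity limits, and text strings; a general rule would therefore be preferable to searching for each limit/string and typing it into the script one by one.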
dat0 <-
structure(list(before = c("6.1", "<0.5", "4.7", ">15.0", "Value > limit",
"8.0", "Result < cutoff", "6.5", "<50", "92", ">500", "480",
"Value > linearity"), after = c("6.1", "0.4999", "4.7", "15.0001",
"Value > limit", "8.0", "Result < cutoff", "6.5", "49.9999",
"92", "500.0001", "480", "Value > linearity")), class = "data.frame", row.names = c(NA,
-13L))
Thanks for your help.
A base R option is to use lapply and grepl to test for < or >:
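dat0[] <- lapply(dat0, \(x) {
  lt <- grepl("^<\\s*\\d+(\\.\\d+)?$", x)  # detection limits such as "<0.5"
  gt <- grepl("^>\\s*\\d+(\\.\\d+)?$", x)  # linearity limits such as ">15.0"
  # shift each limit by 0.0001 instead of hard-coding one value per limit
  x[lt] <- sprintf("%.4f", as.numeric(sub("^<\\s*", "", x[lt])) - 1e-4)
  x[gt] <- sprintf("%.4f", as.numeric(sub("^>\\s*", "", x[gt])) + 1e-4)
  x})
#               before             after
# 1                6.1               6.1
# 2             0.4999            0.4999
# 3                4.7               4.7
# 4            15.0001           15.0001
# 5      Value > limit     Value > limit
# 6                8.0               8.0
# 7    Result < cutoff   Result < cutoff
# 8                6.5               6.5
# 9            49.9999           49.9999
# 10                92                92
# 11          500.0001          500.0001
# 12               480               480
# 13 Value > linearity Value > linearity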
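Since the real data contain dozens of different limits, the same idea could be wrapped in a small reusable function. The sketch below is only one possible way to do it: the name convert_limits and its digits argument are hypothetical (not from the question or the answer), and it assumes that a limit is always a "<" or ">" followed directly by a plain number with a decimal point, with nothing else in the cell.

```r
# Hypothetical helper: the name and the `digits` argument are illustrative only.
convert_limits <- function(x, digits = 4) {
  eps <- 10^(-digits)                               # 0.0001 when digits = 4
  # the whole string must be "<number" or ">number"; free text is left untouched
  is_limit <- grepl("^[<>]\\s*\\d+(\\.\\d+)?$", x)
  sign  <- ifelse(startsWith(x[is_limit], "<"), -1, 1)
  value <- as.numeric(sub("^[<>]\\s*", "", x[is_limit]))
  x[is_limit] <- sprintf(paste0("%.", digits, "f"), value + sign * eps)
  x
}

before <- c("6.1", "<0.5", ">15.0", "Value > limit", "<50", ">500")
convert_limits(before)
# [1] "6.1"           "0.4999"        "15.0001"       "Value > limit"
# [5] "49.9999"       "500.0001"
```

Applied to the data frame, this would be dat0$before <- convert_limits(dat0$before). Values written with a decimal comma or in scientific notation would not match the pattern and would be left as they are.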