Change a variable based on subsequent values


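I want to reformat my data so that if a person tests negative after testing positive, the earlier positive test is changed to negative (it is treated as a false positive).

In my dataset, a positive test is recorded in the test column as equivocal, indeterminate, or positive.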

library(tidyverse)
library(data.table)

date=c("2023-01-01", "2023-02-07", "2023-02-20","2023-01-01", "2023-02-07", "2023-02-20", "2023-01-01", "2023-05-10", "2023-01-01", "2023-01-01", "2023-01-01", "2023-01-01", "2023-01-01", "2023-01-01", "2023-01-10", "2023-01-01", "2023-01-10")
ID=c("A", "A", "A","A2", "A2", "A2", "B", "B", "C", "D", "D", "D1", "D1", "E", "E", "F", "F")
test=c("negative", "equivocal", "negative", "negative", "indeterminate", "negative", "negative", "negative", "positive", "positive", "negative","indeterminate", "negative", "positive", "negative", "negative", "positive")
df=as.data.table(cbind(date, ID, test))  
df[, date := as.Date(date)] 

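So the tests highlighted below would all be converted to negative, because there is a negative test on the same day as, or after, the positive test.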

[Image: the data table with the affected tests highlighted]

r data.table
1 Answer
df[, test2 := df[.SD, on = .(ID, date >= date),
                 if (any(test == "negative")) "negative" else test,
                 by = .EACHI]$V1]
#           date     ID          test    test2
#         <Date> <char>        <char>   <char>
#  1: 2023-01-01      A      negative negative
#  2: 2023-02-07      A     equivocal negative
#  3: 2023-02-20      A      negative negative
#  4: 2023-01-01     A2      negative negative
#  5: 2023-02-07     A2 indeterminate negative
#  6: 2023-02-20     A2      negative negative
#  7: 2023-01-01      B      negative negative
#  8: 2023-05-10      B      negative negative
#  9: 2023-01-01      C      positive positive
# 10: 2023-01-01      D      positive negative
# 11: 2023-01-01      D      negative negative
# 12: 2023-01-01     D1 indeterminate negative
# 13: 2023-01-01     D1      negative negative
# 14: 2023-01-01      E      positive negative
# 15: 2023-01-10      E      negative negative
# 16: 2023-01-01      F      negative negative
# 17: 2023-01-10      F      positive positive
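The same rule can also be written without a non-equi join. A row has a negative test on the same day or later exactly when its date is not after the latest negative date for that ID. The following is only a sketch under that reading of the question. The names dt and last_neg are introduced here for illustration and do not appear in the original post.

```r
library(data.table)

# Rebuild the example data from the question (dt is an illustrative name).
dt <- data.table(
  date = as.Date(c("2023-01-01", "2023-02-07", "2023-02-20",
                   "2023-01-01", "2023-02-07", "2023-02-20",
                   "2023-01-01", "2023-05-10", "2023-01-01",
                   "2023-01-01", "2023-01-01", "2023-01-01",
                   "2023-01-01", "2023-01-01", "2023-01-10",
                   "2023-01-01", "2023-01-10")),
  ID = c("A", "A", "A", "A2", "A2", "A2", "B", "B", "C",
         "D", "D", "D1", "D1", "E", "E", "F", "F"),
  test = c("negative", "equivocal", "negative",
           "negative", "indeterminate", "negative",
           "negative", "negative", "positive",
           "positive", "negative", "indeterminate", "negative",
           "positive", "negative", "negative", "positive")
)

# last_neg (hypothetical helper column): latest negative test date per ID,
# or NA when the ID has no negative test at all.
dt[, last_neg := if (any(test == "negative")) max(date[test == "negative"])
                 else as.Date(NA),
   by = ID]

# A test becomes "negative" when a negative exists on the same day or later,
# i.e. when its date is not after last_neg; otherwise it keeps its value.
dt[, test2 := fifelse(!is.na(last_neg) & date <= last_neg, "negative", test)]

# Drop the helper column.
dt[, last_neg := NULL]
```

This version computes one value per ID and then compares every row against it, so each row always receives exactly one result regardless of how many later tests the ID has.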
    