In R: is there a way to flag overlapping date ranges within each group of a table (i.e. by patient ID)?


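I have a set of hospital admission and discharge dates, broken down by patient ID. Each ID has several date ranges, and some of them overlap. I am trying to find a way to flag which rows contain overlapping dates, so that days are not double-counted when I calculate length of stay.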

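So far I have created an interval variable (admission date to discharge date) and used int_overlaps to flag the rows where an overlap exists. This mostly works, but in addition to flagging overlaps it also flags consecutive stays, where one stay ends on the day the next one begins.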

That is, I want to flag:

Stay A: 2001-10-03 / 2001-10-06

Stay B: 2001-10-04 / 2001-10-11

But I do not want to flag:

Stay A: 2001-10-03 / 2001-10-06

Stay B: 2001-10-06 / 2001-10-11

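The code I am using was copied from an answer elsewhere on this site, and I do not understand it well enough to modify it correctly (I am pretty much new to R!).

Here is a simplified example of the df and the code. If anyone can suggest how to change it so that it stops flagging consecutive stays, I would be very grateful!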

library(lubridate)

ID <- c(1, 1, 2, 3, 3, 3, 4, 5, 5, 5, 5)
admdate <- c("2001-10-03", "2001-10-05", "2003-10-04", "2006-02-03", "2006-05-27", "2006-07-01", "2001-08-02", "2008-10-11", "2008-11-01", "2009-01-09", "2009-02-18")
dischdate <- c("2001-10-05", "2001-12-08", "2003-10-04", "2006-05-29", "2006-06-01", "2006-07-07", "2001-08-11", "2008-10-14", "2009-01-13", "2009-01-21", "2009-02-26")

HospAdms <- data.frame(ID, admdate, dischdate)

# as_date() returns a new vector, so assign the result back to the column
HospAdms$admdate <- as_date(HospAdms$admdate)
HospAdms$dischdate <- as_date(HospAdms$dischdate)

HospAdms$Int <- interval(start=HospAdms$admdate, end=HospAdms$dischdate)

# Within each ID, compare every interval with every other one; each interval
# overlaps itself, hence "> 1". Intervals that only share an endpoint also count.
HospAdms$overlap <- unlist(tapply(HospAdms$Int,
                                 HospAdms$ID,
                                 function(x) rowSums(outer(x,x,int_overlaps))>1))

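In the df produced by this example code, the first two rows are consecutive stays, yet they are flagged, which I do not want. I hope that makes sense!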

Answers

Does this answer your question?

library(data.table)
admissions <- data.table(
  ID = c(1, 1, 2, 3, 3, 3, 4, 5, 5, 5, 5),
  admdate = c("2001-10-03", "2001-10-05", "2003-10-04", "2006-02-03", "2006-05-27", "2006-07-01", "2001-08-02", "2008-10-11", "2008-11-01", "2009-01-09", "2009-02-18"),
  dischdate = c("2001-10-05", "2001-12-08", "2003-10-04", "2006-05-29", "2006-06-01", "2006-07-07", "2001-08-11", "2008-10-14", "2009-01-13", "2009-01-21", "2009-02-26")
  )

# Non equi joins are only possible with numeric fields, so convert the date strings
admissions[,c('start','end'):=.(as.POSIXct(admdate),
                                as.POSIXct(dischdate))]

# Self-join: keep pairs where another stay of the same patient started earlier
# and ended strictly after this stay began. Joining on ID stops stays of
# different patients from matching each other.
admissions[admissions, on = .(ID, start<start,end>start ),nomatch = NULL]
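
The join above only keeps a pair of stays when one starts strictly before the other and ends strictly after the other's start, which is why back-to-back stays drop out. The same strict comparison can be sketched in base R as a per-row flag. Note that `flag_overlaps` is a hypothetical helper name, not a function from lubridate or data.table, and the sketch assumes the dates are already of class Date.

```r
# Sample data from the question, converted to Date
ID <- c(1, 1, 2, 3, 3, 3, 4, 5, 5, 5, 5)
admdate <- as.Date(c("2001-10-03", "2001-10-05", "2003-10-04", "2006-02-03",
                     "2006-05-27", "2006-07-01", "2001-08-02", "2008-10-11",
                     "2008-11-01", "2009-01-09", "2009-02-18"))
dischdate <- as.Date(c("2001-10-05", "2001-12-08", "2003-10-04", "2006-05-29",
                       "2006-06-01", "2006-07-07", "2001-08-11", "2008-10-14",
                       "2009-01-13", "2009-01-21", "2009-02-26"))

# Hypothetical helper: TRUE for rows whose stay strictly overlaps another
# stay of the same patient.
flag_overlaps <- function(id, start, end) {
  flagged <- logical(length(id))
  for (rows in split(seq_along(id), id)) {
    s <- as.numeric(start[rows])
    e <- as.numeric(end[rows])
    # overlap[i, j]: stay i starts before stay j ends, and stay j starts
    # before stay i ends. The strict "<" means a discharge on the day of
    # the next admission does not count as an overlap.
    overlap <- outer(s, e, "<") & t(outer(s, e, "<"))
    diag(overlap) <- FALSE  # a stay is never compared with itself
    flagged[rows] <- rowSums(overlap) > 0
  }
  flagged
}

overlap_flag <- flag_overlaps(ID, admdate, dischdate)
data.frame(ID, admdate, dischdate, overlap_flag)
```

With the sample data this should leave the two consecutive stays of patient 1 unflagged while flagging the overlapping pairs for patients 3 and 5.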


A data.table approach will give you the total length of stay for each ID, accounting for both gaps and overlaps:

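library(data.table)

# seq.Date() needs Date inputs, so convert the date columns first
setDT(HospAdms)[, c("admdate", "dischdate") := .(as.Date(admdate), as.Date(dischdate))]

# Expand each stay into its calendar days, then count the distinct days per ID
HospAdms[, .(dates = seq.Date(admdate, dischdate, 'day')) , by = .(ID, 1:nrow(HospAdms))
  ][, .(LOS = uniqueN(dates)), by = ID][]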

Output

   ID LOS
1:  1  67
2:  2   1
3:  3 126
4:  4  10
5:  5  95
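
To see what the deduplication buys, here is a minimal base R sketch that compares a naive sum of per-stay day counts with the count of distinct calendar days. It assumes, as the output above implies, that both the admission day and the discharge day count towards a stay. The names `naive_los` and `distinct_los` are illustrative only.

```r
ID <- c(1, 1, 2, 3, 3, 3, 4, 5, 5, 5, 5)
admdate <- as.Date(c("2001-10-03", "2001-10-05", "2003-10-04", "2006-02-03",
                     "2006-05-27", "2006-07-01", "2001-08-02", "2008-10-11",
                     "2008-11-01", "2009-01-09", "2009-02-18"))
dischdate <- as.Date(c("2001-10-05", "2001-12-08", "2003-10-04", "2006-05-29",
                       "2006-06-01", "2006-07-07", "2001-08-11", "2008-10-14",
                       "2009-01-13", "2009-01-21", "2009-02-26"))

# Naive total: add up the inclusive day count of every stay, so any day
# shared by two stays is counted twice.
naive_los <- tapply(as.numeric(dischdate - admdate) + 1, ID, sum)

# Deduplicated total: list every calendar day of every stay (as day numbers)
# and count each distinct day once per patient.
distinct_los <- vapply(split(seq_along(ID), ID), function(rows) {
  days <- unlist(Map(seq, as.numeric(admdate[rows]), as.numeric(dischdate[rows])))
  length(unique(days))
}, integer(1))

data.frame(ID = names(distinct_los),
           naive = as.vector(naive_los),
           distinct = as.vector(distinct_los))
```

The `distinct` column should reproduce the LOS values shown above, and the `naive` column is larger wherever stays overlap or share a boundary day.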