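I'm looking for help with a problem I'm currently stuck on in a project (reprex at the end of the post).
Basically, I want to populate a variable with a criterion level based on how many times a patient records data in a week, in order to explore the quality of the recording.
The criteria for the levels are as follows: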
>3 readings/week == "4",
3 readings/week == "3",
2 readings/week == "2",
1 reading/week == "1",
NA == "0"
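I started by creating a new variable for the week using lubridate's week() function, which gives me the week number based on where the date falls in the year. Ideally, I'd like the weeks numbered in ascending order (1-n), starting at 1 for the patient's first recorded date and running through to the last.
I've been considering a for loop, but at the moment I'm using case_when. The problem I'm running into is setting up the condition that checks the frequency of readings per week for each patient id and then assigns the criterion level.
Any help would be appreciated, as I've been stuck on this for most of the day. Thanks very much (reprex below).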
library(lubridate)
#> Warning: package 'lubridate' was built under R version 3.5.3
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.5.3
#> Warning: package 'ggplot2' was built under R version 3.5.3
#> Warning: package 'tibble' was built under R version 3.5.3
#> Warning: package 'tidyr' was built under R version 3.5.3
#> Warning: package 'readr' was built under R version 3.5.3
#> Warning: package 'purrr' was built under R version 3.5.3
#> Warning: package 'dplyr' was built under R version 3.5.3
#> Warning: package 'stringr' was built under R version 3.5.3
#> Warning: package 'forcats' was built under R version 3.5.3
##Variables##
patientid <- c("-2147483646", "-2147483646", "-2147483646", "-2147483646", "-2147483646", "-2147483646", "-2147483646", "-2147483646",
               "-2147483646", "-2147483646", "-2147483646", "-2147483646", "-2147483646", "-2147483646", "-2147483646", "-2147483646")
date <- c("2018-08-06", "2018-08-07", "2018-08-07", "2018-08-07", "2018-08-15", "2018-08-15", "2018-08-15", "2018-08-20", "2018-08-20",
          "2018-08-20", "2018-08-27", "2018-08-27", "2018-08-27", "2018-09-03", "2018-09-03", "2018-09-03")
week <- week(date)
adherence <- ""
test.df <- data.frame(patientid, date, week, adherence) #test df with variables above
##Dataframe and attempt##
table(test.df$week) #See frequency of each
#>
#> 32 33 34 35 36
#> 4 3 3 3 3
test.df <- test.df %>% #Dataframe
  mutate(
    patientid = as.factor(patientid),
    date = as.Date(date),
    week = as.factor(week))
adherence <- test.df %>% #Attempt to create if/else/else if loop to populate adherence
  mutate(week =
           if(count(week) > 3){adherence == "4"})
#> Error in UseMethod("summarise_"): no applicable method for 'summarise_' applied to an object of class "factor"
I think this is a version of what you want.
test.df <- data.frame(patientid, date) # test df built from the vectors above
test.df %>%
  mutate(
    patientid = as.factor(patientid),
    date = as.Date(date),
    week = floor_date(date, "week") # first day of the calendar week containing each date
  ) %>%
  group_by(patientid, week) %>%
  # n() counts the readings per patient per week; case_when() caps the level at 4
  summarize(total_readings = n(),
            adherence = case_when(is.na(total_readings) ~ 0L,
                                  total_readings < 4 ~ total_readings,
                                  total_readings >= 4 ~ 4L,
                                  TRUE ~ NA_integer_))
The only thing this doesn't handle is the ordering of the weeks.
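For the week numbering, one possible approach is sketched below. It rests on assumptions that are not in the original post: week 1 is taken to be the 7-day block starting on each patient's first recorded date (rather than a calendar week), and weeks with no readings at all should appear with level 0. The names `readings`, `week_no` and `weekly` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same patient id and dates as in the reprex above.
readings <- data.frame(
  patientid = "-2147483646",
  date = as.Date(c("2018-08-06", "2018-08-07", "2018-08-07", "2018-08-07",
                   "2018-08-15", "2018-08-15", "2018-08-15",
                   "2018-08-20", "2018-08-20", "2018-08-20",
                   "2018-08-27", "2018-08-27", "2018-08-27",
                   "2018-09-03", "2018-09-03", "2018-09-03"))
)

weekly <- readings %>%
  group_by(patientid) %>%
  # Assumption: week 1 starts on the patient's first recorded date,
  # and every following 7-day block is the next week.
  mutate(week_no = as.integer(date - min(date)) %/% 7L + 1L) %>%
  # Count readings per patient per week (patientid is still a grouping column).
  count(week_no, name = "total_readings") %>%
  # Add a row with 0 readings for any week without data, within each patient.
  complete(week_no = seq_len(max(week_no)),
           fill = list(total_readings = 0L)) %>%
  # Levels 0 to 3 equal the count; anything above 3 is capped at 4.
  mutate(adherence = pmin(total_readings, 4L)) %>%
  ungroup()

print(weekly)
```

Because the count is capped with pmin(), no case_when() is needed here. If calendar weeks are preferred, the week_no line could be based on floor_date(date, "week") instead, at the cost of week 1 possibly being a partial week.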