Calculating within-group differences in R

Question  Votes: 4  Answers: 3

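I am trying to figure out how to append a column, based on the column named Reading, that flags whether the readings for a given ID on a given day differ by 10 or more.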

**Day   ID  Reading**
19-Jan  1   10
19-Jan  1   10
19-Jan  1   10
19-Jan  1   20
19-Jan  2   20
19-Jan  2   20
19-Jan  2   20
19-Jan  2   20
20-Jan  1   10
21-Jan  1   10
22-Jan  1   10
23-Jan  1   10
24-Jan  1   20
25-Jan  2   20
25-Jan  2   20
25-Jan  2   20
25-Jan  2   10

I would like:

**Day   ID  Reading Difference**
19-Jan  1   10  Y
19-Jan  1   10  Y
19-Jan  1   10  Y
19-Jan  1   20  Y
19-Jan  2   20  N
19-Jan  2   20  N
19-Jan  2   20  N
19-Jan  2   20  N
20-Jan  1   10  N
21-Jan  1   10  N
22-Jan  1   10  N
23-Jan  1   10  N
24-Jan  1   20  N
25-Jan  2   20  Y
25-Jan  2   20  Y
25-Jan  2   20  Y
25-Jan  2   10  Y
r
3 Answers
4
votes

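What you can do is check whether diff(range(x)), i.e. the maximum minus the minimum of each group, is greater than or equal to 10. Note that ave() returns the result in the type of Reading, so the flag shows up as 1/0 rather than TRUE/FALSE.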

dat$Diff <- with(dat, ave(Reading, Day, ID, FUN = function(x) diff(range(x)) >= 10))
dat
#      Day ID Reading Diff
#1  19-Jan  1      10    1
#2  19-Jan  1      10    1
#3  19-Jan  1      10    1
#4  19-Jan  1      20    1
#5  19-Jan  2      20    0
#6  19-Jan  2      20    0
#7  19-Jan  2      20    0
#8  19-Jan  2      20    0
#9  20-Jan  1      10    0
#10 21-Jan  1      10    0
#11 22-Jan  1      10    0
#12 23-Jan  1      10    0
#13 24-Jan  1      20    0
#14 25-Jan  2      20    1
#15 25-Jan  2      20    1
#16 25-Jan  2      20    1
#17 25-Jan  2      10    1

Data

dat <- structure(list(Day = c("19-Jan", "19-Jan", "19-Jan", "19-Jan", 
"19-Jan", "19-Jan", "19-Jan", "19-Jan", "20-Jan", "21-Jan", "22-Jan", 
"23-Jan", "24-Jan", "25-Jan", "25-Jan", "25-Jan", "25-Jan"), 
    ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 
    2L, 2L, 2L, 2L), Reading = c(10L, 10L, 10L, 20L, 20L, 20L, 
    20L, 20L, 10L, 10L, 10L, 10L, 20L, 20L, 20L, 20L, 10L)), .Names = c("Day", 
"ID", "Reading"), class = "data.frame", row.names = c(NA, -17L
))
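
The question asks for Y/N labels rather than 1/0. A minimal sketch of one way to get there, assuming the same values as dat (rebuilt here in compact form so the snippet runs on its own); the interaction(..., drop = TRUE) grouping is my own addition to skip Day/ID combinations that never occur, which would otherwise make range() warn on empty groups:

```r
# Compact rebuild of the example data (same values as dat above)
dat <- data.frame(
  Day = c(rep("19-Jan", 8), paste0(20:24, "-Jan"), rep("25-Jan", 4)),
  ID = c(rep(1L, 4), rep(2L, 4), rep(1L, 5), rep(2L, 4)),
  Reading = c(10L, 10L, 10L, 20L, rep(20L, 4), rep(10L, 4), 20L, rep(20L, 3), 10L),
  stringsAsFactors = FALSE
)

# Group by the Day/ID combinations that actually occur;
# ave() hands back the logical result as integer 0/1
grp <- interaction(dat$Day, dat$ID, drop = TRUE)
flag <- ave(dat$Reading, grp, FUN = function(x) diff(range(x)) >= 10)

# Map 1/0 to the Y/N labels requested in the question
dat$Difference <- ifelse(flag == 1, "Y", "N")
dat
```

If a logical column is enough, as.logical(flag) can be used in place of the ifelse() call.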

2
votes

We can use data.table

library(data.table)
setDT(df1)[, Difference := abs(Reduce(`-`, as.list(range(Reading)))) >= 10, 
          .(ID, Day)]
df1
#       Day ID Reading Difference
# 1: 19-Jan  1      10       TRUE
# 2: 19-Jan  1      10       TRUE
# 3: 19-Jan  1      10       TRUE
# 4: 19-Jan  1      20       TRUE
# 5: 19-Jan  2      20      FALSE
# 6: 19-Jan  2      20      FALSE
# 7: 19-Jan  2      20      FALSE
# 8: 19-Jan  2      20      FALSE
# 9: 20-Jan  1      10      FALSE
#10: 21-Jan  1      10      FALSE
#11: 22-Jan  1      10      FALSE
#12: 23-Jan  1      10      FALSE
#13: 24-Jan  1      20      FALSE
#14: 25-Jan  2      20       TRUE
#15: 25-Jan  2      20       TRUE
#16: 25-Jan  2      20       TRUE
#17: 25-Jan  2      10       TRUE

Data

df1 <- structure(list(Day = c("19-Jan", "19-Jan", "19-Jan", "19-Jan", 
  "19-Jan", "19-Jan", "19-Jan", "19-Jan", "20-Jan", "21-Jan", "22-Jan", 
 "23-Jan", "24-Jan", "25-Jan", "25-Jan", "25-Jan", "25-Jan"), 
 ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 
 2L, 2L, 2L, 2L), Reading = c(10L, 10L, 10L, 20L, 20L, 20L, 
 20L, 20L, 10L, 10L, 10L, 10L, 20L, 20L, 20L, 20L, 10L)),
  class = "data.frame", row.names = c(NA, -17L))
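
Since range() returns the minimum followed by the maximum, diff(range(x)) is already non-negative, so the abs(Reduce(...)) construction can likely be shortened. A minimal sketch under that assumption, with the data rebuilt compactly under the hypothetical name dt so it runs on its own:

```r
library(data.table)

# Compact rebuild of the example data (same values as df1 above)
dt <- data.table(
  Day = c(rep("19-Jan", 8), paste0(20:24, "-Jan"), rep("25-Jan", 4)),
  ID = c(rep(1L, 4), rep(2L, 4), rep(1L, 5), rep(2L, 4)),
  Reading = c(10L, 10L, 10L, 20L, rep(20L, 4), rep(10L, 4), 20L, rep(20L, 3), 10L)
)

# One logical per (Day, ID) group, recycled to every row of that group
dt[, Difference := diff(range(Reading)) >= 10, by = .(Day, ID)]
dt
```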

2
votes

With the tidyverse you can do something like this

library(tidyverse)

your_data %>%
  group_by(Day, ID) %>%
  mutate(difference = (max(Reading) - min(Reading)) >= 10)
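
For a version that runs end to end, here is a minimal sketch of the grouped approach that computes the spread from the Reading column and maps it to the Y/N labels from the question. The contents of your_data are an assumption that mirrors the question's table, and only dplyr is loaded rather than the full tidyverse:

```r
library(dplyr)

# Assumed stand-in for your_data, mirroring the table in the question
your_data <- data.frame(
  Day = c(rep("19-Jan", 8), paste0(20:24, "-Jan"), rep("25-Jan", 4)),
  ID = c(rep(1L, 4), rep(2L, 4), rep(1L, 5), rep(2L, 4)),
  Reading = c(10L, 10L, 10L, 20L, rep(20L, 4), rep(10L, 4), 20L, rep(20L, 3), 10L),
  stringsAsFactors = FALSE
)

result <- your_data %>%
  group_by(Day, ID) %>%
  # Spread of Reading within each Day/ID group, labelled Y or N
  mutate(Difference = if_else(max(Reading) - min(Reading) >= 10, "Y", "N")) %>%
  # Drop the grouping so later steps work on the whole table
  ungroup()

result
```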